3D Math for Games
Vectors, matrices, and quaternions are the easy part. The bugs live in the conventions: row- or column-major, left- or right-handed, where the projection sends Y and how deep the clip space runs. We build the math from scratch in C++ and Rust, then pin down the conventions Vulkan actually wants, so your first triangle comes out right-side up.
01 The conventions are the hard part
The vector and matrix algebra in this tutorial is the same algebra you saw in school. What makes 3D math for games its own skill is that a single rotation can be written several equally correct ways, and most of them will render upside down or inside out on your particular GPU. The math is invariant. The conventions are not, and the conventions are where the time goes[1][2].
Four independent choices trip people up, and they are genuinely independent, so you can mix them in any combination:
- Storage order: row-major or column-major. How the numbers sit in memory.
- Vector convention: column vectors (v′ = M·v, matrix on the left) or row vectors (v′ = v·M, matrix on the right).
- Handedness: left- or right-handed world space. Which way the cross product points.
- Clip space: where +Y ends up on screen and whether the depth range is [−1, 1] or [0, 1].
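We pick one of each and hold it for the entire tutorial: column-major storage, column vectors (v′ = M·v), right-handed world space, and Vulkan clip space. Column-major storage and column vectors are what the two libraries the code uses, glm for C++[3] and glam for Rust[4], give you by default; Vulkan clip space is not a default in glm, which targets OpenGL's [−1, 1] depth unless told otherwise, so §5 sets it up explicitly. Stated once; every sample below obeys it.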
By the end you will have working Vec3, Mat4, and Quat in both languages, the model→view→projection chain assembled correctly for Vulkan, a precise account of why your first triangle was upside down, and an honest read on quaternions versus Euler angles, including when NLERP beats SLERP and when it doesn't.
02 Vectors
A 3D vector is three numbers, but the two products you do with it carry most of the geometry in a game: the dot product measures alignment, the cross product builds a perpendicular.
New to the notation? Start here
A handful of symbols recur through this whole page. Named once, none of them are hard:
- a, b, v: a vector, a little arrow with a direction and a length, written as a list of numbers like (3, 0, −1).
- a_x, a_y, a_z: the components of a vector, its x, y, and z numbers. The small letter says which axis.
- |a|: the length (or magnitude) of a vector. The bars are the same idea as absolute value: how long the arrow is, always positive.
- θ: theta, a Greek letter. Maths labels angles with Greek letters out of habit; θ is just the angle between two arrows. Later sections also use Ω (omega) for an angle.
- · and ×: the two ways to multiply vectors, the dot product (gives a number) and the cross product (gives a new vector). They carry most of this section.
- â: a letter wearing a hat means a unit vector, same direction, length exactly 1.
The dot product collapses two vectors to a single number that encodes the angle between them:
a · b = a_x b_x + a_y b_y + a_z b_z = |a| |b| cos θ
Two routes to the same number. Left: multiply the matching components and add, pure arithmetic. Right: multiply the two lengths by the cosine of the angle between them. The right-hand form is why the dot product measures alignment: cos θ is 1 when the arrows point the same way, 0 when they are perpendicular, and negative once they point more than 90° apart.
Need a refresher on sine, cosine, and θ?
θ (theta) is just a name for an angle, the way x is a name for a number. Greek letters for angles is an old habit, nothing deeper.
Cosine (cos) and sine (sin) each take an angle and return a number between −1 and 1. The one fact that makes the dot product click:
- cos 0° = 1: arrows aligned, dot product at its largest.
- cos 90° = 0: perpendicular, dot product exactly zero.
- cos 180° = −1: opposite, dot product most negative.
Sine is the same wave shifted over: sin 0° = 0, sin 90° = 1. Sine tracks the perpendicular part, which is why it runs the cross product (area) while cosine runs the dot product (alignment). The widget at the end of this section spins the angle so you can watch both move.
The sign of the dot product tells you which half-space b is in relative to a: positive in front, negative behind, zero perpendicular. That single fact powers back-face culling, the N·L term in diffuse lighting, and "is the enemy in my field of view" checks.
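As a minimal sketch of the field-of-view check mentioned above, the Rust below compares the cosine of the angle to the target against the cosine of the cone's half-angle, so no acos is needed. The helper name in_field_of_view, its parameters, and the decision to treat a target sitting exactly on the eye as visible are assumptions made for illustration; forward is assumed to be unit length.

```rust
#[derive(Clone, Copy)]
struct Vec3 { x: f32, y: f32, z: f32 }

fn dot(a: Vec3, b: Vec3) -> f32 { a.x * b.x + a.y * b.y + a.z * b.z }

fn sub(a: Vec3, b: Vec3) -> Vec3 {
    Vec3 { x: a.x - b.x, y: a.y - b.y, z: a.z - b.z }
}

/// Hypothetical helper: true when `target` lies within `half_angle_rad`
/// of the unit-length `forward` direction, as seen from `eye`.
fn in_field_of_view(eye: Vec3, forward: Vec3, target: Vec3, half_angle_rad: f32) -> bool {
    let to_target = sub(target, eye);
    let dist = dot(to_target, to_target).sqrt();
    if dist < 1e-8 {
        return true; // assumption: a target on top of the eye counts as visible
    }
    // forward . to_target = |to_target| cos(theta), so dividing by dist leaves cos(theta).
    dot(forward, to_target) / dist >= half_angle_rad.cos()
}
```

Comparing cosines works because cosine only decreases as the angle grows from 0° to 180°, so "angle is at most the half-angle" and "cosine is at least the half-angle's cosine" are the same test.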
The cross product (3D only) returns a vector perpendicular to both inputs, with length equal to the area of the parallelogram they span:
a × b = (a_y b_z − a_z b_y, a_z b_x − a_x b_z, a_x b_y − a_y b_x)
|a × b| = |a| |b| sin θ
The result is a new vector pointing perpendicular to both a and b, straight out of the plane they share. Each output component criss-crosses the other two axes; that pattern is the whole formula. Its length equals the area of the parallelogram a and b span, and because that area carries sin θ, the cross product is largest when the arrows are perpendicular and shrinks to zero when they are parallel.
The component formula above is fixed. Whether the result points "up" or "down" relative to your screen depends on whether your basis is left- or right-handed. The right-hand rule only describes a right-handed basis; Unreal's coordinate system follows the left-hand rule and its docs say so explicitly[8]. Don't write "the cross product gives the right-hand-rule normal" without scoping it to the basis you're in.
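To make the scoping concrete, here is a small sketch that builds a face normal from two triangle edges. The helper name triangle_normal is an assumption for illustration. The arithmetic is the fixed component formula; which way the result "points" on screen is an interpretation that depends on the handedness of the basis the coordinates live in.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Vec3 { x: f32, y: f32, z: f32 }

fn sub(a: Vec3, b: Vec3) -> Vec3 {
    Vec3 { x: a.x - b.x, y: a.y - b.y, z: a.z - b.z }
}

fn cross(a: Vec3, b: Vec3) -> Vec3 {
    Vec3 {
        x: a.y * b.z - a.z * b.y,
        y: a.z * b.x - a.x * b.z,
        z: a.x * b.y - a.y * b.x,
    }
}

/// Hypothetical helper: unnormalized face normal of triangle (p0, p1, p2).
/// Its length is twice the triangle's area (the parallelogram of the two edges).
fn triangle_normal(p0: Vec3, p1: Vec3, p2: Vec3) -> Vec3 {
    cross(sub(p1, p0), sub(p2, p0))
}
```

Swapping p1 and p2 reverses the winding and negates the normal, which is the same sign flip that shows up when a mesh crosses a handedness boundary.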
Projection of a onto b drops a perpendicular from the tip of a onto the line through b:
proj_b(a) = ((a · b) / (b · b)) b
How much of a points along b? Drop a straight line from the tip of a onto b's direction; the shadow it casts is the projection. The fraction (a·b)/(b·b) works out to how many copies of b fit into that shadow, and multiplying by b aims the answer back along b. When b is already unit length the bottom is 1 and the whole thing collapses to (a·b) b.
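The projection can be sketched in a few lines. The helper names project and reject, and the choice to return the zero vector when b is (nearly) zero, are assumptions for illustration rather than part of any library.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Vec3 { x: f32, y: f32, z: f32 }

fn dot(a: Vec3, b: Vec3) -> f32 { a.x * b.x + a.y * b.y + a.z * b.z }

fn scale(v: Vec3, s: f32) -> Vec3 { Vec3 { x: v.x * s, y: v.y * s, z: v.z * s } }

fn sub(a: Vec3, b: Vec3) -> Vec3 {
    Vec3 { x: a.x - b.x, y: a.y - b.y, z: a.z - b.z }
}

/// Part of `a` that points along `b`: ((a . b) / (b . b)) b.
/// Assumption: a (near) zero `b` has no direction, so the result is zero.
fn project(a: Vec3, b: Vec3) -> Vec3 {
    let bb = dot(b, b);
    if bb < 1e-12 {
        return Vec3 { x: 0.0, y: 0.0, z: 0.0 };
    }
    scale(b, dot(a, b) / bb)
}

/// What is left of `a` once its part along `b` is removed; perpendicular to `b`.
fn reject(a: Vec3, b: Vec3) -> Vec3 { sub(a, project(a, b)) }
```

The rejection is the usual companion: sliding along a wall, for instance, keeps the rejection of the velocity from the wall normal and discards the projection.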
Normalization scales a vector to unit length while keeping its direction:
â = a / |a|
Divide a vector by its own length and the direction survives but the length becomes exactly 1, a unit vector, written with a hat: â. Directions you compare or feed into lighting should almost always be normalized first. The trap: a zero-length vector has |a| = 0, so this divides by zero and produces a NaN that then spreads through every transform it touches. Guard it.
And when you actually need the angle between two vectors, acos(a·b) (for unit vectors) is numerically poor near 0 and π; atan2(|a×b|, a·b) is the robust form.
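A sketch of the atan2 form follows; the function name angle_between is an assumption. A side benefit is that the inputs do not need to be normalized, because the shared |a||b| factor appears in both arguments and cancels.

```rust
#[derive(Clone, Copy)]
struct Vec3 { x: f32, y: f32, z: f32 }

fn dot(a: Vec3, b: Vec3) -> f32 { a.x * b.x + a.y * b.y + a.z * b.z }

fn cross(a: Vec3, b: Vec3) -> Vec3 {
    Vec3 {
        x: a.y * b.z - a.z * b.y,
        y: a.z * b.x - a.x * b.z,
        z: a.x * b.y - a.y * b.x,
    }
}

/// Angle in radians, in [0, pi], between two non-zero vectors.
fn angle_between(a: Vec3, b: Vec3) -> f32 {
    let c = cross(a, b);
    let sin_part = dot(c, c).sqrt(); // |a x b| = |a||b| sin(theta)
    let cos_part = dot(a, b);        // a . b   = |a||b| cos(theta)
    sin_part.atan2(cos_part)         // atan2(y, x) with y = sin part, x = cos part
}
```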
[Interactive widget: two draggable arrows a and b. The dot product flips sign as b swings past 90°, and the parallelogram (the cross product's magnitude) swells to its largest when the arrows are perpendicular.]
// C++
#include <cmath>

struct Vec3 { float x, y, z; };

inline float dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

inline Vec3 cross(Vec3 a, Vec3 b) {
    return { a.y*b.z - a.z*b.y,   // x
             a.z*b.x - a.x*b.z,   // y
             a.x*b.y - a.y*b.x }; // z
}

inline float length(Vec3 v) { return std::sqrt(dot(v, v)); }

// Returns v unchanged when it is too short to normalize safely.
inline Vec3 normalize(Vec3 v) {
    float len = length(v);
    return len > 1e-8f ? Vec3{ v.x/len, v.y/len, v.z/len } : v; // guard the zero vector
}
// In practice: glm::dot, glm::cross, glm::length, glm::normalize.
// Rust
#[derive(Clone, Copy)]
struct Vec3 { x: f32, y: f32, z: f32 }

fn dot(a: Vec3, b: Vec3) -> f32 { a.x*b.x + a.y*b.y + a.z*b.z }

fn cross(a: Vec3, b: Vec3) -> Vec3 {
    Vec3 { x: a.y*b.z - a.z*b.y,   // x
           y: a.z*b.x - a.x*b.z,   // y
           z: a.x*b.y - a.y*b.x }  // z
}

fn length(v: Vec3) -> f32 { dot(v, v).sqrt() }

fn normalize(v: Vec3) -> Vec3 {
    let len = length(v);
    if len > 1e-8 { Vec3 { x: v.x/len, y: v.y/len, z: v.z/len } } else { v } // guard zero
}
// In practice: glam's Vec3::dot, ::cross, ::length, ::normalize_or_zero.
03 Matrices
A 4×4 matrix packs a 3×3 linear part (rotation, scale, shear) and a translation into one object that the GPU can apply to every vertex with a single multiply. Composing transforms is matrix multiplication, and the order matters: T·R ≠ R·T.
The thing that actually causes bugs is that two independent conventions get conflated. Storage order (row- vs column-major) is how the floats sit in memory. Vector convention (column vs row vectors) is which side the matrix goes on. They are orthogonal: any of the four combinations is valid[2].
"Row-major means I pre-multiply" is false. Storage order does not dictate multiply order. glm and glam both store column-major and use column vectors (v′ = M·v)[3][4]. That pairing is not a contradiction; the two choices are simply independent.
Three facts follow, and each maps to a real bug:
- With column vectors, a model→world→view chain is written right-to-left, in the order the GPU applies it: M_view · M_world · M_model · v. Read it as "model first."
- Translation lives in the 4th column under column vectors, the 4th row under row vectors. Put it in the wrong place and objects fly off to infinity.
- GPUs care about memory layout, not your math convention. GLSL std140 matrices are column-major; upload a row-major matrix without transposing and you silently transpose your transform.
One more scoping note: don't say "every engine stores column-major." The OpenGL and glTF lineage does, and so do glm and glam; DirectX's XMMATRIX is row-major with row vectors. Both are correct; they are different lineages[2].
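The storage facts above can be checked with a tiny hand-rolled matrix. This is a sketch, not glam's Mat4: the type and its method names are assumptions for illustration, with column-major storage and column vectors as in the rest of this tutorial.

```rust
/// 4x4 matrix in column-major storage: element (row, col) lives at m[col * 4 + row].
#[derive(Clone, Copy, Debug, PartialEq)]
struct Mat4 { m: [f32; 16] }

impl Mat4 {
    fn identity() -> Mat4 {
        let mut m = [0.0; 16];
        for i in 0..4 { m[i * 4 + i] = 1.0; }
        Mat4 { m }
    }

    /// Column-vector convention: translation sits in the 4th column,
    /// which in column-major storage is indices 12, 13, 14.
    fn translation(x: f32, y: f32, z: f32) -> Mat4 {
        let mut t = Mat4::identity();
        t.m[12] = x;
        t.m[13] = y;
        t.m[14] = z;
        t
    }

    /// Same numbers read with rows and columns swapped, which is what
    /// happens when one side assumes the other storage order.
    fn transposed(&self) -> Mat4 {
        let mut out = [0.0; 16];
        for row in 0..4 {
            for col in 0..4 {
                out[col * 4 + row] = self.m[row * 4 + col];
            }
        }
        Mat4 { m: out }
    }

    /// v' = M * v with v as a column vector.
    fn mul_vec4(&self, v: [f32; 4]) -> [f32; 4] {
        let mut out = [0.0; 4];
        for row in 0..4 {
            for col in 0..4 {
                out[row] += self.m[col * 4 + row] * v[col];
            }
        }
        out
    }
}
```

Applying translation(1, 2, 3) to the origin (0, 0, 0, 1) gives (1, 2, 3, 1). Applying its transpose leaves the origin where it was and turns the point (1, 1, 1, 1) into (1, 1, 1, 7): the translation has leaked into w, so the perspective divide then shrinks the point instead of moving it.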
[Interactive widget: composes a transform from translation, rotation, and a (possibly non-uniform) scale, and prints the resulting column-major matrix. Flipping the application order or pushing the scale non-uniform makes the object shear.]
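// C++ (glm): column-major storage, column vectors. Read right-to-left:
// scale first, then rotate, then translate.
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp> // glm::translate, glm::scale
#include <glm/gtc/quaternion.hpp>       // glm::quat, glm::mat4_cast

// position and scale are glm::vec3, rotation is a glm::quat, all supplied by the caller.
glm::mat4 model = glm::mat4(1.0f);         // identity
model = glm::translate(model, position);   // model = T
model = model * glm::mat4_cast(rotation);  // model = T * R (quaternion → matrix)
model = glm::scale(model, scale);          // model = T * R * S
// Equivalent: model = T * R * S; applied to a point as model * glm::vec4(p, 1.0f).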
// Rust (glam): column-major storage, column vectors. Same order.
use glam::{Mat4, Quat, Vec3};

// scale and position are Vec3, rotation is a Quat, all supplied by the caller.
let model = Mat4::from_scale_rotation_translation(
    scale,    // S
    rotation, // R (a Quat)
    position, // T
);
// Builds T * R * S directly; applied to a point as model * p.extend(1.0).
Two misreadings to rule out: column-major storage does not force row vectors (the conventions are independent), and the transpose-on-upload symptom is a memory-layout mismatch, not a quaternion or near/far issue.
04 Coordinate spaces & handedness
A vertex makes a journey: object space → world space → view (camera) space → clip space → NDC → screen. Each arrow is one matrix, except the last two, which are the perspective divide and the viewport transform the GPU does for you.
Handedness is fixed by the basis orientation. With +X right and +Y up: if +Z points toward you, the system is right-handed; if +Z points into the screen, it's left-handed. Formally, handedness is the sign of the triple product (X × Y) · Z.
You also have to say which axis is up and which is forward, and the units. Three shipping conventions, all different:
- glTF: right-handed, +Y up, +Z forward, meters, radians[7].
- Unity: left-handed, +Y up, +Z forward, conventionally meters[9].
- Unreal: left-handed, +Z up, +X forward, world unit = 1 centimeter[8].
Import a model across a handedness boundary and one axis flips, which inverts triangle winding and therefore which faces get culled. Unreal's docs call out the Y-axis inversion on import from right-handed tools[8].
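The triple-product test is short enough to sketch. The function name handedness_sign is an assumption. The three inputs are a basis written in the coordinates of some reference frame: a positive result means the basis has the same handedness as that frame, a negative result means converting between them flips handedness, and with it the triangle winding.

```rust
#[derive(Clone, Copy)]
struct Vec3 { x: f32, y: f32, z: f32 }

fn dot(a: Vec3, b: Vec3) -> f32 { a.x * b.x + a.y * b.y + a.z * b.z }

fn cross(a: Vec3, b: Vec3) -> Vec3 {
    Vec3 {
        x: a.y * b.z - a.z * b.y,
        y: a.z * b.x - a.x * b.z,
        z: a.x * b.y - a.y * b.x,
    }
}

/// (X x Y) . Z for a basis expressed in a reference frame.
/// > 0: same handedness as the reference; < 0: flipped.
fn handedness_sign(x: Vec3, y: Vec3, z: Vec3) -> f32 {
    dot(cross(x, y), z)
}
```

Negating one axis, or swapping two, flips the sign; doing both restores it. That is why an importer that only renames axes can get away without touching winding, while one that mirrors an axis cannot.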
05 Model → view → projection
The three matrices that get a model onto the screen. Model takes object space to world space. View takes world to eye space, and is the inverse of the camera's world transform. Projection takes eye space to clip space, a homogeneous space where the divide hasn't happened yet.
After the vertex shader writes clip-space (x, y, z, w), the GPU does the perspective divide (xyz / w) to get NDC, then the viewport transform to get framebuffer pixels. The projection matrix is where the API's clip-space conventions live, and this is where OpenGL and Vulkan differ in ways that produce upside-down or depth-broken renders.
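The two fixed-function steps can be written out as plain arithmetic. This is a sketch under Vulkan's conventions (NDC x and y in [−1, 1], depth in [0, 1]); the Viewport struct mirrors the fields of VkViewport with snake_case names chosen here, and the function names are assumptions.

```rust
/// Mirrors the fields of VkViewport (snake_case names are this sketch's choice).
#[derive(Clone, Copy)]
struct Viewport {
    x: f32,
    y: f32,
    width: f32,
    height: f32,
    min_depth: f32,
    max_depth: f32,
}

/// Perspective divide: clip space (x, y, z, w) to NDC (x/w, y/w, z/w).
fn clip_to_ndc(clip: [f32; 4]) -> [f32; 3] {
    [clip[0] / clip[3], clip[1] / clip[3], clip[2] / clip[3]]
}

/// Viewport transform: NDC to framebuffer coordinates plus a depth value.
fn ndc_to_framebuffer(ndc: [f32; 3], vp: Viewport) -> [f32; 3] {
    [
        vp.x + (ndc[0] * 0.5 + 0.5) * vp.width,
        vp.y + (ndc[1] * 0.5 + 0.5) * vp.height,
        vp.min_depth + ndc[2] * (vp.max_depth - vp.min_depth),
    ]
}
```

With an 800×600 viewport at the origin, NDC (−1, −1) lands on pixel (0, 0), the top-left corner. Setting y to 600 and height to −600 sends NDC y = +1 to framebuffer row 0 instead, which is the Y-up behavior OpenGL code expects.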
Vulkan's clip space differs from OpenGL's in two ways, a [0, 1] depth range and a Y-down NDC, and each has a consequence for the projection matrix:
- The [0, 1] depth range changes the matrix's third row, which is why libraries ship distinct variants: glm gates it with GLM_FORCE_DEPTH_ZERO_TO_ONE (and exposes perspectiveRH_ZO vs perspectiveRH_NO)[3]; glam splits Mat4::perspective_rh (zero-to-one, for Vulkan, WebGPU, Direct3D) from perspective_rh_gl (negative-one-to-one, for OpenGL)[4].
- The Y-down NDC is handled one of a few ways: negate gl_Position.y in the shader, bake a Y-flip into the projection, or set a negative VkViewport.height, which negates clip-space Y in the viewport transform. Negative viewport height is core since Vulkan 1.1[5].
Vulkan's NDC is right-handed. The visible flip people hit is the Y-down framebuffer mapping, which is separate from the depth range, which is separate again from your world-space handedness[5]. Conflating the three is the dominant OpenGL→Vulkan porting bug. And reversed-Z is yet another, orthogonal, precision technique; it is not the same as the [0, 1] range.
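To see the depth range and the Y flip as separate entries in one matrix, here is a hand-rolled right-handed, zero-to-one perspective with the flip baked in. It is a sketch for inspection, not a replacement for the library functions named above; the function names are assumptions, and the matrix is stored column-major as m[col][row].

```rust
/// Right-handed perspective, depth mapped to [0, 1], Y negated for a Y-down NDC.
/// Column-major: m[col][row]. The camera looks down -Z in eye space.
fn perspective_rh_zo_y_down(fov_y_rad: f32, aspect: f32, near: f32, far: f32) -> [[f32; 4]; 4] {
    let f = 1.0 / (fov_y_rad * 0.5).tan();
    let mut m = [[0.0f32; 4]; 4];
    m[0][0] = f / aspect;
    m[1][1] = -f;                           // the Y flip lives here and nowhere else
    m[2][2] = far / (near - far);           // third row: depth range [0, 1]
    m[3][2] = (near * far) / (near - far);  // third row, translation part
    m[2][3] = -1.0;                         // clip w = -z_eye, so the divide is by distance
    m
}

/// v' = M * v with v as a column vector.
fn transform(m: &[[f32; 4]; 4], v: [f32; 4]) -> [f32; 4] {
    let mut out = [0.0; 4];
    for col in 0..4 {
        for row in 0..4 {
            out[row] += m[col][row] * v[col];
        }
    }
    out
}
```

A point on the near plane comes out with z/w = 0, a point on the far plane with z/w = 1, and a point above the view axis gets a negative NDC y, which is toward the top of a Vulkan framebuffer. Only the third row would change for OpenGL's [−1, 1] depth, and only the m[1][1] sign for a Y-up NDC.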
[Interactive widget: renders the same spinning scene under selectable conventions. Switching the clip-space target without compensating shows exactly which thing breaks.]
// C++ (glm). Define this BEFORE including glm so depth maps to [0, 1] for Vulkan:
#define GLM_FORCE_DEPTH_ZERO_TO_ONE
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp> // glm::perspective

// aspect is framebuffer width / height, supplied by the caller.
glm::mat4 proj = glm::perspective(glm::radians(60.0f), aspect, 0.1f, 1000.0f);
proj[1][1] *= -1.0f; // flip Y: glm assumes GL's Y-up NDC, Vulkan's is Y-down
// Alternative to the flip: set a negative VkViewport.height.
// Rust (glam). perspective_rh already targets [0, 1] depth (Vulkan/wgpu/D3D).
use glam::Mat4;

// aspect is framebuffer width / height, supplied by the caller.
let mut proj = Mat4::perspective_rh(60.0_f32.to_radians(), aspect, 0.1, 1000.0);
proj.y_axis.y *= -1.0; // flip Y for Vulkan's Y-down NDC (or use a negative viewport height)
// Use perspective_rh_gl instead if you ever target OpenGL's -1..1 depth.
06 Quaternions
A quaternion is four numbers that encode an orientation. A unit quaternion q = (w, x, y, z) represents a rotation of angle θ about a unit axis n as:
q = (cos(θ/2), n_x sin(θ/2), n_y sin(θ/2), n_z sin(θ/2))
A quaternion's four numbers are not arbitrary: pack the cosine of half the angle into the first slot, and the axis scaled by the sine of half the angle into the other three. The orientation is stored directly, with no sequence of separate spins to go wrong, which is exactly what dodges gimbal lock (next subsection).
The half-angle is intrinsic, not a quirk: it falls out of the sandwich product used to rotate a vector, v′ = q v q⁻¹ (with v written as a pure quaternion, and q⁻¹ = the conjugate for unit q).
Two properties matter constantly. First, double cover: q and −q are the same rotation. That's harmless when you apply a quaternion and decisive when you interpolate two (next section). Second, the budget: four floats versus nine for a 3×3 matrix, and composition is one quaternion product versus a matrix multiply.
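The half-angle construction, the sandwich product, and the double cover fit in one short sketch. The Quat type and the function names are assumptions for illustration (glam's Quat offers the same operations); the product is the standard Hamilton product, and the axis passed to from_axis_angle is assumed to be unit length.

```rust
#[derive(Clone, Copy, Debug)]
struct Quat { w: f32, x: f32, y: f32, z: f32 }

/// q = (cos(theta/2), n sin(theta/2)) for a unit axis n.
fn from_axis_angle(axis: [f32; 3], angle: f32) -> Quat {
    let (s, c) = (angle * 0.5).sin_cos();
    Quat { w: c, x: axis[0] * s, y: axis[1] * s, z: axis[2] * s }
}

/// Hamilton product a * b: the rotation b followed by the rotation a.
fn mul(a: Quat, b: Quat) -> Quat {
    Quat {
        w: a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
        x: a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
        y: a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
        z: a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w,
    }
}

/// For a unit quaternion the conjugate is the inverse.
fn conjugate(q: Quat) -> Quat { Quat { w: q.w, x: -q.x, y: -q.y, z: -q.z } }

/// v' = q v q^-1, with v embedded as the pure quaternion (0, v).
fn rotate(q: Quat, v: [f32; 3]) -> [f32; 3] {
    let p = Quat { w: 0.0, x: v[0], y: v[1], z: v[2] };
    let r = mul(mul(q, p), conjugate(q));
    [r.x, r.y, r.z]
}
```

Rotating (1, 0, 0) by 90° about +Z gives (0, 1, 0) up to rounding, and negating all four components of q gives the same answer, because the two sign flips cancel inside the sandwich. Production code usually expands the sandwich into a cheaper closed form; the two-product version here is chosen for readability.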
Versus Euler angles, and gimbal lock
Euler angles store an orientation as three sequential rotations (yaw, pitch, roll). They read nicely in an inspector and fail in a specific way:
Gimbal lock is the loss of one degree of freedom that happens when the middle rotation of an Euler sequence reaches ±90°, which lines up the first and third rotation axes so they spin about the same direction[10]. It is a property of the three-angle representation, not of 3D rotation. Quaternions and rotation matrices don't gimbal-lock because they don't decompose orientation into three sequential axis rotations.
Say it precisely. Gimbal lock is not "caused by quaternions being missing" or "by 90° rotations in general"; it is the middle-axis-at-±90° degeneracy of a three-angle sequence. And resist the loose claim that "quaternions have no singularities": they have the ±q double cover. What they don't have is gimbal lock.
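The degeneracy can be shown numerically. The sketch below assumes one particular Euler sequence, yaw about Y, then pitch about X, then roll about Z (composed as q_yaw · q_pitch · q_roll); that order, the [w, x, y, z] array layout, and the function names are all choices made for this example. With the pitch at 90°, only the difference yaw − roll survives, so two different angle triples give the same orientation.

```rust
type Quat = [f32; 4]; // (w, x, y, z)

fn axis_angle(axis: [f32; 3], angle: f32) -> Quat {
    let (s, c) = (angle * 0.5).sin_cos();
    [c, axis[0] * s, axis[1] * s, axis[2] * s]
}

/// Hamilton product a * b.
fn mul(a: Quat, b: Quat) -> Quat {
    [
        a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3],
        a[0] * b[1] + a[1] * b[0] + a[2] * b[3] - a[3] * b[2],
        a[0] * b[2] - a[1] * b[3] + a[2] * b[0] + a[3] * b[1],
        a[0] * b[3] + a[1] * b[2] - a[2] * b[1] + a[3] * b[0],
    ]
}

/// Assumed sequence for this sketch: yaw (Y), then pitch (X), then roll (Z).
fn from_euler_yxz(yaw: f32, pitch: f32, roll: f32) -> Quat {
    let qy = axis_angle([0.0, 1.0, 0.0], yaw);
    let qx = axis_angle([1.0, 0.0, 0.0], pitch);
    let qz = axis_angle([0.0, 0.0, 1.0], roll);
    mul(mul(qy, qx), qz)
}

/// q and -q are the same rotation, so compare by |q1 . q2| being close to 1.
fn same_rotation(a: Quat, b: Quat) -> bool {
    let d = a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
    d.abs() > 1.0 - 1e-5
}
```

With pitch at 90°, (yaw 30°, roll 0°) and (yaw 50°, roll 20°) pass same_rotation; with pitch at 0° the same two yaw and roll pairs do not. Three numbers are being stored, but at the locked pitch only two of them still change the result.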
[Interactive widget: three nested gimbals driven by Euler angles. Driving the middle ring to 90° snaps the outer and inner rings parallel, so yaw and roll then produce the same motion; the quaternion track reaches the same target orientation with no lock.]
Versus rotation matrices, and drift
Both represent the same rotations, and the GPU ultimately wants a matrix for the vertex transform, so quaternions are usually converted to a matrix at the end of the chain. Pick by operation: quaternions are cheaper to store, compose, interpolate, and renormalize; matrices are cheaper to apply to many vectors, which is why skinning shaders consume matrices.
Repeated quaternion multiplication accumulates floating-point error, so a quaternion that should stay unit-length slowly drifts off the unit sphere; left uncorrected, it introduces scale/shear distortion when converted to a matrix. The fix is to renormalize periodically, every few multiplications, not necessarily after every single one. Rotation matrices drift the same way (they lose orthonormality) and need re-orthonormalization rather than mere column-normalization.
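Both repairs are short. The sketch below assumes a quaternion stored as [w, x, y, z] and a rotation matrix given as three column vectors; the function names and the identity fallback for a collapsed quaternion are assumptions. The matrix version is Gram-Schmidt: keep the first column's direction, strip its component out of the second, and rebuild the third with a cross product, which assumes the input was a proper (non-mirrored) rotation to begin with.

```rust
type Vec3 = [f32; 3];

fn dot(a: Vec3, b: Vec3) -> f32 { a[0] * b[0] + a[1] * b[1] + a[2] * b[2] }

fn cross(a: Vec3, b: Vec3) -> Vec3 {
    [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]
}

/// Assumes v is not (near) zero; a drifting rotation column never is.
fn normalize(v: Vec3) -> Vec3 {
    let len = dot(v, v).sqrt();
    [v[0] / len, v[1] / len, v[2] / len]
}

/// Pull a drifting quaternion (w, x, y, z) back onto the unit sphere.
fn renormalize(q: [f32; 4]) -> [f32; 4] {
    let len = (q[0] * q[0] + q[1] * q[1] + q[2] * q[2] + q[3] * q[3]).sqrt();
    if len < 1e-8 {
        return [1.0, 0.0, 0.0, 0.0]; // assumption: fall back to the identity rotation
    }
    [q[0] / len, q[1] / len, q[2] / len, q[3] / len]
}

/// Gram-Schmidt on the columns of a rotation matrix.
fn orthonormalize(cols: [Vec3; 3]) -> [Vec3; 3] {
    let x = normalize(cols[0]);
    let along_x = dot(cols[1], x);
    let y = normalize([
        cols[1][0] - along_x * x[0],
        cols[1][1] - along_x * x[1],
        cols[1][2] - along_x * x[2],
    ]);
    let z = cross(x, y); // rebuilt rather than repaired, so it is perpendicular by construction
    [x, y, z]
}
```

Normalizing each column separately would fix the lengths but leave the columns slightly non-perpendicular, which is the shear the text warns about; removing the overlap before normalizing is what makes this re-orthonormalization.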
07 Interpolating rotations
Blending two orientations is the daily job of an animation system. There are three contenders, and the "obvious correct" one is not always the right pick.
SLERP (spherical linear interpolation, Shoemake 1985[11]) walks the great-circle arc between two unit quaternions at constant angular velocity:
slerp(q₀, q₁; t) = (sin((1 − t)Ω) / sin Ω) q₀ + (sin(tΩ) / sin Ω) q₁
Ω (omega) is the angle between the two quaternions on the unit sphere, which is half the rotation angle between the two orientations, read straight off their dot product (cos Ω = q₀ · q₁). The two sine weights slide from all-q₀ at t = 0 to all-q₁ at t = 1, and because they ride the curve of the sphere rather than a straight chord, the blend turns at a steady rate the whole way across.
LERP interpolates the components in a straight line: a chord through the sphere, not along it. The result is not unit length and its rotation speed is non-uniform. NLERP is LERP followed by a renormalize; it rides the same arc as SLERP but not at constant angular velocity, running faster in the middle and slower at the ends[12].
For small angular steps (adjacent animation keyframes, per-frame blends), NLERP is indistinguishable from SLERP, cheaper, and commutative, which is why many animation runtimes use it[12][13]. SLERP's constant-velocity property only becomes visible across large single interpolations, like a slow camera orbit over a wide arc. The framing (Blow, after Shoemake) is a three-way tradeoff: you want commutativity, constant velocity, and minimal torque, but you can't have all three. SLERP gives the last two; NLERP gives commutativity and minimal torque[13].
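A compact sketch of both interpolators follows, with quaternions stored as [w, x, y, z]. The function names, the 0.9995 threshold for the near-parallel fallback, and the decision to fold the shortest-path sign flip into both functions are assumptions for illustration; library versions make their own choices.

```rust
type Quat = [f32; 4]; // (w, x, y, z), assumed unit length

fn dot(a: Quat, b: Quat) -> f32 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]
}

fn negate(q: Quat) -> Quat { [-q[0], -q[1], -q[2], -q[3]] }

/// LERP then renormalize. Same arc as SLERP, non-constant speed.
fn nlerp(a: Quat, b: Quat, t: f32) -> Quat {
    // Shortest path: q and -q are the same rotation, so pick the nearer one.
    let b = if dot(a, b) < 0.0 { negate(b) } else { b };
    let mut q = [0.0; 4];
    for i in 0..4 {
        q[i] = (1.0 - t) * a[i] + t * b[i];
    }
    let len = dot(q, q).sqrt();
    [q[0] / len, q[1] / len, q[2] / len, q[3] / len]
}

/// Great-circle interpolation at constant angular velocity.
fn slerp(a: Quat, b: Quat, t: f32) -> Quat {
    let mut b = b;
    let mut cos_omega = dot(a, b);
    if cos_omega < 0.0 {
        b = negate(b);
        cos_omega = -cos_omega;
    }
    if cos_omega > 0.9995 {
        return nlerp(a, b, t); // sin(omega) is near 0: the division below would blow up
    }
    let omega = cos_omega.acos();
    let sin_omega = omega.sin();
    let wa = ((1.0 - t) * omega).sin() / sin_omega;
    let wb = (t * omega).sin() / sin_omega;
    [
        wa * a[0] + wb * b[0],
        wa * a[1] + wb * b[1],
        wa * a[2] + wb * b[2],
        wa * a[3] + wb * b[3],
    ]
}
```

Blending the identity with a 90° turn about Z, both functions agree at t = 0.5 (a 45° turn). At t = 0.25 SLERP gives exactly 22.5° while NLERP gives slightly less, which is the "slower at the ends" behavior described above.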
[Interactive widget: animates t from 0 to 1 over real seconds and draws all three paths plus a live angular-velocity strip. Large angles make the methods diverge, and toggling off the shortest-path fix shows the classic bug.]
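The classic bug is going the long way round: because q and −q are the same rotation, two quaternions with a negative dot product interpolate along the longer arc unless one of them is negated first. Normalizing the inputs doesn't address which arc you travel (handling the double cover does); dropping to LERP trades the long-way bug for a non-unit, variable-speed result. And NLERP is the one without constant angular velocity, so that can't be its advantage.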
08 Building & decomposing transforms
Most engines store a transform as translation, rotation (a quaternion), and scale, and build the matrix on demand. glTF defines a node's transform as exactly T·R·S with the rotation as a quaternion[7].
Composing is the easy direction:
M = T · R · S
With column vectors you read the chain right to left, the order it actually runs: scale the object, then rotate it, then translate it into the world. Swap two of these and the result changes, because rotation and non-uniform scale do not commute (the §3 widget shows the shear).
Decomposing back out is where the edges are: translation is the last column, scale is the length of each of the three basis columns, and the rotation is what's left after dividing the columns by their scales.
Under shear or negative (mirrored) scale, you cannot cleanly split a matrix into rotation plus positive scale. glam's to_scale_rotation_translation documents that the input "is expected to be non-degenerate and without shearing, or the output will be invalid"[4]. Two more from the same family: non-uniform scale doesn't commute with rotation, so T·R·S ≠ T·S·R; and lighting needs the inverse-transpose of the upper 3×3 to transform normals under non-uniform scale, not the model matrix itself.
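The decomposition recipe, with the failure cases made explicit, can be sketched as follows. The Trs struct, the decompose function, and its tolerances are assumptions for illustration; unlike a library routine that documents invalid output, this version returns None when it detects a degenerate, sheared, or mirrored matrix. The matrix is column-major, m[col][row], and the rotation is returned as three unit columns rather than converted to a quaternion.

```rust
type Vec3 = [f32; 3];

fn dot(a: Vec3, b: Vec3) -> f32 { a[0] * b[0] + a[1] * b[1] + a[2] * b[2] }

fn cross(a: Vec3, b: Vec3) -> Vec3 {
    [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]
}

struct Trs {
    translation: Vec3,
    rotation: [Vec3; 3], // unit-length basis columns
    scale: Vec3,
}

/// Split M = T * R * S back into its parts, or None if no clean split exists.
fn decompose(m: &[[f32; 4]; 4]) -> Option<Trs> {
    let translation = [m[3][0], m[3][1], m[3][2]]; // the last column
    let mut scale = [0.0; 3];
    let mut rotation = [[0.0; 3]; 3];
    for c in 0..3 {
        let col = [m[c][0], m[c][1], m[c][2]];
        let len = dot(col, col).sqrt(); // scale = length of each basis column
        if len < 1e-8 {
            return None; // degenerate: an axis collapsed to zero
        }
        scale[c] = len;
        rotation[c] = [col[0] / len, col[1] / len, col[2] / len];
    }
    // Shear: the remaining columns are not mutually perpendicular.
    if dot(rotation[0], rotation[1]).abs() > 1e-4
        || dot(rotation[0], rotation[2]).abs() > 1e-4
        || dot(rotation[1], rotation[2]).abs() > 1e-4
    {
        return None;
    }
    // Mirror: the basis flipped handedness, so it is not a pure rotation.
    if dot(cross(rotation[0], rotation[1]), rotation[2]) < 0.0 {
        return None;
    }
    Some(Trs { translation, rotation, scale })
}
```

For a matrix built from translation (1, 2, 3), a 90° turn about Z, and scale (2, 3, 4), the function recovers those three parts; negate one basis column or skew one against another and it declines rather than returning a plausible but wrong rotation.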
09 Random numbers
Procedural placement, loot rolls, particle jitter, AI variation: games lean on random numbers, and the C standard library's rand() is the wrong tool for all of them.
rand() is implementation-defined: there's no required algorithm and no quality guarantee. Common implementations are LCGs whose low-order bits are weak, and RAND_MAX can be as small as 32767[15]. On top of that, rand() % n introduces modulo bias unless n divides RAND_MAX + 1[15].
Reach for a modern small-state generator. PCG applies a permutation to an LCG's output: small fast state, strong statistical quality, multiple independent streams[14]. The xoshiro/xoroshiro family is comparably fast with tiny state. Both are plenty for gameplay.
- For a bounded range, use rejection sampling or Lemire's multiply-shift, not %, or you reintroduce bias.
- These are not cryptographically secure. Fine for loot and particles; never for anti-cheat seeds or security.
- Reproducible replays need a fixed seed and a fixed algorithm. That's another reason to avoid rand(), whose algorithm varies across platforms and compilers.
// C++. PCG32 (O'Neill). State is 64 bits; output is a permuted 32-bit word.
#include <cstdint>

struct Pcg32 {
    uint64_t state = 0x853c49e6748fea9bULL;
    uint64_t inc   = 0xda3e39cb94b95bdbULL; // stream selector, must be odd
    uint32_t next() {
        uint64_t old = state;
        state = old * 6364136223846793005ULL + inc;                  // LCG step
        uint32_t xorshifted = (uint32_t)(((old >> 18) ^ old) >> 27);
        uint32_t rot = (uint32_t)(old >> 59);
        return (xorshifted >> rot) | (xorshifted << ((-rot) & 31)); // permute
    }
};
// Rust. PCG32 (O'Neill). Same constants, same permutation.
struct Pcg32 { state: u64, inc: u64 } // inc (stream) must be odd

impl Pcg32 {
    fn next_u32(&mut self) -> u32 {
        let old = self.state;
        self.state = old.wrapping_mul(6364136223846793005).wrapping_add(self.inc);
        let xorshifted = (((old >> 18) ^ old) >> 27) as u32;
        let rot = (old >> 59) as u32;
        xorshifted.rotate_right(rot) // the permutation, as one intrinsic
    }
}
// In practice: the `rand_pcg` / `rand` crates ship this and the bounded sampling.
These generators show the core step only. A shipping RNG adds: unbiased bounded sampling (rejection or Lemire), a documented seeding routine, float generation in [0, 1), and a seek/stream API for reproducible replays. Use the library versions (rand_pcg in Rust, the reference PCG headers in C++) rather than hand-rolling those parts.
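To show what two of those missing parts look like, here is a sketch of unbiased bounded sampling by rejection and of float generation in [0, 1), layered on the same PCG32 step. The method names below and next_f32, and especially the constructor, are assumptions for illustration: new() only forces the stream increment to be odd and is not a documented seeding routine, so treat it as a placeholder.

```rust
struct Pcg32 { state: u64, inc: u64 }

impl Pcg32 {
    /// Placeholder seeding (an assumption of this sketch, not a vetted routine):
    /// takes the seed as the state and forces the stream increment to be odd.
    fn new(seed: u64, stream: u64) -> Pcg32 {
        Pcg32 { state: seed, inc: (stream << 1) | 1 }
    }

    fn next_u32(&mut self) -> u32 {
        let old = self.state;
        self.state = old.wrapping_mul(6364136223846793005).wrapping_add(self.inc);
        let xorshifted = (((old >> 18) ^ old) >> 27) as u32;
        let rot = (old >> 59) as u32;
        xorshifted.rotate_right(rot)
    }

    /// Uniform integer in [0, n) by rejection. `n` must be non-zero.
    fn below(&mut self, n: u32) -> u32 {
        // 2^32 mod n values at the bottom of the range would make some
        // remainders more likely; redraw whenever we land on one of them.
        let threshold = n.wrapping_neg() % n;
        loop {
            let r = self.next_u32();
            if r >= threshold {
                return r % n;
            }
        }
    }

    /// Uniform float in [0, 1): keep the top 24 bits, the width of an f32 mantissa.
    fn next_f32(&mut self) -> f32 {
        (self.next_u32() >> 8) as f32 * (1.0 / 16_777_216.0)
    }
}
```

Two generators built from the same seed and stream produce the same sequence, which is the property replays depend on. The rejection loop almost always exits on the first draw: for small n the rejected slice is a tiny fraction of the 32-bit range.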
10 Pitfalls
Almost every bug in this tutorial is a convention mismatch wearing a costume. The symptom, and the convention behind it:
11 What's next
This is Phase 2 of Build a Game Engine. The math here runs every frame inside the loop, so the next module is The Game Loop & Time, then the Platform layer and your first triangle in Vulkan, where the clip-space conventions from §5 stop being abstract. The floating-point determinism thread, the reason a fixed timestep alone doesn't guarantee identical results across machines, continues in IEEE-754 Floating Point.
- Eric Lengyel. Foundations of Game Engine Development, Vol. 1: Mathematics. Terathon Software, 2016. The primary textbook reference for the vector, matrix, and quaternion fundamentals here.
- Tomas Akenine-Möller, Eric Haines, Naty Hoffman, et al. Real-Time Rendering, 4th ed., ch. 4 "Transforms." realtimerendering.com. Supports the row- vs column-vector conventions, transform composition, and quaternions.
- G-Truc. glm manual. github.com/g-truc/glm. Column-major storage, column vectors, right-handed and [−1,1] depth defaults; GLM_FORCE_DEPTH_ZERO_TO_ONE and the _ZO/_NO projection variants.
- Cameron Hart. glam documentation. docs.rs/glam. Column-major storage; perspective_rh ([0,1]) vs perspective_rh_gl ([−1,1]); the shear/degenerate warning on to_scale_rotation_translation.
- Matthew Wellings. "The New Vulkan Coordinate System." matthewwellings.com. Vulkan's Y-down NDC, [0,1] depth, right-handed NDC, and the negative-viewport-height Y-flip.
- The Khronos Group. Vulkan Specification, vertex post-processing (clip volume and viewport transform). docs.vulkan.org. The clip-volume bound 0 ≤ z_c ≤ w_c.
- The Khronos Group. glTF 2.0 Specification. registry.khronos.org. Right-handed, +Y up, meters and radians, node transform as T·R·S with a quaternion rotation.
- Epic Games. "Coordinate System and Spaces in Unreal Engine." dev.epicgames.com. Left-handed (left-hand rule), +Z up, world unit = 1 cm, Y inverts on right-handed import.
- Unity Technologies. "Quaternion and Euler Rotations in Unity." docs.unity3d.com. Left-handed, +Y up; rotations stored internally as quaternions with Euler shown for editing.
- "Gimbal lock." Wikipedia. en.wikipedia.org/wiki/Gimbal_lock. The loss of one DOF when the middle Euler axis reaches ±90°, as a representation artifact.
- Ken Shoemake. "Animating Rotation with Quaternion Curves." SIGGRAPH 1985, Computer Graphics 19(3), 245–254. dl.acm.org. The origin of SLERP (and SQUAD); constant-angular-velocity interpolation.
- Arseny Kapoulkine. "Approximating slerp." zeux.io. NLERP follows the SLERP arc but not at constant velocity; small-angle equivalence and the sin Ω ≈ 0 fallback.
- Jonathan Blow. "Understanding Slerp, Then Not Using It." number-none.com. The commutativity / constant-velocity / minimal-torque tradeoff and the case for NLERP in games.
- Melissa O'Neill. "PCG: A Family of Simple Fast Space-Efficient Statistically Good Algorithms for Random Number Generation." 2014. pcg-random.org. The permutation-on-LCG construction and its statistical quality.
- Paul Hsieh. "Misconceptions about rand()." azillionmonkeys.com. rand() is implementation-defined with weak low bits and a possibly small RAND_MAX; % n introduces modulo bias.