Build a Game Engine · Networking

Networking: Netcode & Rollback

Multiplayer is a fight against physics. Latency has a hard floor (the speed of light), packets are lost and reordered, and players cheat. No netcode beats those constraints; it hides them. This module is the toolkit: prediction and reconciliation to feel responsive, interpolation to look smooth, lag compensation to land hits, and rollback to make a fighting game playable across a continent.

Time: ~55 min · Level: Senior · Prereqs: The Game Loop (the fixed timestep), Floating Point (cross-machine determinism), and Compression (packing packets) · Stack: C++ & Rust

01 Why it's hard

Five constraints: latency, jitter (variance in latency), packet loss, bandwidth, and cheating^[1]. Latency has a floor you cannot beat: light in fiber travels at roughly 200,000 km/s, so New York to London (~5,500 km) is about 55 ms round-trip of pure propagation, before any routing or queuing. The job of netcode is not to remove that delay but to make the game feel good despite it. The network feeds each peer the inputs or state for the fixed tick it's about to run.
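The propagation floor is plain arithmetic, sketched below using the text's own figures. `minRoundTripMs` and `kFiberKmPerSec` are illustrative names, not part of any engine API, and the result ignores routing, queuing, and the fact that real cables do not follow the straight-line distance.

```cpp
#include <cassert>

// Approximate speed of light in optical fiber (the figure quoted in the text).
const double kFiberKmPerSec = 200000.0;

// Lower bound on round-trip time, in milliseconds, from propagation alone.
// Real paths are longer and add routing and queuing delay on top of this.
inline double minRoundTripMs(double distanceKm) {
    assert(distanceKm >= 0.0);
    const double oneWaySec = distanceKm / kFiberKmPerSec;
    return 2.0 * oneWaySec * 1000.0;
}
```

For the New York to London distance of about 5,500 km, `minRoundTripMs(5500.0)` comes out at roughly 55 ms, matching the estimate above.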

02 UDP vs TCP

Games run the realtime stream over UDP, not TCP. TCP guarantees reliable, in-order delivery, and that guarantee is the problem. A single lost packet causes head-of-line blocking: TCP won't hand over any later data that has already arrived until the retransmit lands (≥1 RTT later), freezing movement on both ends^[2].

"Never use TCP" is wrong; "never for time-critical data" is right

TCP is fine for non-realtime channels: login, matchmaking, chat, patch downloads, REST calls. The precise rule (Fiedler's) is never use TCP for time-critical data. For the realtime stream, games build a thin partial-reliability layer on UDP: sequence numbers plus an ack bitfield (each packet carries the latest received sequence and a 32-bit field covering the 32 sequences before it, so every ack is effectively sent 32 times for redundancy), and a priority queue that resends only the few messages flagged reliable, so the realtime stream is never blocked^[3].
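The ack bitfield described above can be sketched as follows. This is a minimal receiver-side illustration, not Fiedler's code: the `AckState` type, its member names, and the choice of 16-bit sequence numbers are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// True if sequence a is newer than b, tolerating 16-bit wraparound.
inline bool sequenceNewer(uint16_t a, uint16_t b) {
    const uint16_t diff = static_cast<uint16_t>(a - b);
    return diff != 0 && diff < 32768u;
}

// Receiver-side record of what has arrived. The pair (latest, bits) is what
// would be written into the header of every outgoing packet.
struct AckState {
    bool     any = false;  // has anything been received yet?
    uint16_t latest = 0;   // most recent sequence received
    uint32_t bits = 0;     // bit i set => sequence (latest - 1 - i) was received

    void onReceived(uint16_t seq) {
        if (!any) { any = true; latest = seq; return; }
        if (sequenceNewer(seq, latest)) {
            const uint16_t shift = static_cast<uint16_t>(seq - latest);
            // Slide the window forward; shifting a 32-bit value by 32 is undefined.
            bits = (shift >= 32) ? 0u : (bits << shift);
            // The old `latest` now sits (shift - 1) places behind the new one.
            if (shift <= 32) bits |= 1u << (shift - 1);
            latest = seq;
        } else {
            // A late or duplicate packet: mark it if it is still inside the window.
            const uint16_t age = static_cast<uint16_t>(latest - seq);
            if (age >= 1 && age <= 32) bits |= 1u << (age - 1);
        }
    }

    bool isAcked(uint16_t seq) const {
        if (!any) return false;
        if (seq == latest) return true;
        const uint16_t age = static_cast<uint16_t>(latest - seq);
        return age >= 1 && age <= 32 && ((bits >> (age - 1)) & 1u) != 0;
    }
};
```

Because each outgoing header repeats the state of the last 33 sequences, losing any single packet that carried an ack costs nothing: the next packet carries the same information again.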

03 The three models

Three archetypes, each correct for a different genre, along two axes: what crosses the wire (inputs vs state) and who is authoritative (peers vs a server).

| Model | Wire / authority | Genre & tradeoff |
|---|---|---|
| Deterministic lockstep | Inputs only; peers (P2P) | RTS. Tiny bandwidth, but needs perfect determinism and waits for the slowest peer |
| Client-server authoritative | State; server is truth | FPS. Cheat-resistant; needs prediction to feel responsive |
| State synchronization | Inputs + state | The middle ground: state corrects drift, so no perfect determinism needed |

04 Lockstep & determinism

In deterministic lockstep, every peer runs the identical simulation and only inputs (commands) cross the wire, so bandwidth is proportional to input size, not world-object count^[4]. Age of Empires passed commands rather than per-unit state and shipped 1,500 units on a 28.8k modem; passing state would have capped it near 250^[5]. Commands are scheduled a couple of turns ahead so transmission overlaps simulation.
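The turn gating implied here can be sketched as below: a command issued now is stamped for a later turn, and a turn may only be simulated once every peer's commands for it have arrived. `LockstepGate`, `kTurnDelay`, and the two-turn delay are hypothetical names and values chosen for illustration, not taken from Age of Empires.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

// Hypothetical delay: a command issued during turn T executes on turn T + 2,
// which gives the network two turns to deliver it while the sim keeps running.
const uint32_t kTurnDelay = 2;

struct LockstepGate {
    int peerCount;
    std::map<uint32_t, std::set<int>> arrived;  // turn -> peers heard from for that turn

    explicit LockstepGate(int peers) : peerCount(peers) { assert(peers > 0); }

    // The turn on which a command issued during `currentTurn` will run.
    static uint32_t executionTurn(uint32_t currentTurn) {
        return currentTurn + kTurnDelay;
    }

    // Record that `peer` has delivered its commands (possibly none) for `turn`.
    void onCommands(int peer, uint32_t turn) { arrived[turn].insert(peer); }

    // A turn may run only when every peer has reported for it. This is why
    // lockstep advances at the pace of the slowest peer.
    bool canSimulate(uint32_t turn) const {
        const auto it = arrived.find(turn);
        return it != arrived.end() &&
               static_cast<int>(it->second.size()) == peerCount;
    }
};
```

Peers send an (empty) command packet even on idle turns, since the gate cannot tell "no commands" from "not yet arrived" otherwise.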

Two hard costs, and determinism is mostly a float problem

Lockstep runs at the speed of the slowest, laggiest peer (everyone needs turn N's inputs before simulating turn N), and a single non-deterministic divergence desyncs everyone permanently and compounds over time. Determinism is hard primarily because of floating point across machines: transcendentals that are not correctly rounded, non-associativity under compiler/SIMD reordering, -ffast-math, and x87 vs SSE (the Floating Point tutorial covers the mechanism)^[6]. The fixes are fixed-point math or tightly controlled float. A fixed timestep is necessary but not sufficient, and using doubles doesn't fix it. This is why many engines avoid lockstep.
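One of the two fixes named above, fixed-point math, can be sketched as a 16.16 type built on integers. The `Fixed` type and its helpers are illustrative assumptions, not an engine API. Integer addition is associative and integer division is fully specified by the language, so every machine gets the same bits; overflow handling is omitted for brevity.

```cpp
#include <cassert>
#include <cstdint>

// 16.16 fixed point: the stored integer is the real value times 65536.
struct Fixed {
    int32_t raw;
    static constexpr int32_t kOne = 65536;

    static Fixed fromInt(int32_t v) { return Fixed{ v * kOne }; }
    static Fixed fromRaw(int32_t r) { return Fixed{ r }; }
};

inline Fixed operator+(Fixed a, Fixed b) { return Fixed{ a.raw + b.raw }; }

inline Fixed operator*(Fixed a, Fixed b) {
    // Widen to 64 bits so the intermediate product cannot overflow, then
    // divide (truncating toward zero) to return to 16.16.
    const int64_t wide = static_cast<int64_t>(a.raw) * static_cast<int64_t>(b.raw);
    return Fixed{ static_cast<int32_t>(wide / Fixed::kOne) };
}

inline bool operator==(Fixed a, Fixed b) { return a.raw == b.raw; }
```

The price is range and precision: 16.16 covers roughly ±32,768 with steps of 1/65536, so world units have to be chosen to fit.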

05 Prediction & reconciliation

Under an authoritative server, waiting for the round-trip means input lag equal to the full RTT. Client-side prediction applies your own input immediately to your local copy, tagging each input with a sequence number and storing it in a pending buffer^[7]. That lets the client and the server disagree, which is the problem reconciliation fixes.

Reconciliation replays, it does not just snap

The server's state update carries the sequence number of the last input it processed. On receipt the client: (1) snaps to the authoritative state, (2) discards pending inputs up to that ack, (3) replays the still-unacknowledged inputs on top. Replaying is what keeps your predicted position correct when the server agrees; you only see a correction (a rubber-band) when it genuinely disagreed (a misprediction, e.g. you got shoved). Prediction makes your own character responsive; it does nothing for other players (that's interpolation, §6).

In the interactive demo, the character follows a moving input as you raise the latency. With no prediction it lags behind your input by the network latency. Prediction with no reconciliation feels instant but never corrects its mispredictions, so it drifts off the authoritative position and never recovers. Reconciliation is the half that makes prediction usable: it snaps the character back onto authority and replays your unacked inputs.

In the demo, the blue ring is where the server says you actually are; the green dot is what your client draws; the colored bar between them is the prediction error. No prediction: the character renders the late server state, so it trails your input by the network latency. Prediction only: your own input is applied instantly, but the server keeps applying input the client could not predict (another player shoved you, or the server resolved a collision you had not seen), and with no reconciliation that error accumulates: the green dot drifts off the blue ring and never recovers. Prediction + reconciliation: same instant response, but on every server update the client snaps to authority and replays your unacked inputs, so the error collapses back toward zero. The snap is invisible when the server agreed and pops as a rubber-band only when it genuinely disagreed. The demo keys its input buffer by timestamp; the code below keys the same snap-and-replay by sequence number.

Client prediction + reconciliation (replay, not snap)

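#include <cstdint>
#include <deque>

// Input, State, applyInput(), and sendToServer() are defined elsewhere in the engine.
std::deque<Input> pendingInputs;          // unacknowledged inputs, in order
uint32_t inputSequence = 0;
State predictedState;                       // what we render locally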

void onLocalInput(float dx, float dt) {
    Input input{ ++inputSequence, dx, dt };
    applyInput(predictedState, input);      // PREDICT: apply now, don't wait for the server
    pendingInputs.push_back(input);          // keep it until acked
    sendToServer(input);
}
void onServerState(const State& serverState, uint32_t lastProcessedInput) {
    predictedState = serverState;            // RECONCILE 1: snap to authority
    while (!pendingInputs.empty() && pendingInputs.front().sequence <= lastProcessedInput)
        pendingInputs.pop_front();           // drop what the server already ran
    for (const Input& input : pendingInputs)
        applyInput(predictedState, input);  // RECONCILE 2: REPLAY unacked inputs
}

fn on_local_input(&mut self, dx: f32, dt: f32) {
    self.input_sequence += 1;
    let input = Input { sequence: self.input_sequence, dx, dt };
    apply_input(&mut self.predicted_state, &input);  // PREDICT immediately
    send_to_server(&input);                          // mirrors sendToServer in the C++ (defined elsewhere)
    self.pending_inputs.push_back(input);            // keep it until acked
}
fn on_server_state(&mut self, server_state: State, last_processed: u32) {
    self.predicted_state = server_state;             // RECONCILE: snap to authority
    while self.pending_inputs.front().is_some_and(|i| i.sequence <= last_processed) {
        self.pending_inputs.pop_front();             // drop acked
    }
    for input in &self.pending_inputs {
        apply_input(&mut self.predicted_state, input);  // REPLAY unacked
    }
}

06 Interpolation

You can't predict other players (they stop, turn, and accelerate unpredictably). Instead, buffer their timestamped snapshots and render them at now − interpolationDelay, between the two snapshots that bracket that render time^[8]. Source's default is cl_interp 0.1 = 100 ms of view delay, sized so a single lost snapshot still leaves two to interpolate between^[9].

You see others in the past, deliberately

Interpolation delay is latency you add on purpose for smoothness: a tradeoff, not a flaw. The client interpolates between two real snapshots rather than extrapolating, so it only coasts (extrapolates, and can overshoot) when packets are lost and the buffer runs dry. It is also coupled to lag compensation: the server subtracts this same delay when it rewinds time to validate your shots (§8).
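Rendering a remote entity at now − interpolationDelay can be sketched as a search for the two bracketing snapshots followed by a linear blend. The `Snapshot` layout (a timestamp and a single coordinate) and the function name are assumptions for the example; a real entity would carry a full transform.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

struct Snapshot {
    double timeSec;  // server timestamp of this snapshot
    float  x;        // one coordinate stands in for the whole transform
};

// Samples the entity at (nowSec - interpDelaySec). `buffer` must be ordered
// oldest first. Returns false when no pair of snapshots brackets the render
// time, in which case the caller has to hold the last pose or extrapolate.
inline bool sampleInterpolated(const std::deque<Snapshot>& buffer,
                               double nowSec, double interpDelaySec,
                               float& outX) {
    assert(interpDelaySec >= 0.0);
    const double renderTime = nowSec - interpDelaySec;
    for (std::size_t i = 0; i + 1 < buffer.size(); ++i) {
        const Snapshot& a = buffer[i];
        const Snapshot& b = buffer[i + 1];
        if (a.timeSec <= renderTime && renderTime <= b.timeSec) {
            const double span = b.timeSec - a.timeSec;
            const double t = span > 0.0 ? (renderTime - a.timeSec) / span : 0.0;
            outX = static_cast<float>(a.x + (b.x - a.x) * t);
            return true;
        }
    }
    return false;
}
```

With snapshots 50 ms apart and a 100 ms delay, the render time normally sits two snapshots behind the newest one, which is why one lost snapshot still leaves a bracketing pair.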

The demo shows a remote entity moving: raw snapshots teleport, interpolation is smooth but lags, and dropped packets make the entity coast.

07 Snapshots & delta

A client-server engine takes a snapshot of world state each tick and sends each client the delta against the last snapshot that client acknowledged^[10]. Quake 3 keeps the last 32 snapshots per client and deltas against the client's last acked one; if none is acked (heavy loss), it deltas against a zeroed baseline, which is just a full update. A lost ack self-heals: the next delta is computed from an older baseline (bigger, but correct), so loss does not necessarily force a full resend.
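The baseline choice can be sketched with a toy world of four integer fields. Everything here (`ClientChannel`, `Delta`, the field count) is a hypothetical illustration of the idea, not Quake 3's data layout; only the 32-entry ring size is taken from the text.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

const int kFields  = 4;   // hypothetical world: four integer fields
const int kHistory = 32;  // ring size, the Quake 3 figure quoted in the text
using Snap = std::array<int32_t, kFields>;

struct Delta {
    uint32_t changedMask = 0;  // bit i set => field i differs from the baseline
    Snap     values{};         // only entries flagged in changedMask are meaningful
};

struct ClientChannel {
    std::array<Snap, kHistory> history{};  // sent snapshots, indexed by sequence % kHistory
    int lastAcked = -1;                    // -1: the client has acked nothing yet

    Delta makeDelta(int sequence, const Snap& current) {
        assert(sequence >= 0);
        // Baseline is the last ACKED snapshot, never simply the last one sent.
        // With no ack, or an ack that has fallen out of the ring, the baseline
        // stays zeroed, which makes the delta a full update.
        Snap baseline{};
        if (lastAcked >= 0 && sequence - lastAcked < kHistory)
            baseline = history[lastAcked % kHistory];
        history[sequence % kHistory] = current;

        Delta d;
        for (int i = 0; i < kFields; ++i) {
            if (current[i] != baseline[i]) {
                d.changedMask |= 1u << i;
                d.values[i] = current[i];
            }
        }
        return d;
    }

    void onAck(int sequence) { if (sequence > lastAcked) lastAcked = sequence; }
};
```

A real packet would also name the baseline sequence so the client knows what to apply the delta to; that field is left out here to keep the sketch short.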

Quantize and bit-pack the fields

Then quantize each field to the bits it needs (cross-ref Bit Shifting and Compression): map a position float over a known range to an integer. Fiedler's example packs x, y, z into 18/18/14 bits (~2 mm precision) instead of 96, and orientation into a smallest-three quaternion (2 bits for the largest-component index + 3×9 bits = 29 bits vs 128)^[11]. Quantization is lossy but bounded: picking the range and precision is the design choice, and out-of-range values clamp.
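Bounded quantization of a single float can be sketched as below. The function names are illustrative, and the ±256 m range used in the usage note is an assumed example range, not necessarily the one in Fiedler's article.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Maps `value` in [minV, maxV] onto an unsigned code of `bits` bits.
// Values outside the range clamp to the nearest end.
inline uint32_t quantize(float value, float minV, float maxV, int bits) {
    assert(bits > 0 && bits < 32 && maxV > minV);
    const double lo = minV, hi = maxV;
    const double maxCode = static_cast<double>((1u << bits) - 1u);
    const double clamped = std::max(lo, std::min(static_cast<double>(value), hi));
    const double normalized = (clamped - lo) / (hi - lo);  // 0..1
    return static_cast<uint32_t>(std::lround(normalized * maxCode));
}

// Inverse mapping. The round trip is lossy: the error is at most half a step,
// where one step is (maxV - minV) / (2^bits - 1).
inline float dequantize(uint32_t code, float minV, float maxV, int bits) {
    assert(bits > 0 && bits < 32 && maxV > minV);
    const double lo = minV, hi = maxV;
    const double maxCode = static_cast<double>((1u << bits) - 1u);
    return static_cast<float>(lo + (hi - lo) * (static_cast<double>(code) / maxCode));
}
```

Over an assumed range of −256 m to +256 m, 18 bits gives a step of 512 / 262,143, or about 2 mm, consistent with the precision quoted above. Packing x, y, z at 18/18/14 bits costs 50 bits against 96 for three raw floats.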

08 Lag compensation

When validating a hit, the authoritative server rewinds the other players to where the shooter saw them. Valve estimates the shooter's view time as Command Execution Time = Current Server Time − Packet Latency − Client View Interpolation. Note that it subtracts the interpolation delay from §6: the two systems are coupled^[9]. The server keeps about 1 second of position history, moves the candidates back, tests the hit, and restores them.
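The rewind step can be sketched as Valve's formula followed by a lookup in the stored history. The `PositionRecord` type, the single coordinate, and the choice to interpolate between stored records are assumptions for illustration; the hit test itself and the restore afterwards are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

struct PositionRecord {
    double timeSec;  // server time at which this position was recorded
    float  x;        // one coordinate stands in for the full hitbox pose
};

// The estimate of when the shooter's view was current, as given in the text.
inline double commandExecutionTime(double serverNowSec, double packetLatencySec,
                                   double clientInterpSec) {
    return serverNowSec - packetLatencySec - clientInterpSec;
}

// Position of a target at `timeSec`, from a non-empty history ordered oldest
// first. Times outside the stored window clamp to the nearest record, which
// bounds how far back a shot can reach.
inline float rewindPosition(const std::deque<PositionRecord>& history, double timeSec) {
    assert(!history.empty());
    if (timeSec <= history.front().timeSec) return history.front().x;
    if (timeSec >= history.back().timeSec)  return history.back().x;
    for (std::size_t i = 0; i + 1 < history.size(); ++i) {
        const PositionRecord& a = history[i];
        const PositionRecord& b = history[i + 1];
        if (timeSec <= b.timeSec) {
            const double t = (timeSec - a.timeSec) / (b.timeSec - a.timeSec);
            return static_cast<float>(a.x + (b.x - a.x) * t);
        }
    }
    return history.back().x;
}
```

With a server time of 10.0 s, 150 ms of packet latency, and the 100 ms interpolation delay, the server tests the shot against where the target stood at about 9.75 s.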

It favors the shooter, by design

Lag compensation makes you hit what your screen showed, at the cost of the target: you can be killed by an attacker you can no longer see because you already ducked behind cover, while on the shooter's machine you were still exposed. Valve is explicit that this "can't be solved in general because of the relatively slow packet speeds." It's a deliberate tradeoff (shooter feel vs target fairness), not a bug, and implementations typically rewind only players/hitboxes with bounded history.

09 Rollback

Rollback netcode (GGPO, the fighting-game standard) runs a deterministic sim and predicts the remote player's input (assume they keep doing what you last heard, "carry-forward"), simulating forward immediately so the game feels offline-responsive. When the real input arrives and differs, it rolls back to the saved state at that frame and re-simulates forward to the present with the corrected input^[12].

Not lag-free, and it demands two things at once

Rollback hides remote latency but doesn't erase it: a misprediction produces a visible correction or teleport. It requires both a fully deterministic sim and the ability to save/restore the entire game state cheaply every frame (usually one contiguous struct memcpy). The cost scales with the misprediction window (latency in frames, minus any input delay): at 60 fps with three frames of input delay and a 300 ms tolerance you may re-simulate up to 15 frames inside one 16.6 ms display frame, which becomes its own spiral of death if the resim exceeds budget^[13]. It predicts the remote input; your own is applied directly. Best fit: 2-player P2P with small state, not a 64-player authoritative shooter.
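The 15-frame figure follows from simple arithmetic, sketched below. The function names are illustrative, and the per-step budget assumes, as a simplification, that the re-simulated frames plus the present frame share one display frame evenly and that rendering costs nothing.

```cpp
#include <cassert>

// Worst-case number of frames to re-simulate: the latency tolerance expressed
// in frames, minus the frames already absorbed by input delay.
inline int maxRollbackFrames(int fps, int toleranceMs, int inputDelayFrames) {
    assert(fps > 0 && toleranceMs >= 0 && inputDelayFrames >= 0);
    const int toleranceFrames = toleranceMs * fps / 1000;
    const int frames = toleranceFrames - inputDelayFrames;
    return frames > 0 ? frames : 0;
}

// Time available per simulation step, in milliseconds, if the whole resim
// plus the present frame must fit inside a single display frame.
inline double perStepBudgetMs(int fps, int resimFrames) {
    assert(fps > 0 && resimFrames >= 0);
    const double frameMs = 1000.0 / fps;
    return frameMs / (resimFrames + 1);  // +1 for the present frame itself
}
```

At 60 fps, 300 ms is 18 frames; subtracting 3 frames of input delay leaves 15, and sixteen steps in one display frame leaves roughly 1 ms per step under these assumptions.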

The demo predicts the remote input; when a real input arrives that differs, the sim rolls back N frames and re-simulates. Raising latency grows the rollback window.

The rollback core (predict remote, roll back, re-simulate)

GameState savedStates[MAX_ROLLBACK];        // ring of per-frame saves (must be cheap to copy)
Input remoteInputs[MAX_ROLLBACK];            // remote input each frame was simulated with (real or predicted)
int confirmedFrame = -1;                    // last frame with a REAL remote input

Input predictRemote() {                       // carry-forward: keep doing what we last heard
    return confirmedFrame >= 0 ? remoteInputs[confirmedFrame % MAX_ROLLBACK] : Input{};
}
void step(int frame, Input local, Input remote) {
    savedStates[frame % MAX_ROLLBACK] = currentState;     // SAVE before stepping
    remoteInputs[frame % MAX_ROLLBACK] = remote;          // remember what this frame was simulated with
    simulate(currentState, local, remote);
}
void advance(int frame, Input local) {
    step(frame, local, predictRemote());                  // step with the PREDICTED remote input
}
void onRemoteInput(int frame, Input real) {               // assumes in-order arrival, frame <= presentFrame
    const bool predictedRight = (real == remoteInputs[frame % MAX_ROLLBACK]);
    remoteInputs[frame % MAX_ROLLBACK] = real;
    confirmedFrame = frame;                               // predictRemote() now carries `real` forward
    if (predictedRight) return;                           // prediction held: nothing to redo
    currentState = savedStates[frame % MAX_ROLLBACK];     // ROLL BACK to that saved frame
    for (int f = frame; f <= presentFrame; ++f)           // RE-SIMULATE forward to now, re-saving each frame
        step(f, localInputAt(f), predictRemote());        // the visible pop happens here
}

fn predict_remote(&self) -> Input {                // carry-forward prediction
    self.remote[self.confirmed.max(0) as usize % MAX_ROLLBACK]
}
fn step(&mut self, frame: usize, local: Input, remote: Input, state: &mut GameState) {
    self.saved[frame % MAX_ROLLBACK] = *state;          // SAVE before stepping
    self.remote[frame % MAX_ROLLBACK] = remote;         // remember what this frame was simulated with
    simulate(state, local, remote);
}
fn advance(&mut self, frame: usize, local: Input, state: &mut GameState) {
    let predicted = self.predict_remote();
    self.step(frame, local, predicted, state);          // step with predicted remote
}
fn on_remote_input(&mut self, frame: usize, real: Input,
                   state: &mut GameState, present: usize, local_at: impl Fn(usize) -> Input) {
    let predicted_right = real == self.remote[frame % MAX_ROLLBACK];
    self.remote[frame % MAX_ROLLBACK] = real;
    self.confirmed = frame as i32;                      // predict_remote() now carries `real` forward
    if predicted_right { return; }                      // prediction held: nothing to redo
    *state = self.saved[frame % MAX_ROLLBACK];          // ROLL BACK
    for f in frame..=present {                          // RE-SIMULATE to present, re-saving each frame
        let remote = self.predict_remote();
        self.step(f, local_at(f), remote, state);
    }
}

10 Choosing a model

No model is universally best; each is chosen for its constraints:

Lockstep → RTS: huge unit counts, P2P, bandwidth-bound, can pay the determinism price.
Client-server + prediction/interpolation/lag-comp → FPS: needs an authoritative, cheat-resistant server.
Rollback → fighting games: 2-player P2P, small deterministic state, latency-hiding above all.

The engine seam

The network thread receives and parses packets and hands inputs/snapshots to the simulation thread over a lock-free SPSC queue, so wire I/O never stalls the fixed-timestep loop. Every model here sits on top of that fixed tick and the determinism discipline from the Floating Point tutorial.
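A lock-free SPSC queue of the kind described can be sketched as a fixed-size ring with one atomic index per side. This is a common textbook layout under the assumption of exactly one producer thread and one consumer thread; `SpscQueue` and its method names are illustrative, and cache-line padding between the indices is left out for brevity.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Single-producer, single-consumer ring buffer. One slot is always left empty
// so that "full" and "empty" can be told apart, giving Capacity - 1 usable slots.
template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert(Capacity >= 2, "need at least one usable slot");
public:
    // Call from the producer (network) thread only.
    bool tryPush(const T& item) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % Capacity;
        if (next == tail_.load(std::memory_order_acquire)) return false;  // full
        slots_[head] = item;
        head_.store(next, std::memory_order_release);  // publish the written slot
        return true;
    }

    // Call from the consumer (simulation) thread only.
    bool tryPop(T& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) return false;  // empty
        out = slots_[tail];
        tail_.store((tail + 1) % Capacity, std::memory_order_release);  // free the slot
        return true;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};  // next slot to write; owned by the producer
    std::atomic<std::size_t> tail_{0};  // next slot to read; owned by the consumer
};
```

Neither side ever blocks: when the queue is full the network thread can drop or coalesce, and when it is empty the simulation simply runs the tick with what it already has.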

Two points to keep straight: reconciliation replays unacked inputs rather than just snapping (and prediction is not interpolation), and rollback isn't lag-free: it needs determinism plus cheap state save/restore, so it fits 2-player P2P, not a 64-player authoritative shooter.

11 Pitfalls

| Pitfall | Correction |
|---|---|
| "Never use TCP" | Never for time-critical data. TCP is fine for login/chat/downloads. |
| Lockstep sending state | It sends inputs; that's the bandwidth win. State is the snapshot model. |
| "Fixed timestep gives determinism" | Necessary, not sufficient. Floats diverge across machines. |
| Reconciliation that only snaps | Replay the unacked inputs, or you stutter every packet. |
| Predicting other players | Interpolate them (render the past); predict only your own. |
| Delta vs the latest sent snapshot | Delta against the last ACKED snapshot, not the latest sent. |
| "Rollback is lag-free" | It hides latency; mispredictions pop. Needs determinism + cheap save/load. |
| One model for everything | Lockstep/RTS, client-server/FPS, rollback/fighting. Pick per constraints. |

12 What's next

Players can share a world. One engine-systems module remains before the finale: Tooling, dev UI and profiling, the in-engine tools to see and tune everything you've built. Then the 3D-game capstone assembles the whole engine into a game. The full path is on the series hub.

[1] Glenn Fiedler. "What Every Programmer Needs To Know About Game Networking." gafferongames.com. The P2P→client-server history, the lockstep-waits-for-the-slowest-peer point, and that one tiny difference desyncs everyone.
[2] Glenn Fiedler. "UDP vs. TCP." gafferongames.com. Head-of-line blocking; "never use TCP for time-critical data"; TCP for non-critical services.
[3] Glenn Fiedler. "Reliable Ordered Messages." gafferongames.com. Sequence numbers + the 32-bit ack bitfield, and priority-based partial reliability over UDP.
[4] Glenn Fiedler. "Deterministic Lockstep." gafferongames.com. Inputs-only, bandwidth proportional to input not object count, and the cross-platform determinism warning.
[5] Paul Bettner and Mark Terrano. "1500 Archers on a 28.8: Network Programming in Age of Empires and Beyond." GDC 2001. gamedeveloper.com. The lockstep RTS case: pass commands, 1,500 units, run as fast as the slowest machine.
[6] Glenn Fiedler. "Floating Point Determinism." gafferongames.com. Why cross-machine float results diverge and the forced-precision / wrapped-transcendental fixes (also cited in the Floating Point tutorial).
[7] Gabriel Gambetta. "Client-Side Prediction and Server Reconciliation." gabrielgambetta.com. Per-input sequence numbers, the server echoing the last-processed input, and replaying unacked inputs.
[8] Gabriel Gambetta. "Entity Interpolation." gabrielgambetta.com. Rendering other entities in the past by interpolating between two bracketing snapshots.
[9] Valve. "Source Multiplayer Networking" / "Lag Compensation." developer.valvesoftware.com. The 100 ms cl_interp, the Command-Execution-Time formula, the 1 s history, and the favor-the-shooter "behind cover" tradeoff.
[10] Fabien Sanglard. "Quake 3 Source Code Review: Network." fabiensanglard.net. The 32-snapshot ring, delta against the last acked snapshot, and the zeroed baseline as a full update.
[11] Glenn Fiedler. "Snapshot Compression." gafferongames.com. Position quantization over a bounded range and the smallest-three quaternion.
[12] GGPO. ggpo.net (open source: github.com/pond3r/ggpo). Input prediction + speculative execution and re-simulation from the point of divergence; the deterministic-sim + save/load requirement.
[13] SnapNet. "Netcode Architectures Part 2: Rollback." snapnet.dev. State as a contiguous memcpy, carry-forward prediction, input decay, and the 15-frames-in-16.6 ms resim budget.