Build a Game Engine · The Runtime Spine

Input Systems

Reading a key looks trivial until you ship to a player on an AZERTY keyboard, a drifting thumbstick, or a 1000 Hz mouse. Good input is a small pipeline: drain the OS events, fold them into a per-frame snapshot, shape the analog values, and resolve physical inputs to game actions. We build that pipeline in C++ and Rust, and fix the dead-zone bug almost every first engine ships.

Time: ~50 min
Level: Beginner to mid
Prereqs: The Game Loop (input is sampled per frame) and Platform & Window (events come from there).
Stack: C++ (SDL3) · Rust (gilrs / winit)

01 Where input fits

Two delivery models coexist, and a good engine uses both. The platform layer pushes discrete events (key down, key up, mouse moved, gamepad connected) onto a queue; the game loop also samples current device state once per frame (is W held right now? what's the left-stick vector?)^[1].

The usual shape: at the top of each frame, drain the event queue (this is where the platform layer hands you events), fold the transitions into a per-frame snapshot plus an event list, and let gameplay read the snapshot. Events are authoritative for transitions (a tap shorter than a frame, the exact down/up); polling is convenient for "held right now."
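
The fold step can be sketched in a few lines. This is a minimal illustration, not the engine's actual types: `KeyEvent`, `InputSnapshot`, `drain`, and the 512-key bound are hypothetical names and values, and a real engine would translate SDL or winit events into something like `KeyEvent` first.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical platform-neutral event (an assumption for this sketch).
struct KeyEvent {
    std::size_t scancode;  // physical key position
    bool down;             // true = key down, false = key up
};

constexpr std::size_t kMaxKeys = 512;  // assumed upper bound on scancodes

struct InputSnapshot {
    std::bitset<kMaxKeys> held;      // down after the last drained event
    std::bitset<kMaxKeys> pressed;   // went down at least once this frame
    std::bitset<kMaxKeys> released;  // went up at least once this frame

    // Call once at the top of the frame, before draining events.
    void beginFrame() {
        pressed.reset();
        released.reset();
    }

    // Fold one transition into the snapshot.
    void apply(const KeyEvent& e) {
        assert(e.scancode < kMaxKeys);
        if (e.down) {
            // A key-down for a key that is already held (OS auto-repeat)
            // does not count as a new press.
            if (!held.test(e.scancode)) pressed.set(e.scancode);
            held.set(e.scancode);
        } else {
            if (held.test(e.scancode)) released.set(e.scancode);
            held.reset(e.scancode);
        }
    }
};

// Stand-in for draining the platform queue at the top of a frame.
inline void drain(InputSnapshot& snap, const std::vector<KeyEvent>& queue) {
    snap.beginFrame();
    for (const KeyEvent& e : queue) snap.apply(e);
}
```

A tap that goes down and up inside one frame leaves `held` clear but sets both `pressed` and `released`, which is the transition that pure polling would have missed.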

What you'll have by the end

A per-frame input snapshot fed by the event stream; movement bound to physical key positions so it survives non-QWERTY layouts; a gamepad path that works across XInput, DirectInput, and Linux; a correct scaled-radial dead zone with a response curve; an action map that decouples bindings from gameplay; and an input buffer that makes controls feel right.

02 Scancodes vs keycodes

A scancode identifies the physical key position, independent of layout. A keycode (virtual key) names the symbol the key produces under the current layout. SDL is explicit: scancodes "reference the physical location on the keyboard," while a keycode "is based on the keyboard layout"^[2].

Bind movement to scancodes, shortcuts to keycodes

If you bind WASD to keycodes, a French AZERTY player presses ZQSD to move, because the keys at the W/A positions produce Z/Q. Bind movement to scancodes and the keys under the left hand stay put on any layout. Symbolic shortcuts ("press I for inventory") want keycodes, so the letter matches the prompt^[3]. winit draws the same line: physical_key (a KeyCode, the position) vs logical_key (a Key, the meaning), and its logical key is deliberately not affected by Ctrl, so Ctrl+C still reports the character "c"^[4].

One more keyboard detail: when a key is held, the OS emits auto-repeat key-down events after a delay. That's a typing convenience. For gameplay you ignore it: one-shot actions fire only when the event is not a repeat (SDL's repeat flag, winit's KeyEvent.repeat), and held movement comes from polling state per frame, not from repeat events^[4].

03 Raw mouse & pointer lock

There are two mouse signals. The cursor position (for UI) is filtered through OS pointer acceleration and the Control Panel mouse-speed slider. Mouselook wants the opposite: raw, unaccelerated deltas.

On Windows, the Raw Input API delivers WM_INPUT messages with relative deltas that, unlike WM_MOUSEMOVE, are "not subject to the effects of mouse speed set in Control Panel"^[5]. For an FPS camera you also lock the cursor: winit's CursorGrabMode::Locked pins it in place (you hide it yourself), versus Confined, which only keeps it inside the window^[6].

Accumulate deltas, don't overwrite

A 1000 Hz mouse delivers many move events per frame. If you store only the latest one, you throw away most of the motion. Sum the deltas across the frame. Raw input's value isn't lower latency; it's that the deltas are unaccelerated and unfiltered, which is what a camera wants.
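
A minimal sketch of the accumulate-then-consume pattern. `MouseAccumulator` and `MouseDelta` are hypothetical names; the platform layer would call `add` for every relative-motion event and the camera would call `take` once per frame.

```cpp
struct MouseDelta {
    float dx = 0.0f;
    float dy = 0.0f;
};

class MouseAccumulator {
public:
    // Called for every relative-motion event delivered during the frame.
    void add(float dx, float dy) {
        sum_.dx += dx;
        sum_.dy += dy;
    }

    // Called once per frame: returns the summed motion and resets to zero.
    MouseDelta take() {
        MouseDelta out = sum_;
        sum_ = MouseDelta{};
        return out;
    }

private:
    MouseDelta sum_;
};
```

Resetting inside `take` means a frame with no mouse events reads as zero motion instead of replaying the previous frame's delta.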

04 Gamepads

Think of one standard gamepad (two sticks, two triggers, a d-pad, face and shoulder buttons) and let a library map every real device onto it. The reason the library matters is the backend zoo underneath.

On Windows, XInput is the modern Xbox-style API with the two triggers as independent axes. Legacy DirectInput, for the standard gamepad mapping, often combines both triggers onto one shared axis; Microsoft notes this "does mean it is not possible to see all possible trigger combination values through DirectInput... To test the trigger values separately, you must use XInput"^[7]. SDL hides all of it behind a standard model with separate left and right trigger axes, using a community mapping database^[8]; Rust's gilrs does the same and, helpfully, names face buttons by position (South/East/North/West) to sidestep the Xbox-vs-Nintendo A/B swap^[9].

Read a gamepad: poll for connect, sample state per frame

// SDL3: axes are int16 (-32768..32767); triggers are 0..32767.
// `gamepad` is an SDL_Gamepad* that persists across frames and starts as nullptr.
SDL_Event event;
while (SDL_PollEvent(&event)) {
    if (event.type == SDL_EVENT_GAMEPAD_ADDED && !gamepad)   // keep the first pad only
        gamepad = SDL_OpenGamepad(event.gdevice.which);   // standard-mapped handle
}
if (gamepad) {
    // -32768 / 32767 lands just below -1, so clamp (std::max is in <algorithm>).
    float rawX = std::max(SDL_GetGamepadAxis(gamepad, SDL_GAMEPAD_AXIS_LEFTX) / 32767.0f, -1.0f);
    float rawY = std::max(SDL_GetGamepadAxis(gamepad, SDL_GAMEPAD_AXIS_LEFTY) / 32767.0f, -1.0f);
    bool jump = SDL_GetGamepadButton(gamepad, SDL_GAMEPAD_BUTTON_SOUTH);
}

// gilrs returns already-normalized f32 in [-1, 1].
while let Some(Event { id, event, .. }) = gilrs.next_event() {
    if let EventType::Connected = event { active = Some(id); }
}
if let Some(pad) = active.map(|id| gilrs.gamepad(id)) {
    let raw_x = pad.value(Axis::LeftStickX);   // already -1.0..1.0
    let raw_y = pad.value(Axis::LeftStickY);
    let jump  = pad.is_pressed(Button::South);  // South == A on an Xbox pad
}

05 Dead zones

A thumbstick never rests at exactly zero, so you must ignore a small region near center. How you ignore it is the single most-botched piece of gamepad code, and the canonical reference is Josh Sutphin's article^[10].

Axial (per-axis): zero each axis independently. This makes a square dead zone that clips diagonals: a gentle diagonal can fall inside the square on both axes and read as zero, or snap to a cardinal. Correct only for grid or 4-way movement.
Radial: zero by vector magnitude. A circular dead zone, no cardinal snapping, but the output jumps from 0 to a finite value the instant you cross the boundary.
Scaled radial (the right default for free aim): after the radial cut, re-scale so output ramps from 0 at the boundary to 1 at full deflection.

eq. 1 · scaled-radial dead zone: $\mathbf{v}_{\text{out}} = \hat{\mathbf{v}} \cdot \dfrac{|\mathbf{v}| - \text{deadzone}}{1 - \text{deadzone}}$

Take the raw stick magnitude, subtract the dead-zone radius, then divide by what's left of the range so the output ramps from 0 right at the boundary to 1 at full deflection. Multiplying by the unit direction v̂ keeps the aim direction while the fraction supplies the new length. Without the subtract-and-divide the stick would jump straight to a finite value at the boundary; this is the re-scale that lets you walk slowly.

Two traps: the square clips diagonals, and you must re-scale

The per-axis square dead zone is the classic bug. And even with a radial cut, forgetting to re-scale means the stick jumps to a finite output the moment it leaves the dead zone, so slow movement is impossible. The formula above is the fix. XInput even publishes recommended dead-zone constants (7849 for the left stick, on a ±32767 range), but those are starting points to tune, not physics^[7].
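
The three shapes side by side, as a sketch under stated assumptions: `Vec2`, `magnitude`, and the three function names are illustrative, inputs are already normalized to [-1, 1], and the clamp to unit length is an added safeguard for sticks whose raw diagonal exceeds 1.

```cpp
#include <cmath>

struct Vec2 {
    float x;
    float y;
};

inline float magnitude(Vec2 v) { return std::sqrt(v.x * v.x + v.y * v.y); }

// Axial: each axis is zeroed independently, which makes a square dead zone.
inline Vec2 axialDeadZone(Vec2 v, float dz) {
    return { std::fabs(v.x) < dz ? 0.0f : v.x,
             std::fabs(v.y) < dz ? 0.0f : v.y };
}

// Radial: zero by magnitude, no re-scale, so output jumps at the boundary.
inline Vec2 radialDeadZone(Vec2 v, float dz) {
    return magnitude(v) < dz ? Vec2{0.0f, 0.0f} : v;
}

// Scaled radial (eq. 1): direction * (|v| - dz) / (1 - dz), clamped to length 1.
inline Vec2 scaledRadialDeadZone(Vec2 v, float dz) {
    const float mag = magnitude(v);
    if (mag <= dz || dz >= 1.0f) return {0.0f, 0.0f};
    const float scaled = std::fmin((mag - dz) / (1.0f - dz), 1.0f);
    return { v.x / mag * scaled, v.y / mag * scaled };
}

// XInput's suggested left-stick constant, normalized; a starting point to tune.
constexpr float kLeftStickDeadZone = 7849.0f / 32767.0f;
```

With a dead zone of 0.25, the gentle diagonal (0.2, 0.2) has magnitude of about 0.28: the axial version zeroes it, the radial version passes it through unchanged, and the scaled-radial version returns a small vector in the same direction.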

Three common misreadings: a bigger square dead zone worsens diagonal clipping rather than fixing it; a response curve changes sensitivity, not snapping; and scancodes and keycodes both carry key up/down state, differing only in physical-versus-symbolic meaning.

06 Response curves & smoothing

Two more shaping steps run on the post-dead-zone magnitude. A response curve (output = magnitude^k, k > 1) gives finer control near center while keeping the full range at the edge. A low-pass filter smooths jitter.
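
A sketch of the response curve applied to a 2D stick. `applyResponseCurve` and `Vec2` are hypothetical names, and the input is assumed to be the post-dead-zone vector with magnitude in [0, 1]. The curve is applied to the magnitude, not to each axis, so the aim direction is preserved.

```cpp
#include <cmath>

struct Vec2 {
    float x;
    float y;
};

// Raise the stick magnitude to the power k while keeping its direction.
inline Vec2 applyResponseCurve(Vec2 v, float k) {
    const float mag = std::sqrt(v.x * v.x + v.y * v.y);
    if (mag <= 0.0f) return {0.0f, 0.0f};
    const float curved = std::pow(std::fmin(mag, 1.0f), k);  // magnitude^k
    const float scale = curved / mag;                         // new length / old length
    return { v.x * scale, v.y * scale };
}
```

With k = 2, half deflection maps to a quarter of full output while full deflection still maps to 1, which is the finer control near center that the curve is for.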

The one-pole low-pass is a single line, an exponentially weighted moving average^[16]:

eq. 2 · one-pole low-pass: $y_i = \alpha \cdot x_i + (1 - \alpha) \cdot y_{i-1}, \quad 0 \le \alpha \le 1$

Each frame's smoothed value is a blend of the new reading and the last smoothed value, weighted by α. At α = 1 there's no smoothing (output equals input); push α toward 0 and the output leans on its own history, so it's smoother but trails the input. The subscript i−1 just means 'the value from the previous frame'.

Smoothing trades latency for smoothness

A smaller α smooths harder but makes the filtered signal lag the input more. Use it for drifty sticks or gyro; avoid it on anything where input lag is felt directly, like a mouselook camera. And if your timestep varies, α has to be frame-rate compensated or the smoothing strength drifts with the frame rate. Engines expose both of these as per-binding modifiers (Unreal's Input Modifiers, Unity's Processors)^[12]^[13].
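
One common way to do the frame-rate compensation is to derive α from the frame time and a time constant, α = 1 − exp(−dt / τ). The sketch below assumes that approach; `SmoothedAxis` and the `tau` parameter are hypothetical names, not an API from any of the engines mentioned.

```cpp
#include <cmath>

// One-pole low-pass (eq. 2) with alpha recomputed from the frame time.
// tau is the time constant in seconds: larger tau = heavier smoothing, more lag.
class SmoothedAxis {
public:
    explicit SmoothedAxis(float tauSeconds) : tau_(tauSeconds) {}

    float update(float x, float dtSeconds) {
        // A longer frame gives the new sample more weight; tau <= 0 disables smoothing.
        const float alpha = tau_ > 0.0f ? 1.0f - std::exp(-dtSeconds / tau_) : 1.0f;
        y_ = alpha * x + (1.0f - alpha) * y_;
        return y_;
    }

    float value() const { return y_; }

private:
    float tau_;
    float y_ = 0.0f;
};
```

For a constant input, one 1/30 s step and two 1/60 s steps land on the same smoothed value, which is the property a fixed α lacks.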

07 Action maps

Don't let gameplay name keys. Let it ask "is Jump triggered?" and resolve that through a table from physical inputs to semantic actions, scoped by context (menu, on-foot, vehicle). This is what makes rebinding, multi-device support, and accessibility possible.

Steam Input states the principle bluntly: the API "only ever tells you 'this action just happened'... The developer never gets inputs directly"^[11]. Unreal's Enhanced Input models it as Input Actions plus Mapping Contexts you push and pop, with per-binding modifiers and triggers^[12]; Unity's Input System uses Action Maps with bindings and processors^[13]. The minimal data shape:

Physical input → action

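#include <cstddef>
#include <functional>
#include <unordered_map>

enum class Action { MoveForward, Jump, Fire };

struct PhysicalInput {                // tagged so one action binds keyboard OR pad
    enum class Source { Scancode, GamepadButton } source;
    int code;                          // SDL_Scancode (a position) or a button
    bool operator==(const PhysicalInput&) const = default;   // needs C++20
};
struct Hash {                         // one possible hash: mix the source tag into the code
    std::size_t operator()(const PhysicalInput& p) const noexcept {
        return std::hash<int>{}(p.code) ^ (static_cast<std::size_t>(p.source) << 31);
    }
};
// One context; swap the whole map for menu/vehicle.
std::unordered_map<PhysicalInput, Action, Hash> bindings;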

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
enum Action { MoveForward, Jump, Fire }

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
enum PhysicalInput {              // one action binds keyboard OR pad
    Key(KeyCode),                    // physical position, survives AZERTY
    Pad(Button),
}
// One context; swap the whole map for menu/vehicle.
let bindings: HashMap<PhysicalInput, Action> = HashMap::new();
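
Context scoping can be sketched as a stack of binding maps. Everything here is illustrative: `ActionResolver`, `BindingMap`, and the extra `MenuConfirm` action are hypothetical, and letting unbound inputs fall through to lower contexts is a design assumption (an exclusive context would consult only the top map).

```cpp
#include <map>
#include <optional>
#include <utility>
#include <vector>

enum class Action { MoveForward, Jump, Fire, MenuConfirm };
enum class Source { Scancode, GamepadButton };

// (source, code) pair; std::map only needs operator<, which std::pair provides.
using PhysicalInput = std::pair<Source, int>;
using BindingMap = std::map<PhysicalInput, Action>;

class ActionResolver {
public:
    void pushContext(BindingMap map) { stack_.push_back(std::move(map)); }
    void popContext() {
        if (!stack_.empty()) stack_.pop_back();
    }

    // The topmost context that binds the input wins; lower ones are fallbacks.
    std::optional<Action> resolve(const PhysicalInput& input) const {
        for (auto it = stack_.rbegin(); it != stack_.rend(); ++it) {
            auto found = it->find(input);
            if (found != it->end()) return found->second;
        }
        return std::nullopt;
    }

private:
    std::vector<BindingMap> stack_;
};
```

Pushing a menu context that binds the pad's south button to `MenuConfirm` shadows the on-foot `Jump` binding until the menu is popped, and gameplay code never sees which physical input was involved.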

08 Input buffering

Input buffering widens timing windows so the game registers the input the player meant. It is a feel feature, not added latency: it changes when an input is accepted, not when the result renders.

Motion windows. A fighting-game quarter-circle counts if its directions land within an N-frame window. Community frame-data analysis puts Street Fighter 6's quarter-circle window around 11 frames (and reports it samples input several times per frame), looser than its predecessor^[15].
Coyote time. A jump still fires for a few frames after you walk off a ledge. Celeste's designer documents this kind of forgiveness (coyote time, jump buffering, corner correction) as deliberate feel^[14].
Jump buffering. A jump pressed just before landing fires on the landing frame instead of being dropped^[14].
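
Coyote time and jump buffering reduce to two frame counters. This is a minimal sketch, assuming a fixed update step; `JumpForgiveness` is a hypothetical name and the window sizes are illustrative tuning values, not figures from Celeste or any other shipped game.

```cpp
#include <algorithm>

struct JumpForgiveness {
    static constexpr int kNever = 1000;  // sentinel: "long ago"

    int bufferFrames = 6;  // how long a press stays pending (illustrative)
    int coyoteFrames = 5;  // how long after leaving ground a jump still counts (illustrative)

    int framesSincePress = kNever;
    int framesSinceGrounded = kNever;

    // Call once per fixed update. Returns true on the frame the jump should fire.
    bool update(bool jumpPressedThisFrame, bool grounded) {
        framesSincePress = jumpPressedThisFrame ? 0 : std::min(framesSincePress + 1, kNever);
        framesSinceGrounded = grounded ? 0 : std::min(framesSinceGrounded + 1, kNever);

        const bool wantsJump = framesSincePress <= bufferFrames;    // jump buffering
        const bool canJump = framesSinceGrounded <= coyoteFrames;   // coyote time
        if (wantsJump && canJump) {
            framesSincePress = kNever;     // consume the press
            framesSinceGrounded = kNever;  // no second jump from the same ledge
            return true;
        }
        return false;
    }
};
```

A press three frames before landing fires on the landing frame, and a press three frames after walking off a ledge still fires; both windows only change when the input is accepted.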

09 Sampling & latency

Polling once per frame can miss a transition shorter than the frame, a tap that goes down and up between two samples. The event stream carries every transition with a timestamp, which is why fast-twitch genres lean on events (and sub-frame sampling) rather than per-frame polling alone.

High-frequency mice deliver many events per frame; accumulate them rather than taking the last. Felt input latency has several causes (the sampling phase, the render queue, the display); per-frame sampling alone adds about half a frame on average (a full frame in the worst case) on top of the rest. Timestamps let you order inputs and age the buffer from §8, which is literally a timestamped ring of recent inputs.
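
A timestamped ring with a motion-window check might look like the sketch below. `InputRing`, `Dir`, the capacity of 16, and millisecond timestamps are all assumptions for illustration; timestamps are assumed to be no later than `nowMs`.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

enum class Dir : std::uint8_t { Down, DownForward, Forward, Other };

struct TimedInput {
    Dir dir;
    std::uint64_t timeMs;
};

// Fixed-size ring of recent inputs; the oldest entry is overwritten when full.
class InputRing {
public:
    void push(Dir dir, std::uint64_t timeMs) {
        buf_[head_] = {dir, timeMs};
        head_ = (head_ + 1) % kCapacity;
        if (count_ < kCapacity) ++count_;
    }

    // True if Down, DownForward, Forward appear in order among the entries
    // that are no older than windowMs.
    bool quarterCircleForward(std::uint64_t nowMs, std::uint64_t windowMs) const {
        static constexpr Dir kPattern[] = {Dir::Down, Dir::DownForward, Dir::Forward};
        std::size_t next = 0;
        for (std::size_t i = 0; i < count_; ++i) {  // walk oldest to newest
            const TimedInput& in = buf_[(head_ + kCapacity - count_ + i) % kCapacity];
            if (nowMs - in.timeMs > windowMs) continue;  // aged out of the window
            if (in.dir == kPattern[next] && ++next == 3) return true;
        }
        return false;
    }

private:
    static constexpr std::size_t kCapacity = 16;
    std::array<TimedInput, kCapacity> buf_{};
    std::size_t head_ = 0;
    std::size_t count_ = 0;
};
```

At 60 fps an 11-frame window is roughly 183 ms, so the same three inputs pass with `windowMs = 183` and fail once the window is narrowed enough to age out the first direction.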

10 Pitfalls

WASD moves wrong on AZERTY: Movement bound to keycodes (symbols) instead of scancodes (positions).

Diagonal aim snaps / dies: A per-axis square dead zone clips diagonals; use scaled-radial.

Stick jumps to a value: Dead zone cut without re-scaling; output never ramps from zero.

Triggers read as one axis: Reading an Xbox pad through DirectInput; use XInput or SDL/gilrs.

Lost mouse motion: Overwriting the per-frame delta instead of summing 1000 Hz events.

Camera feels laggy: A low-pass filter on mouselook; smoothing trades latency for smoothness.

Held key auto-fires: Acting on OS key-repeat events for one-shot actions; check the repeat flag.

Can't rebind keys: Gameplay names physical keys directly instead of going through actions.

11 What's next

That completes the runtime spine: a loop, a window, and input. The next phase is resources, turning source art into fast-loading data: the Asset Pipeline and Compression, building on the existing File Streaming tutorial. Then the renderer. The full path is on the series hub.

[1] Jason Gregory. Game Engine Architecture, "Human Interface Devices (HID)." gameenginebook.com. Reading and processing input in an engine: events vs polling, dead zones.
[2] SDL. SDL_Scancode. wiki.libsdl.org. Scancode = physical key position (USB-HID-based); keycode = layout-dependent symbol.
[3] SDL. Best Keyboard Practices. wiki.libsdl.org. Bind movement to scancodes (WASD survives AZERTY), symbolic shortcuts to keycodes.
[4] winit. KeyEvent and the keyboard module. docs.rs/winit. physical_key vs logical_key, the repeat flag, and logical key being unaffected by Ctrl.
[5] Microsoft. "About Raw Input" and RAWMOUSE. learn.microsoft.com. Relative deltas not subject to Control-Panel mouse speed; batched retrieval for high-rate mice.
[6] winit. CursorGrabMode. docs.rs/winit. Locked (pins the cursor for mouselook) vs Confined (keeps it in the window).
[7] Microsoft. "Comparison of XInput and DirectInput Features." learn.microsoft.com. DirectInput combines the triggers onto one axis; XInput keeps them independent; XInput dead-zone constants.
[8] SDL. SDL_GamepadAxis / SDL_GetGamepadAxis. wiki.libsdl.org. The standard gamepad model with separate left/right trigger axes over any backend.
[9] gilrs. Crate documentation. docs.rs/gilrs. Cross-platform gamepad input in Rust; normalized axes; positional face-button names (South/East/North/West).
[10] Josh Sutphin. "Doing Thumbstick Dead Zones Right." joshsutphin.com. Axial vs radial vs scaled-radial dead zones and the re-scaling formula.
[11] Valve. Steam Input "General Concepts." partner.steamgames.com. The action-based abstraction: the API reports actions, never raw inputs; action sets for context.
[12] Epic Games. "Enhanced Input in Unreal Engine." dev.epicgames.com. Input Actions, Mapping Contexts, and per-binding Modifiers (dead zone, smoothing) and Triggers.
[13] Unity Technologies. Input System: Action Bindings. docs.unity3d.com. Action Maps, bindings, deadzone/scale Processors, composite bindings.
[14] Maddy Thorson. "Celeste & Forgiveness." maddymakesgames.com. Coyote time, jump buffering, and corner correction as deliberate feel features.
[15] EventHubs, reporting Loïc "WydD" Petit's input analysis. "Breakdown of why players struggle with SF6 inputs." eventhubs.com. Street Fighter 6 motion-input buffer windows and multi-sample-per-frame input.
[16] "Low-pass filter," Wikipedia (discrete-time / exponential moving average). en.wikipedia.org. The one-pole filter y = αx + (1−α)y_prev and the smoothing-versus-lag trade.