Tutorial 14 · Engine Architecture

ECS from Scratch

The architecture that replaced deep OOP inheritance hierarchies in game engines. Entity is an ID. Component is data. System is a function that transforms matching components every tick. No virtual dispatch, no pointer chasing, no diamond inheritance. Flat arrays, cache-friendly iteration, and trivial parallelism. We build one from scratch, measure why it is fast, and trace the design through Overwatch, Unity DOTS, Bevy, Flecs, and EnTT.

Time: ~50 min
Level: Mid to senior
Prereqs: You can read C++ and Rust. Basic memory model awareness (cache lines, sequential vs random access). Comfortable with bitwise operations.
Hardware: None. A feel for cache hierarchy helps in sections 5 through 7.

01 Why ECS

Game objects in a shipping engine carry a variable set of behaviors: this one has a transform and a mesh, that one has a transform, a mesh, a rigid body, and an AI controller. The classical OOP approach models this with inheritance. Twenty years of shipped games demonstrated that deep inheritance hierarchies produce diamond problems, fat base classes, and cache-hostile memory layouts that cost real frame time at scale. ECS is the replacement.

The core proposition: separate identity from data from behavior. An entity is a lightweight ID. A component is a plain data struct attached to that ID. A system is a function that runs over all entities matching a component query. No inheritance. No virtual dispatch. Components live in flat, typed arrays. Systems iterate those arrays sequentially. The CPU prefetcher sees a predictable stride. The scheduler sees declared read/write sets and can parallelize automatically.

The results show up in frame time. Unity's DOTS benchmarks report iterating 100,000 entities with a simple Position+Velocity update in roughly 0.3 ms on a modern desktop CPU, versus 3+ ms for the equivalent MonoBehaviour approach[4]. That is an order-of-magnitude improvement on identical logic, driven entirely by memory layout and dispatch cost.

What you'll have by the end

Working knowledge of both major ECS storage strategies (archetypes and sparse sets), when to pick each, and how to implement them. Generational indices for safe entity recycling. Query matching by bitset intersection. The structural change problem and command buffer pattern. System scheduling and automatic parallelism. And the case studies: Overwatch's gameplay ECS, Unity DOTS, Bevy's parallel executor, Flecs' relationship model, EnTT's sparse-set design, and Unreal's Mass Entity framework.

02 A short history

The component pattern predates the term "ECS" by over a decade. The timeline of the ideas that converged into the architecture shipping in engines today:

2002
Scott Bilas, "A Data-Driven Game Object System," GDC 2002. Built for Dungeon Siege at Gas Powered Games. Over 7,300 unique object types, 100,000+ placed objects across a continuous world. Bilas proposed assembling game objects from data-driven components instead of inheriting from a class hierarchy. No engineer required to create a new object type.[1] This is the earliest widely cited talk on component-based game objects.
2007
Adam Martin, "Entity Systems are the future of MMOG development," T-Machine blog. A five-part blog series that named the pattern and argued for strict separation: entities hold no data, components hold no behavior, systems hold no state.[2] Martin's taxonomy (entity as ID, component as data, system as logic) became the canonical definition the community adopted.
2017
Timothy Ford, "Overwatch Gameplay Architecture and Netcode," GDC 2017. Blizzard's Overwatch shipped on a custom ECS. Ford described how the ECS curtails complexity even as the team adds new heroes with radically different abilities. The deterministic simulation is built on the ECS tick model: systems run in a fixed order, each reading and writing declared component sets.[3]
2017
Michele Caini releases EnTT. A header-only C++ ECS built on sparse sets rather than archetypes. Each component type gets its own sparse-set pool. Adding and removing components is O(1) with no table migration. Used in Minecraft (Bedrock Edition) by Mojang.[7]
2018
Catherine West, RustConf 2018 closing keynote: "Using Rust for Game Development." Walked through the OOP-to-ECS transition in Rust, showing how the borrow checker makes traditional mutable-object-graph architectures painful and ECS natural. Widely credited with sparking the Rust gamedev ECS wave that produced Bevy, Hecs, and Legion.[5]
2018
Unity announces DOTS (Data-Oriented Technology Stack). An archetype-based ECS integrated into the Unity editor. Chunk-allocated archetype tables, the Burst compiler for auto-vectorized system code, and the C# Job System for multi-threaded scheduling. Shipped iteratively from 2018 through the Entities 1.0 release in 2023.[4]
2019
Sander Mertens releases Flecs v1.0. A C/C++ ECS with first-class entity relationships, query caching, and an archetype storage backend. Mertens' "Building an ECS" blog series[8] is the most detailed public documentation of archetype-storage internals, covering table layout, edge graphs for archetype transitions, and query optimization.
2020
Bevy 0.1 released by Cart (Carter Anderson). A Rust game engine with an archetype-based ECS at its core. Bevy's scheduler automatically parallelizes systems based on declared read/write access to component types. The ECS design draws from prior Rust crates (Legion, Hecs) but integrates scheduling, resources, and change detection into one system.[6]
2022
Unreal Engine 5.0 ships Mass Entity. An archetype-based ECS framework built by Epic's AI team for large crowd simulations. Chunk-based memory layout sized for 128-byte cache lines (most current x86 and ARM cores use 64-byte lines). Integrated with Unreal's existing actor/component model via Mass Entity traits.[10]

03 The OOP problem

The classical game-object hierarchy starts reasonable: GameObject at the root, RenderableObject inherits from it, PhysicsObject inherits from it, Character inherits from both. By the time you have 200 object types across a shipped game, the hierarchy is 6 to 12 levels deep. The problems are structural, not cosmetic.

The "everything is a GameObject" model that Unity (pre-DOTS) and many custom engines used is a partial fix. It replaces inheritance with composition at the object level, but the components themselves are still polymorphic, heap-allocated, and pointer-chased. The iteration pattern (for each entity, fetch its component by type, call a virtual method on it) is fundamentally the same pointer chase.

04 Entities, Components, Systems

An ECS has three concepts and zero inheritance.

Entity: a lightweight ID. Typically a 32-bit or 64-bit integer split into an index (slot in an array) and a generation counter. The entity itself stores nothing. It is a key into the component tables.

Component: a plain data struct. No methods, no vtable, no inheritance. struct Position { float x, y, z; }; is a complete component. Components are stored in typed, contiguous arrays. The storage strategy (how those arrays are organized) is the subject of sections 5 and 6.

System: a function (or callable) that queries a set of component types and iterates all entities matching that query. A movement system declares "give me every entity with Position and Velocity" and runs position += velocity * dt for each one. Systems have no per-entity state. They read and write components; the ECS runtime provides the iterator.

This separation has three consequences that matter for performance:

  1. Homogeneous arrays. All Position components are stored in one flat array. Iteration is a sequential scan. The CPU prefetcher sees a constant stride and loads ahead.
  2. No virtual dispatch. A system is a single function pointer. It runs in a tight loop over flat data. No indirect call per entity.
  3. Declarative access. Each system declares which component types it reads and which it writes. The scheduler can run two systems in parallel if their access sets don't conflict. This is mechanical, not hand-tuned.
// Minimal ECS usage pattern (pseudocode).
// Create entities
auto player = world.spawn();
world.add<Position>(player, {0, 0, 0});
world.add<Velocity>(player, {1, 0, 0});
world.add<Health>(player, {100, 100});

auto prop = world.spawn();
world.add<Position>(prop, {5, 0, 0});
world.add<StaticTag>(prop, {});

// Movement system: iterates entities with Position AND Velocity.
// The prop (no Velocity) is excluded automatically.
const float dt = 1.0f / 60.0f;   // fixed timestep, assumed for this sketch
world.system<Position, const Velocity>(
    [dt](auto& position, const auto& velocity) {
        position.x += velocity.dx * dt;
        position.y += velocity.dy * dt;
        position.z += velocity.dz * dt;
    }
);

05 Storage strategy 1: Sparse sets

A sparse set maps entity IDs to component data using two arrays. The sparse array is indexed by entity ID and stores the index into a dense array. The dense array stores entity IDs (and, in parallel, component values) packed contiguously with no gaps.

All three core operations (add, remove, and lookup) are O(1).

EnTT[7] uses one sparse set per component type. The trade-off: the sparse array is sized to the maximum entity ID, so it can consume significant memory if entity IDs are large. Paging the sparse array (allocating it in fixed-size pages on demand) mitigates this. The swap-and-pop removal does not preserve insertion order, which matters if you need deterministic iteration order across runs.
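The paging idea can be sketched as follows. This is a minimal illustration, not EnTT's implementation: the class name `PagedSparse`, the page size of 4096 entries, and the helper `allocatedPages` are all assumptions chosen for the example. Pages are allocated only when an ID inside them is first written, so one very large entity ID costs one page plus a vector of null pointers, not a flat array sized to that ID.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

constexpr uint32_t kInvalid  = UINT32_MAX;  // "no dense index" marker
constexpr uint32_t kPageSize = 4096;        // entries per page (assumed value)

// Hypothetical paged sparse array: entity ID -> dense index.
class PagedSparse {
public:
    void set(uint32_t entityId, uint32_t denseIndex) {
        assert(denseIndex != kInvalid);
        const uint32_t page = entityId / kPageSize;
        if (page >= pages_.size()) pages_.resize(page + 1);   // grows the pointer table only
        if (!pages_[page]) {                                  // allocate the page on first use
            pages_[page] = std::make_unique<Page>();
            pages_[page]->fill(kInvalid);
        }
        (*pages_[page])[entityId % kPageSize] = denseIndex;
    }

    // Returns kInvalid when the entity has no entry.
    uint32_t get(uint32_t entityId) const {
        const uint32_t page = entityId / kPageSize;
        if (page >= pages_.size() || !pages_[page]) return kInvalid;
        return (*pages_[page])[entityId % kPageSize];
    }

    std::size_t allocatedPages() const {
        std::size_t count = 0;
        for (const auto& page : pages_) if (page) ++count;
        return count;
    }

private:
    using Page = std::array<uint32_t, kPageSize>;
    std::vector<std::unique_ptr<Page>> pages_;
};
```

With these numbers, storing a single entry for entity ID 1,000,000 allocates one 16 KiB page, where a flat sparse array would need about 4 MB.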

Live · Sparse set operations (interactive demo; readouts: dense count, sparse size, utilization)
The sparse array is indexed by entity ID. The dense array is packed: no gaps, no holes. Removal swaps the target with the last element and pops, keeping the dense array contiguous in O(1). The cost is that iteration order changes on every removal. EnTT provides a sort() operation to restore order when needed (useful for render-order-dependent iteration).
Sparse set implementation (C++, then Rust)
#include <cstdint>
#include <utility>
#include <vector>

template<typename T>
struct SparseSet {
    static constexpr uint32_t INVALID = UINT32_MAX;

    std::vector<uint32_t> sparse;   // entity ID -> dense index
    std::vector<uint32_t> dense;    // packed entity IDs
    std::vector<T>          values;  // component data, parallel to dense

    void ensure_sparse(uint32_t entityId) {
        if (entityId >= sparse.size())
            sparse.resize(entityId + 1, INVALID);
    }

    bool has(uint32_t entityId) const {
        return entityId < sparse.size()
            && sparse[entityId] != INVALID
            && dense[sparse[entityId]] == entityId;
    }

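    void add(uint32_t entityId, T value) {
        if (has(entityId)) {                         // already present: overwrite in place
            values[sparse[entityId]] = std::move(value);
            return;                                  // avoids a duplicate dense entry
        }
        ensure_sparse(entityId);
        sparse[entityId] = static_cast<uint32_t>(dense.size());
        dense.push_back(entityId);
        values.push_back(std::move(value));
    }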

    void remove(uint32_t entityId) {
        if (!has(entityId)) return;
        auto idx  = sparse[entityId];
        auto last = static_cast<uint32_t>(dense.size() - 1);
        if (idx != last) {                            // swap with last
            dense[idx]  = dense[last];
            values[idx] = std::move(values[last]);
            sparse[dense[idx]] = idx;                // fix swapped entity's sparse entry
        }
        dense.pop_back();
        values.pop_back();
        sparse[entityId] = INVALID;
    }

    // Precondition for both overloads: has(entityId) is true.
    T& get(uint32_t entityId)       { return values[sparse[entityId]]; }
    const T& get(uint32_t entityId) const { return values[sparse[entityId]]; }
};

// Rust version of the same structure.
pub struct SparseSet<T> {
    sparse: Vec<Option<usize>>,  // entity ID -> dense index
    dense:  Vec<u32>,             // packed entity IDs
    values: Vec<T>,              // component data, parallel to dense
}

impl<T> SparseSet<T> {
    pub fn has(&self, entity_id: u32) -> bool {
        let eid = entity_id as usize;
        eid < self.sparse.len()
            && self.sparse[eid].is_some()
            && self.dense[self.sparse[eid].unwrap()] == entity_id
    }

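    pub fn add(&mut self, entity_id: u32, value: T) {
        let eid = entity_id as usize;
        if self.has(entity_id) {
            // Already present: overwrite instead of pushing a duplicate dense entry.
            let idx = self.sparse[eid].unwrap();
            self.values[idx] = value;
            return;
        }
        if eid >= self.sparse.len() {
            self.sparse.resize_with(eid + 1, || None);
        }
        self.sparse[eid] = Some(self.dense.len());
        self.dense.push(entity_id);
        self.values.push(value);
    }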

    pub fn remove(&mut self, entity_id: u32) {
        if !self.has(entity_id) { return; }
        let idx  = self.sparse[entity_id as usize].unwrap();
        let last = self.dense.len() - 1;
        if idx != last {
            self.dense.swap(idx, last);
            self.values.swap(idx, last);
            let swapped = self.dense[idx] as usize;
            self.sparse[swapped] = Some(idx);
        }
        self.dense.pop();
        self.values.pop();
        self.sparse[entity_id as usize] = None;
    }
}

06 Storage strategy 2: Archetypes

An archetype groups entities by their exact component set. All entities with exactly {Position, Velocity} live in one table. All entities with {Position, Velocity, Health} live in another. Each table is a set of contiguous arrays, one per component column, plus an entity ID column. Iterating all entities with Position and Velocity means finding every archetype whose component set is a superset of {Position, Velocity} and scanning each matching table sequentially.

This is the storage model Unity DOTS[4], Flecs[8], and Unreal Mass Entity[10] use. The core trade-off vs sparse sets: iteration is a pure sequential scan (the CPU prefetcher's best case), but adding or removing a component moves the entity's data from one archetype table to another. That move is the problem (section 8).
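A single archetype table can be sketched as a struct of parallel arrays. This is a deliberately simplified illustration: the table below is hard-coded to the archetype {Position, Velocity}, and the names `MovingTable`, `insert`, `removeRow`, and `integrate` are hypothetical. A real implementation stores type-erased columns so that one table type can serve any component set.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Position { float x, y, z; };
struct Velocity { float dx, dy, dz; };

// Hypothetical table for the archetype {Position, Velocity}.
// Row i of every column belongs to the same entity.
struct MovingTable {
    std::vector<uint32_t> entities;
    std::vector<Position> positions;
    std::vector<Velocity> velocities;

    // Appends a row and returns its index.
    std::size_t insert(uint32_t entity, Position p, Velocity v) {
        entities.push_back(entity);
        positions.push_back(p);
        velocities.push_back(v);
        return entities.size() - 1;
    }

    // Swap-and-pop removal keeps every column dense.
    // Returns the entity that now occupies `row` (the former last row),
    // or the removed entity itself when `row` was the last row.
    uint32_t removeRow(std::size_t row) {
        assert(row < entities.size());
        const std::size_t last = entities.size() - 1;
        entities[row]   = entities[last];
        positions[row]  = positions[last];
        velocities[row] = velocities[last];
        const uint32_t moved = entities[row];
        entities.pop_back();
        positions.pop_back();
        velocities.pop_back();
        return moved;
    }
};

// The movement system is a sequential scan over two columns.
inline void integrate(MovingTable& table, float dt) {
    for (std::size_t i = 0; i < table.entities.size(); ++i) {
        table.positions[i].x += table.velocities[i].dx * dt;
        table.positions[i].y += table.velocities[i].dy * dt;
        table.positions[i].z += table.velocities[i].dz * dt;
    }
}
```

The caller has to record the new row of the entity returned by `removeRow`, which is why archetype implementations keep an entity-to-(table, row) index alongside the tables.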

Live · Archetype table visualizer (interactive demo; readouts: archetypes, entities, selected)
Each archetype table stores entities that share exactly the same component set. Adding a component to an entity moves it to a different archetype (or creates a new one if no archetype with that component set exists yet). Removing a component does the same in reverse. This table-migration cost is the fundamental price of archetype storage.

07 Iteration and cache coherence

The performance argument for ECS reduces to one claim: iterating flat, typed arrays is faster than pointer-chasing through polymorphic objects. The gap is not algorithmic (both are O(n)); it is entirely in the constant factor, dominated by cache line utilization.

In an OOP hierarchy, each game object is heap-allocated. Iterating "all objects with a physics component" dereferences a pointer per object. Each pointer leads to a different address. The CPU loads a 64-byte cache line for each dereference; if the useful data is 12 bytes (a Position struct of three floats), 52 bytes of each line are wasted. Worse, successive objects are rarely adjacent in memory, so every dereference is a potential L1 miss (roughly 4 ns on modern hardware) or L2 miss (roughly 12 ns), possibly an L3 miss or DRAM access (60 to 100+ ns)[9].

In an archetype ECS, all Position values for entities in one archetype are packed in a contiguous float[]. Iterating it is a sequential scan. The hardware prefetcher detects the stride and loads ahead. Every byte of every cache line contains useful data. For a 12-byte Position struct (3 floats), roughly 5 positions fit per 64-byte cache line. At L1 hit latency, the amortized cost per entity is a fraction of a nanosecond. The ratio between pointer-chasing and sequential access is often 50x to 200x in practice, depending on working set size and cache pressure from other systems.
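The cache-line arithmetic above can be written out directly. The sketch below only counts lines fetched; it assumes a 64-byte line, a packed array that starts on a line boundary, and a worst case for the scattered layout in which every object sits on its own line. The function names are illustrative, and the code says nothing about latency or prefetching.

```cpp
#include <cassert>
#include <cstddef>

struct Position { float x, y, z; };
static_assert(sizeof(Position) == 12, "sketch assumes 4-byte floats and no padding");

constexpr std::size_t kCacheLine = 64;

// Lines fetched when `count` Positions are packed contiguously.
constexpr std::size_t linesSequential(std::size_t count) {
    return (count * sizeof(Position) + kCacheLine - 1) / kCacheLine;
}

// Lines fetched when every Position lives in a separate heap object
// and no two objects share a line (assumed worst case).
constexpr std::size_t linesScattered(std::size_t count) {
    return count;
}

// Bytes of a fetched line that carry no useful data for this access.
inline std::size_t wastedBytesPerLine(std::size_t usefulBytes) {
    assert(usefulBytes <= kCacheLine);
    return kCacheLine - usefulBytes;
}
```

For 100,000 positions this gives 18,750 lines for the packed array against up to 100,000 for scattered objects, a factor of about 5.3 in memory traffic alone. The larger gap comes from each scattered fetch also being harder for the prefetcher to predict.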

Live · OOP pointer chase vs ECS sequential scan (interactive demo; readouts: OOP cache hits, OOP cache misses, ECS cache hits, ECS cache misses)
The OOP side accesses entities in random order (simulating heap-allocated objects scattered across memory); each access to a new cache line is a miss. The ECS side scans sequentially, reusing each line for several entities before advancing. Both visit all 64 entities exactly once, so the only difference is memory layout. A miss stalls that side while the line is fetched, so the sequential (ECS) side finishes first while the pointer-chasing (OOP) side is still grinding through misses. The per-miss stall is set to 8× a hit here so the clip stays watchable; a real DRAM miss runs roughly 50–200× an L1 hit, so the true gap is wider.

08 Component add/remove: the structural change problem

In archetype storage, adding a component to an entity means moving its data from the current archetype table to a different one (the archetype that has the old set plus the new component). Removing a component does the same in reverse. Each move copies every component value for that entity. If the entity has 8 components totaling 200 bytes, that is a 200-byte memcpy per structural change.

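A single move is cheap. A thousand moves per frame (spawn 500 enemies, each with an add-component-on-spawn pattern) is not. The usual mitigations are to spawn entities with their full component set so they land in the final archetype directly, and to defer structural changes through a command buffer that is applied in a batch at a sync point.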

In sparse-set ECS (EnTT), structural changes are cheaper: adding a component inserts into a per-type sparse set (O(1)), removing swaps and pops (O(1)). No table migration. This is the primary advantage of sparse sets over archetypes for workloads with frequent component add/remove (particle systems, buff/debuff stacking, short-lived effects).
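The command buffer pattern can be sketched in a few lines. The `World` below is a stand-in that tracks only one hypothetical tag component (`Burning`), and the names `CommandBuffer`, `flush`, and `extinguishSystem` are assumptions for this example. The point is the shape: a system records structural changes while it iterates, and the changes are applied later, when nothing is iterating the storage.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_set>
#include <vector>

// Stand-in world: tracks only which entities carry a hypothetical Burning tag.
struct World {
    std::unordered_set<uint32_t> burning;
    void addBurning(uint32_t entity)    { burning.insert(entity); }
    void removeBurning(uint32_t entity) { burning.erase(entity); }
};

// Records structural changes during iteration and replays them in order.
class CommandBuffer {
public:
    void addBurning(uint32_t entity) {
        commands_.push_back([entity](World& world) { world.addBurning(entity); });
    }
    void removeBurning(uint32_t entity) {
        commands_.push_back([entity](World& world) { world.removeBurning(entity); });
    }
    std::size_t pending() const { return commands_.size(); }

    // Call at a sync point, when no system is iterating the storage.
    void flush(World& world) {
        for (auto& command : commands_) command(world);
        commands_.clear();
    }

private:
    std::vector<std::function<void(World&)>> commands_;
};

// Example system: schedules removal of the tag from every burning entity.
// It reads the world but never mutates it while iterating.
inline void extinguishSystem(const World& world, CommandBuffer& out) {
    for (uint32_t entity : world.burning) out.removeBurning(entity);
}
```

Because the system only sees a `const World&`, it cannot invalidate the container it is iterating. A production buffer would typically store compact command records rather than `std::function` objects, and would group commands so that an entity receiving several components migrates once.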

09 Queries and query caching

A query is the interface between a system and the storage. A query descriptor says: "give me every entity that has all of these component types and none of those." The ECS runtime resolves this into a set of archetype tables whose component sets satisfy the constraints.

The resolution step is a bitset intersection. Assign each component type a bit index. Each archetype stores a bitmask of its component types. The query "With(Position, Velocity), Without(Static)" becomes:

// Query: With(Position, Velocity), Without(Static)
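// Shift a 64-bit one: a plain `1` is an int and overflows for bit indices >= 31.
uint64_t withMask    = (uint64_t{1} << POSITION_BIT) | (uint64_t{1} << VELOCITY_BIT);
uint64_t withoutMask = (uint64_t{1} << STATIC_BIT);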

for (auto& archetype : allArchetypes) {
    bool hasAll    = (archetype.mask & withMask) == withMask;
    bool hasNone   = (archetype.mask & withoutMask) == 0;
    if (hasAll && hasNone) {
        iterateArchetype(archetype);           // sequential scan of matching table
    }
}

This outer loop (over archetypes) is cheap: dozens to low hundreds of archetypes in a typical game, each checked by two bitwise ANDs and two comparisons. The inner loop (over entities in each matching archetype) is the sequential scan that does the real work.

Query caching avoids re-running the archetype match every frame. On first execution, the query finds all matching archetypes and stores pointers to them. When a new archetype is created (because some entity got a novel component combination), the ECS tests it against all cached queries and adds it where it matches. Flecs[8] and Bevy[6] both cache queries this way.
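A cached query can be sketched as a mask pair plus a list of matching archetypes that is updated whenever an archetype is created. This is an illustration of the idea only, not how Flecs or Bevy implement it: the `World`, `CachedQuery`, `addQuery`, and `addArchetype` names are hypothetical, and the cache stores indices instead of pointers because the archetype vector may reallocate.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct Archetype {
    uint64_t mask;   // one bit per component type; columns omitted in this sketch
};

struct CachedQuery {
    uint64_t withMask;
    uint64_t withoutMask;
    std::vector<std::size_t> matches;   // indices into World::archetypes

    bool accepts(uint64_t archetypeMask) const {
        return (archetypeMask & withMask) == withMask
            && (archetypeMask & withoutMask) == 0;
    }
};

struct World {
    std::vector<Archetype>   archetypes;
    std::vector<CachedQuery> queries;

    // The full scan over archetypes happens once, at registration.
    std::size_t addQuery(uint64_t withMask, uint64_t withoutMask) {
        CachedQuery query{withMask, withoutMask, {}};
        for (std::size_t i = 0; i < archetypes.size(); ++i)
            if (query.accepts(archetypes[i].mask)) query.matches.push_back(i);
        queries.push_back(std::move(query));
        return queries.size() - 1;
    }

    // A new archetype is tested against every cached query exactly once.
    std::size_t addArchetype(uint64_t mask) {
        archetypes.push_back(Archetype{mask});
        const std::size_t index = archetypes.size() - 1;
        for (auto& query : queries)
            if (query.accepts(mask)) query.matches.push_back(index);
        return index;
    }
};
```

Per frame, a system then walks `matches` directly and never repeats the mask test. Archetypes are created rarely compared with how often queries run, so the matching cost moves off the hot path.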

Live · Query matcher (interactive demo; readouts: archetypes matched, entities matched)
The WITH bitmask requires all marked bits to be set in the archetype. The WITHOUT bitmask requires all marked bits to be clear. Two bitwise ANDs per archetype. The entity count shows how many entities the system would iterate after matching. Changing the component constraints moves archetypes in and out of the match set.

10 Relationships and entity references

Pure ECS stores flat data. But games need structure: parent/child hierarchies (scene graph), targeting (missile locked onto a ship), inventory (item inside a container), socket attachment (weapon in hand). These are all relationships between entities.

The simplest approach: store a Parent component containing the parent's entity ID. This works for parent/child. Flecs[8] generalizes this into first-class relationships: a component type can be parameterized by a target entity. (ChildOf, parent_entity) is a relationship pair that acts as a component. Entities with the same relationship pair land in the same archetype, so "find all children of entity X" is an archetype query.

The danger with entity references in components: the referenced entity may be destroyed. A Target component pointing to entity 42 is a dangling reference after entity 42 is freed. Generational indices (section 12) detect this at resolve time. Flecs additionally supports "on delete" hooks: when the target of a relationship is destroyed, the relationship component is automatically removed from all entities that reference it.
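Resolve-time detection of a dangling reference can be sketched with nothing more than a per-slot generation table. The `Target` component and the `resolves` helper are hypothetical names for this example; the table of generations stands in for the entity allocator covered in section 12.

```cpp
#include <cstdint>
#include <vector>

struct Entity { uint32_t index; uint32_t generation; };

// Hypothetical component: "this entity is locked onto that one".
struct Target { Entity entity; };

// generations[i] holds the current generation of slot i.
// A reference resolves only if its generation still matches the slot's.
inline bool resolves(const std::vector<uint32_t>& generations, const Target& target) {
    return target.entity.index < generations.size()
        && generations[target.entity.index] == target.entity.generation;
}
```

A homing-missile system would call `resolves` before reading the target's Position and drop the `Target` component when the check fails, instead of following an ID that may now belong to an unrelated entity.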

11 Scheduling and parallelism

Each system declares which component types it reads and which it writes. Two systems can run in parallel if they have no write-write or read-write conflict on the same component type. A system that reads Position and writes Velocity can run in parallel with a system that reads Health and writes Damage, because the component sets are disjoint.

The scheduler builds a dependency graph of systems. Edges encode conflicts: if system A writes Position and system B reads Position, B depends on A (or vice versa, depending on declared ordering). The scheduler topologically sorts this graph and dispatches independent systems to worker threads. Bevy's multi-threaded executor[6] does this automatically each frame. The developer writes systems with declared access; the engine parallelizes them.

This only works because ECS access is declarative. In an OOP architecture, a method on a GameObject can touch any field on any other object through a pointer. The engine has no way to know what a method accesses without running it. In ECS, the query signature is the access declaration. The scheduler reads it statically.
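The conflict rule is mechanical enough to sketch with two bitmasks per system. The code below is an assumption-laden simplification, not Bevy's executor: `SystemAccess`, `conflicts`, and `buildBatches` are hypothetical names, and the batching is a greedy pass in declaration order where a real scheduler would build and sort a dependency graph.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct SystemAccess {
    uint64_t reads;    // one bit per component type read
    uint64_t writes;   // one bit per component type written
};

// Two systems conflict if either writes something the other reads or writes.
inline bool conflicts(const SystemAccess& a, const SystemAccess& b) {
    return (a.writes & (b.reads | b.writes)) != 0
        || (b.writes & (a.reads | a.writes)) != 0;
}

// Greedy batching in declaration order. A system joins the current batch only
// if it conflicts with nothing already in it; otherwise it opens a new batch.
// Systems inside one batch may run in parallel; batches run one after another.
inline std::vector<std::vector<std::size_t>>
buildBatches(const std::vector<SystemAccess>& systems) {
    std::vector<std::vector<std::size_t>> batches;
    for (std::size_t i = 0; i < systems.size(); ++i) {
        bool fits = !batches.empty();
        if (fits) {
            for (std::size_t j : batches.back()) {
                if (conflicts(systems[i], systems[j])) { fits = false; break; }
            }
        }
        if (!fits) batches.emplace_back();
        batches.back().push_back(i);
    }
    return batches;
}
```

Read-read overlap is deliberately not a conflict: any number of systems can read Position at once. Because a conflicting system always lands in a later batch than the system it conflicts with, declaration order is preserved wherever ordering matters.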

// Bevy system declarations (Rust). The scheduler reads the type signature
// to determine access: Query<&Position> = read Position, Query<&mut Velocity> = write Velocity.

fn movement_system(
    mut query: Query<(&mut Position, &Velocity)>,
    time: Res<Time>,
) {
    for (mut position, velocity) in &mut query {
        position.x += velocity.dx * time.delta_seconds();
        position.y += velocity.dy * time.delta_seconds();
    }
}

fn damage_system(
    mut query: Query<&mut Health, With<DamageReceiver>>,
    damage_events: EventReader<DamageEvent>,
) {
    // Reads DamageEvent, writes Health. No overlap with movement_system.
    // The scheduler runs both in parallel.
}

// Registration: the engine reads the function signatures at compile time.
app.add_systems(Update, (movement_system, damage_system));

12 Generational indices

Entity IDs must be recycled. A game that spawns and destroys thousands of projectiles per second will grow the slot array unboundedly unless IDs are reused, since per-ID storage (sparse arrays, component slots) scales with the high-water mark of the index. A monotonically increasing 32-bit index would also exhaust the space in days at sustained high spawn rates. The standard solution: a generational index.

The entity allocator maintains an array of slots. Each slot has a generation counter and an alive flag. A free list tracks available slots. Allocating pops a slot off the free list and returns {index, generation}. Freeing increments the generation and pushes the slot back onto the free list. Any saved reference that holds the old generation will fail the generation check on resolve.

This prevents the ABA problem in entity references: slot 5 held enemy A (generation 0), enemy A was destroyed (generation bumped to 1), slot 5 was reused for projectile B (generation 1). A stale reference to {index: 5, generation: 0} correctly fails to resolve, even though slot 5 is alive again.

Live · Generational index allocator (interactive demo; readouts: alive, free slots, saved refs)
Each allocation saves a reference (index + generation). Freeing a slot bumps its generation. Resolving a saved reference checks the generation against the slot's current generation. If they disagree, the reference is stale: the entity it pointed to was destroyed, and the slot may have been reused for a different entity. This is how ECS implementations detect use-after-free without garbage collection.
Generational index allocator (C++, then Rust)
#include <cstdint>
#include <vector>

struct Entity {
    uint32_t index;
    uint32_t generation;
};

struct EntityAllocator {
    struct Slot { uint32_t generation; bool alive; };

    std::vector<Slot> slots;
    std::vector<uint32_t> freeList;

    Entity allocate() {
        uint32_t index;
        if (!freeList.empty()) {
            index = freeList.back();
            freeList.pop_back();
        } else {
            index = static_cast<uint32_t>(slots.size());
            slots.push_back({0, false});
        }
        slots[index].alive = true;
        return { index, slots[index].generation };
    }

    void free(Entity entity) {
        if (!isAlive(entity)) return;
        slots[entity.index].alive = false;
        slots[entity.index].generation++;           // invalidate stale refs
        freeList.push_back(entity.index);
    }

    bool isAlive(Entity entity) const {
        return entity.index < slots.size()
            && slots[entity.index].alive
            && slots[entity.index].generation == entity.generation;
    }
};

// Rust version of the same allocator.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
pub struct Entity {
    pub index: u32,
    pub generation: u32,
}

pub struct EntityAllocator {
    slots: Vec<(u32, bool)>,        // (generation, alive)
    free_list: Vec<u32>,
}

impl EntityAllocator {
    pub fn new() -> Self {
        Self { slots: Vec::new(), free_list: Vec::new() }
    }

    pub fn allocate(&mut self) -> Entity {
        let index = if let Some(idx) = self.free_list.pop() {
            idx
        } else {
            let idx = self.slots.len() as u32;
            self.slots.push((0, false));
            idx
        };
        self.slots[index as usize].1 = true;
        Entity { index, generation: self.slots[index as usize].0 }
    }

    pub fn free(&mut self, entity: Entity) {
        if !self.is_alive(entity) { return; }
        let slot = &mut self.slots[entity.index as usize];
        slot.1 = false;
        slot.0 += 1;                                  // bump generation
        self.free_list.push(entity.index);
    }

    pub fn is_alive(&self, entity: Entity) -> bool {
        let idx = entity.index as usize;
        idx < self.slots.len()
            && self.slots[idx].1
            && self.slots[idx].0 == entity.generation
    }
}
What's intentionally missing

The allocator above is the minimal viable version. Production implementations add: packing index and generation into a single 64-bit integer (Bevy uses 32 bits index + 32 bits generation), tombstone detection for double-free, configurable generation width (some engines use 16-bit generation + 16-bit index for tighter packing), and atomic operations for thread-safe allocation.
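Packing the two fields into one 64-bit handle is a pair of shifts. The layout below (generation in the high 32 bits, index in the low 32 bits) and the function names `pack` and `unpack` are assumptions for this sketch; engines differ in which half holds which field.

```cpp
#include <cstdint>

struct Entity {
    uint32_t index;
    uint32_t generation;
};

// Assumed layout: generation in the high 32 bits, index in the low 32 bits.
constexpr uint64_t pack(Entity entity) {
    return (static_cast<uint64_t>(entity.generation) << 32) | entity.index;
}

constexpr Entity unpack(uint64_t bits) {
    return Entity{ static_cast<uint32_t>(bits & 0xFFFFFFFFull),
                   static_cast<uint32_t>(bits >> 32) };
}
```

A packed handle compares, hashes, and copies as a single integer, and two handles to the same slot with different generations are never equal.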

13 Case studies

Overwatch (Blizzard, 2016)

Timothy Ford's GDC 2017 talk[3] describes a custom ECS built for Overwatch. Systems run in a fixed tick order. Each system declares its component reads and writes. The ECS enables the deterministic simulation that powers Overwatch's netcode: given the same inputs, the same sequence of system ticks produces the same game state. Hero abilities that would be nightmares in a deep inheritance hierarchy (Genji's deflect interacting with every projectile type) are implemented as systems that query component combinations rather than as method overrides on a base Projectile class.

Unity DOTS

Unity's archetype ECS stores entities in 16 KiB chunks[4]. Each chunk belongs to one archetype. Within a chunk, components are laid out as one contiguous array per type: all Position structs, then all Velocity structs, then all Health structs. The Burst compiler auto-vectorizes system loops over these arrays, emitting SIMD instructions without manual intrinsics. The C# Job System schedules jobs across worker threads based on declared component access, similar to Bevy's approach but in the C# / .NET runtime.
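The chunk size puts a ceiling on how many entities one chunk can hold. The arithmetic below is an upper-bound sketch under stated assumptions: an 8-byte entity ID per row, the three example components at the sizes shown, and no chunk header or alignment padding. Real chunks reserve space for bookkeeping, so actual capacities are lower. All names are illustrative.

```cpp
#include <cstddef>
#include <cstdint>

struct Position { float x, y, z; };              // 12 bytes
struct Velocity { float dx, dy, dz; };           // 12 bytes
struct Health   { int32_t current, max; };       //  8 bytes
struct EntityId { uint32_t index, generation; }; //  8 bytes (assumed ID size)

constexpr std::size_t kChunkBytes = 16 * 1024;

// Bytes one entity occupies across all columns of this archetype.
constexpr std::size_t rowBytes() {
    return sizeof(EntityId) + sizeof(Position) + sizeof(Velocity) + sizeof(Health);
}

// Upper bound on entities per chunk, ignoring header and padding.
constexpr std::size_t rowsPerChunk() {
    return kChunkBytes / rowBytes();
}

// Byte offset where each column starts when the arrays sit back to back.
struct ColumnOffsets {
    std::size_t entities, positions, velocities, healths, end;
};

constexpr ColumnOffsets layout(std::size_t rows) {
    return ColumnOffsets{
        0,
        rows * sizeof(EntityId),
        rows * (sizeof(EntityId) + sizeof(Position)),
        rows * (sizeof(EntityId) + sizeof(Position) + sizeof(Velocity)),
        rows * rowBytes()
    };
}
```

With 40 bytes per row, at most 409 entities fit in 16 KiB, and the four columns end at byte 16,360. A wider archetype holds fewer entities per chunk, which is one reason to keep components small.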

Bevy (Rust)

Bevy[6] uses archetype storage with a multi-threaded system executor. Systems are plain Rust functions. Their parameter types encode the query: Query<(&Position, &mut Velocity)> means "read Position, write Velocity." (The component references go in a tuple; the second type parameter of Query is reserved for filters such as With<T>.) The executor builds a dependency graph from these signatures and dispatches independent systems to a thread pool. Change detection is built in: each stored component value carries the tick at which it was last changed, so systems can query "give me only entities whose Health changed since last frame."
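Tick-based change detection can be sketched as a second array that runs parallel to the component column. This is an illustration of the idea in C++, not Bevy's implementation: `HealthColumn`, `push`, `set`, and `changedSince` are hypothetical names, and tick wraparound is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct HealthColumn {
    std::vector<int>      values;
    std::vector<uint32_t> changedTick;   // tick of the last write, per row

    std::size_t push(int value, uint32_t now) {
        values.push_back(value);
        changedTick.push_back(now);
        return values.size() - 1;
    }

    // Every write goes through here so the tick is always stamped.
    void set(std::size_t row, int value, uint32_t now) {
        assert(row < values.size());
        values[row] = value;
        changedTick[row] = now;
    }
};

// Rows written after `lastRunTick`, the tick at which the querying system last ran.
inline std::vector<std::size_t> changedSince(const HealthColumn& column,
                                             uint32_t lastRunTick) {
    std::vector<std::size_t> rows;
    for (std::size_t i = 0; i < column.values.size(); ++i)
        if (column.changedTick[i] > lastRunTick) rows.push_back(i);
    return rows;
}
```

The filter is still a linear scan over the tick array, but it reads 4 bytes per entity and lets the system skip the component data of unchanged rows.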

Flecs (C/C++)

Flecs[8] is an archetype-based ECS with first-class relationships. The (ChildOf, parent) relationship pair acts as a component: entities with the same parent share an archetype, making "find all children of X" an archetype-table scan. Flecs caches query results and maintains an archetype graph where edges represent "add component C" or "remove component C" transitions, enabling O(1) archetype lookup on structural changes. Sander Mertens' "Building an ECS" blog series[8] provides the most detailed public documentation of these internals.

EnTT (C++)

EnTT[7] is the primary example of a sparse-set ECS. Each component type has its own sparse set pool. No archetype tables, no table migration on add/remove. The trade-off: iteration over multiple component types requires intersecting multiple sparse sets (iterating the smallest set and looking up each entity in the others). Used in Minecraft Bedrock Edition. Michele Caini's "ECS Back and Forth" series documents the design decisions in detail.
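The multi-component iteration described above (walk the smallest set, probe the others) can be sketched with a membership-only sparse set. `IdSet` is a compact stand-in for the SparseSet in section 5 with the component values left out, and `intersect` is a hypothetical helper; EnTT's views are considerably more elaborate.

```cpp
#include <cstdint>
#include <vector>

constexpr uint32_t kInvalid = UINT32_MAX;

// Membership-only sparse set: which entities have a given component.
struct IdSet {
    std::vector<uint32_t> sparse;   // entity ID -> dense index
    std::vector<uint32_t> dense;    // packed entity IDs

    bool has(uint32_t id) const {
        return id < sparse.size() && sparse[id] != kInvalid;
    }

    void add(uint32_t id) {
        if (has(id)) return;
        if (id >= sparse.size()) sparse.resize(id + 1, kInvalid);
        sparse[id] = static_cast<uint32_t>(dense.size());
        dense.push_back(id);
    }
};

// Entities present in both sets: walk the smaller dense array and
// probe the larger set with an O(1) lookup per entity.
inline std::vector<uint32_t> intersect(const IdSet& a, const IdSet& b) {
    const IdSet& smaller = a.dense.size() <= b.dense.size() ? a : b;
    const IdSet& larger  = (&smaller == &a) ? b : a;
    std::vector<uint32_t> out;
    for (uint32_t id : smaller.dense)
        if (larger.has(id)) out.push_back(id);
    return out;
}
```

The cost is proportional to the size of the smaller set, so a query pairing a rare component with a common one stays cheap. Each probe is a random access into the larger set's sparse array, which is the iteration overhead archetype tables avoid.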

Unreal Mass Entity

Mass Entity[10] is Epic's archetype-based ECS framework, integrated into Unreal Engine 5. Built by the AI team for crowd simulation (the "Matrix Awakens" demo). Chunk-based allocation assumes 128-byte cache lines, with 1024 lines (128 KiB) per chunk. Interoperates with Unreal's existing Actor/Component model via "Mass Entity traits" that bridge between the ECS world and traditional UObjects.

14 Pitfalls

15 What's next

16 Sources

  1. Scott Bilas. "A Data-Driven Game Object System." GDC 2002, Gas Powered Games. gamedevs.org/uploads/data-driven-game-object-system.pdf. The earliest widely cited talk on assembling game objects from data-driven components. Built for Dungeon Siege (7,300+ object types).
  2. Adam Martin. "Entity Systems are the future of MMOG development." T-Machine blog, Part 1: September 2007. t-machine.org. Five-part series that codified the entity-as-ID, component-as-data, system-as-logic taxonomy.
  3. Timothy Ford. "Overwatch Gameplay Architecture and Netcode." GDC 2017, Blizzard Entertainment. gdcvault.com. Describes the custom ECS powering Overwatch's deterministic simulation, system ordering, and netcode.
  4. Unity Technologies. "DOTS - Data-Oriented Technology Stack." unity.com/dots. Archetype-based ECS with Burst compiler and C# Job System. The Entities 1.0 package shipped in 2023.
  5. Catherine West. "Using Rust For Game Development." RustConf 2018, Closing Keynote. kyren.github.io/2018/09/14/rustconf-talk.html. Walked through OOP-to-ECS in Rust; credited with catalyzing the Rust ECS ecosystem (Bevy, Hecs, Legion).
  6. Carter Anderson et al. "Bevy Engine." bevyengine.org. Rust game engine with archetype ECS, automatic parallel system scheduling, and change detection. Source code at github.com/bevyengine/bevy.
  7. Michele Caini (skypjack). "EnTT: Gaming meets modern C++." github.com/skypjack/entt. Sparse-set ECS. Used in Minecraft Bedrock Edition. Caini's "ECS Back and Forth" series: skypjack.github.io.
  8. Sander Mertens. "Building an ECS" blog series and Flecs. ajmmertens.medium.com. Most detailed public documentation of archetype storage internals, edge graphs, and query caching. Flecs source: github.com/SanderMertens/flecs.
  9. Jeff Dean. "Numbers Everyone Should Know." Originally from a Stanford CS295 talk, c. 2009; popularized by Jonas Bonér's gist. gist.github.com/jboner/2841832. L1 cache reference ~0.5 ns, L2 ~7 ns, main memory ~100 ns. The canonical source for memory-hierarchy latency ballparks.
  10. Epic Games. "Mass Entity in Unreal Engine." Unreal Engine 5 Documentation. dev.epicgames.com. Archetype-based ECS framework for crowd simulation, chunk-allocated at cache-line-aligned boundaries.
  11. Louis Cox, Benjamin Williams, Jay Vickers, Davin Ward, Christopher Headleand. "Run-time Performance Comparison of Sparse-set and Archetype Entity-Component Systems." CGVC 2025. diglib.eg.org. Academic benchmark comparing sparse-set vs archetype ECS. Confirms archetypes excel at iteration; sparse sets at composition changes.
