Intrusive Lists in Doom 3

Doom 3 was released in 2004, which I can't believe was a decade ago. Great game, atmospheric and scary, and a worthy successor of the first two games in the series. Several years after the initial release, in 2011, Doom 3's source code was published on Github. It can be fun to poke around the repository to learn how things are done in video game land. In this post, we'll look at how Doom uses linked lists.

It's a bit long-winded, so I split the post into sections for convenience:

Instruive vs. Non-intrusive

In general, Doom avoids using the C++ Standard Template Library, but the case of linked lists is especially interesting. Doom doesn't just provide a replacement container with functionality similar to std::list, it has something conceptually different - an intrusive list.

std::list is non-intrusive. It means that you can store pretty much anything in it without having to do any additional work. For example, if you have a structure like this:

  
    struct Point2D {
      float x;
      float y;
    };
  
  

it's trivial to create list of points and store a bunch of points in it:
  
  std::list point_list;
  point_list.emplace_back(Point2D{1.0f, 0.1f});
  point_list.emplace_back(Point2D{0.0f, 0.1f});
  
  

Non-intrusive lists are easy to understand because they are intuitive. They conveniently conform to the abstract idea of a list as a thing that you can "put" other things into.

Intrusive lists are a bit trickier than that. Instead of being stored separately, the pointers to the next and previous elements are stored within your structure. Like the name suggests, an intrusive list intrudes into your structure's definition, like this:

  
  struct Point2D {
    float x;
    float y;
    PointListNode list_node;
  };
  
  

If you want your structure to be in two lists at the same time, you have to add another field:
  
  struct Point2D {
    float x;
    float y;
    PointListNode list_node;
    PointListNode another_list_node;
  };
  
  

The structure for PointListNode might look something like this (oversimplified):
  
  struct PointListNode {
    PointListNode *next;
    PointListNode *prev;
    Point2D *owner;
  };
  
  

Now, this looks like it makes our code less generic and kinda weird, but it has some important benefits, which we're going to discuss below.

Benefits of Intrusive Lists

Less Memory Management

First of all, consider how a non-intrusive list like std::list is likely to be implemented. Under the hood, it will have some structure, like "ListNode" which is comprised of the object that the list holds as well as pointers to the next and previous elements. Something like this:

  
  template 
  struct ListNode {
    T object;
    ListNode *next;
    ListNode *prev;
  };
  
  

What happens when you add items to a non-intrusive list? A ListNode structure silently gets created and initialized. Of course, it must occupy some memory, so that memory has to be dynamically allocated. Thus, an insertion operation results in a memory allocation being made behind the scenes. We have a similar situation with removal: every time you remove your object from the list, the corresponding ListNode has to be cleaned up, which is another call to the memory manager.

Now imagine a situation when you are playing your favorite shoot-em-up, and you just got an awesome power-up for your machine gun that makes it shoot eight projectiles in different directions at once, at a higher rate. Now you can unleash a furious torrent of deadly fire on your digital opponents, but let's think about what's going on behind the scenes. There's a ton of projectiles on screen, and each of them is probably a member of multiple lists, for example : list of all entities, list of renderable things, list of moving things, and possibly others. Every time a new projectile is created it has to be added to a bunch of lists. There can be a lot of projectiles, they appear and disappear quickly, and every time that happens we incur the cost of allocating/deallocating memory multiple times.

With intrusive lists, you don't have to pay that price, because the space for your list nodes is already there within your object, so the allocation is done only once.

Easier Removal

In the preceding scenario, a projectile can be a member of several lists. Projectiles get destroyed very very often (fly off-screen, explode on contact) and as a result, they have to be erased from their list. A big plus of intrusive lists is that if you have to remove your element from multiple different lists, you don't have to go and search each list individually, since each membership is represented by a field in your object.

Less Dereferencing

Suppose we have a list, and we want to do something with every object within that list.

Here's what happens when we use a non-intrusive list (note that in this example I assume the worst-case scenario: every time we dereference a pointer we get a cache miss).

  • We have a ListNode structure. Since our objects can be members of multiple lists, the ListNode contains a pointer to the object, not the object itself. So we have to dereference the pointer, fetch the object from memory, and do our stuff.
  • Next, we dereference the "next" pointer of our ListNode structure and load the next ListNode from memory.
  • Repeat from step 1 unless we're at the end of the list

Here's what happens when we use an intrusive list:

  • We have the object. No need to dereference anything at this point, just do our stuff.
  • Dereference the appropriate "next" pointer, get the next object from memory.
  • Repeat from step 1 unless we're at the end of the list.

This halves the amount of potential expensive cache misses.

The above is incorrect because I forgot the part where you have to dereference the "owner" pointer of the intrusive list's node.

Looking at Doom's Implementation

Doom's implementation can be found here. We have a class template idLinkList (which represents the intrusive list node), and the only template parameter is called "type". Let's look at the private data members (comments added by me):


  
  // Pointer to the first node in the list.
  idLinkList * head;

  // Pointer to the next node.
  idLinkList * next;

  // Pointer to the previous node.
  idLinkList * prev;


  // Pointer to the object holding this node.
  type *owner;
  
  

The default ctor initializes owner to NULL and sets next, prev and head to "this". This is an important point. Doom's linked list is not only intrusive, it is also circular. It means that the "next" pointer of last element in the list points to the list's "head" node (and the head's "prev" points to the last element). An empty list consists of a "head" node with its "next" and "prev" pointers pointing to the head node itself. The head can't be removed from a list using idLinkList::Remove and can be thought of as a "handle" for the list. Thus, the default ctor creates an empty list. I think there actually is a reason for the list to be circular and I'll try to explain it below.

Let's take a look at the most important functionality of idLinkList- adding and removing things. First, we have the "InsertAfter" method, which inserts the node into a list after a given node.


  
  template< class type >
  void idLinkList::InsertAfter( idLinkList &node ) {
    Remove();
    prev = &node;
    next = node.next;
    node.next = this;
    next->prev = this;
    head = node.head;
  }
  
  

This should be fairly obvious. First of all, the node is removed from its current list, and then we update some pointers. The Remove() method is also pretty short:
  
  template< class type >
  void idLinkList::Remove() {
    prev->next = next;
    next->prev = prev;
    next = this;
    prev = this;
    head = this;
  }
  
  

The interesting thing to note here is how there are no "if" statements in these functions. This is because of the list's circular nature: the next and prev pointers always point to something, whereas in a non-circular there would be cases when "next" and "prev" are null, which need to be checked for.

There are two nice things about this approach. First, it leads to really simple and segfault-proof code. Second, it produces code without any jump instructions. This makes branch misprediction impossible, therefore avoiding a potential stall in the CPU's instruction pipeline. If this sounds like mumbo-jumbo to you, here's a more detailed explanation: modern processors don't execute just one instruction at a time. At any given moment, there are multiple instructions "in flight", at different stages. For (a very simple) example: once the first instruction has been decoded and started executing, the processor can start decoding the second one while the first one is still executing. This boosts performance, but the problem arises when there is a conditional jump in the code. The processor has to know whether the jump will be taken before actually executing it, in order to decide which instructions to start shoving into the pipeline. Modern processors have a special component called branch predictor that essentially makes this decision, except it doesn't always make the right prediction. A branch misprediction results in a performance hit. The functions we've just looked at avoid branching altogether, and therefore can never cause a misprediction.

Some Links

UPDATE: some people have pointed me to this great article that explains why vectors of pointers have better perf (see the "Linked Structures" section).


Like this post? Follow this blog on Twitter for more!