In depth: Memory debugging for Unity

For the release of our latest Serious Game, Garfield’s Count Me In, we suspected we had a memory leak. After about one hour of gameplay our game crashed when playing on tablets or phones. We have released several games before using Unity. When we previously had memory leaks, they were easily traceable and fixable. For this game the problem was more spread out and some of the causes were not that obvious. What was obvious was that the used memory increased after each scene switch.

This article describes the tools and techniques we used to find and solve the memory leaks that we had. The second half of the article describes the causes of the memory leaks and how to avoid them.

Garfield’s Count Me In

Garfield’s Count Me In is a game for children to learn arithmetic. The learning method on which the game is based is specifically for children who have difficulty learning. The game adapts the exercises to the skill level of the player. This keeps the game and exercises fun and interesting.

Figure: In-game screenshot of Garfield’s Count Me In

The game is split into two parts. The first part is the planet on which Garfield and his friends are trapped. This planet is a large world through which Garfield progresses, encountering new areas with puzzles to solve. The second part consists of the exercises which the player must solve to gain resources to progress through the world.

Tools

Besides Unity’s default profiler we used two other tools for tracking memory.

Unity Memory Profiler

Figure: Unity Memory Profiler snapshot

The default profiler from Unity doesn’t give nearly enough information to get to the core of many memory problems. The first tool we used is the Memory Profiler from Unity themselves (https://bitbucket.org/Unity-Technologies/memoryprofiler). This tool allows you to take a snapshot of memory. It gives you an overview of how much memory is used by which systems and allows you to zoom in on areas to get a more detailed view. It also keeps track of the objects which are referencing the memory you have selected.

While this tool gives a good overview of memory usage at a certain time, it does not have built-in functionality to compare two snapshots.

Unity Heap Dump/WinMerge

To get a better view of the difference between two memory snapshots we used Unity Heap Dump (https://github.com/Zuntatos/UnityHeapDump). This tool dumps heap memory to text files but doesn’t look at asset memory. We made a slight modification to the folder where it stores the dump, so that it uses a unique folder for each consecutive memory dump.
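The exact change we made to Unity Heap Dump is not reproduced here. As a rough sketch of the idea, a helper like the one below picks a fresh, numbered folder for every dump. The class name DumpFolders, the method CreateNext and the "dump_000" naming scheme are made up for this illustration and are not part of the tool.

```csharp
using System.IO;

public static class DumpFolders {
    // Creates and returns "<root>/dump_000", "<root>/dump_001", ...
    // using the first index that does not exist yet.
    public static string CreateNext(string root) {
        Directory.CreateDirectory(root);
        int index = 0;
        string path;
        do {
            // Zero-padded so the folders sort in the order they were made.
            path = Path.Combine(root, "dump_" + index.ToString("D3"));
            index++;
        } while (Directory.Exists(path));
        Directory.CreateDirectory(path);
        return path;
    }
}
```

Because every dump ends up in its own folder, two of those folders can be handed directly to a folder comparison tool.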

Using WinMerge we could compare folders with different memory dumps and find the problems we were facing.

Techniques

After our initial attempts to find the memory leaks, which consisted of a divide-and-conquer strategy of eliminating candidates that might cause the leak, we decided that we needed to adjust our strategy. Here are the two most important techniques that we used.

Reduce data

The timing of measuring our memory footprint was a big factor in how effectively we could search for discrepancies. To increase the signal-to-noise ratio we added an extra stage to the scene loading process. After unloading the previous scene, we unload as much memory as possible and force the Garbage Collector to collect memory. After this we take the memory snapshots, before loading the next scene. This reduced the total amount of memory we had to investigate to find the memory that was actually leaked.
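A minimal sketch of such an extra loading stage could look like the code below. It is not our production code. It assumes a nearly empty scene named "Empty" exists in the build settings and that this component lives on an object that survives scene loads. The class name SceneSwitcher and the method SwitchTo are invented for this example, while the Unity and .NET calls used are standard ones.

```csharp
using System;
using System.Collections;
using UnityEngine;
using UnityEngine.SceneManagement;

public class SceneSwitcher : MonoBehaviour {
    // Assumption: a nearly empty scene with this name is in the build settings.
    [SerializeField] private string emptySceneName = "Empty";

    private void Awake() {
        // Keep this object alive so the coroutine survives the scene loads.
        DontDestroyOnLoad(gameObject);
    }

    public IEnumerator SwitchTo(string nextSceneName) {
        // 1. Replace the current scene so all of its objects are destroyed.
        yield return SceneManager.LoadSceneAsync(emptySceneName);

        // 2. Unload assets that are no longer referenced.
        yield return Resources.UnloadUnusedAssets();

        // 3. Force a full collection of the managed heap.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        // 4. This is the quiet moment to take a memory snapshot.

        // 5. Continue with the scene the player asked for.
        yield return SceneManager.LoadSceneAsync(nextSceneName);
    }
}
```

It would be started with StartCoroutine(switcher.SwitchTo("SomeScene")). Whatever is still in memory at step 4 is either persistent by design or a leak candidate.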

Basic Technique

Our basic technique for taking memory snapshots was to make a minimal build of the game. After starting the game, we followed a simple routine.

  1. Play the game up until the part you want to test
  2. Leave the scene and return to the scene in question. This ensures that all persistent memory that needs to be loaded is in memory before our first snapshot
  3. Play the part you want to test
  4. Leave the scene. When the scene is unloaded take a memory snapshot
  5. Return to the scene in question and repeat from step 3.

We would repeat this as often as needed.

Results

We found several causes for the memory leaks. The most common one was that persistent data kept GameObjects alive through events. These were the most important cases. I included a minimal example of the problem where possible.

OnDestroy not called in Base class

In Unity scripting, there are several event functions that get executed in a predetermined order as a script executes, such as Awake, Start, Update and OnDestroy. When a MonoBehaviour is destroyed, OnDestroy is called. It is however not immediately obvious how it is called. From a pure C# standpoint MonoBehaviour would contain a virtual method which is overridden in the derived class, but this is not how Unity implemented it. Instead, for each class derived from MonoBehaviour, it checks which of these functions are present and stores a reference to each of them. If a parent and a derived class both implement such a function privately, only the reference to the function in the derived class is stored.

This introduced a problem when we derived an abstract class from MonoBehaviour which holds a structure with references to other GameObjects. This structure got cleared in OnDestroy. During initial development and testing we never encountered the situation where we would derive a class which also implements an OnDestroy. We did however implement this for Garfield’s Count Me In. Only the OnDestroy method in the derived class was called and not the one of the abstract parent class.

This problem was easily fixed by making OnDestroy a protected virtual method in this abstract class. The compiler then generates a warning when a derived class also implements this method without using override.

using UnityEngine;

public class BaseClass : MonoBehaviour {
    private void OnDestroy() {
        ClearReferences();
    }

    protected void ClearReferences() {
        // Release the references to other GameObjects here.
    }
}

public class DerivedClass : BaseClass {
    // By implementing this method, BaseClass.OnDestroy is not
    // called any more: Unity only calls the OnDestroy of the
    // most derived class. This is easy to miss.
    private void OnDestroy() {
        // Do stuff
    }
}
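For comparison, a sketch of the fixed version following the description above is shown here. ClearReferences is a placeholder for whatever cleanup the base class needs, and the derived class has to remember to call base.OnDestroy() itself.

```csharp
using UnityEngine;

public abstract class BaseClass : MonoBehaviour {
    // Protected and virtual: a derived OnDestroy without 'override'
    // now produces a compiler warning about hiding this member.
    protected virtual void OnDestroy() {
        ClearReferences();
    }

    protected void ClearReferences() {
        // Placeholder: release references to other GameObjects here.
    }
}

public class DerivedClass : BaseClass {
    protected override void OnDestroy() {
        // Do the cleanup that is specific to the derived class...

        // ...and then explicitly run the cleanup of the base class.
        base.OnDestroy();
    }
}
```

Unity still calls only the most derived OnDestroy, so the call to base.OnDestroy() is what keeps the base cleanup alive.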


External access to data

We all know encapsulation is important, but for some reason it still slips through the cracks sometimes. We have UI elements which display formatted text from multiple sources. If one of these sources changes its text, all the UI elements need to be informed of this. So, when a source is added to the UI element, the UI element registers to the change event of the source text.

Somewhere along the way we updated this system, but some legacy code remained behind, which allowed the source element to be changed without unregistering the UI element from the change event.

In this case the source elements are always ScriptableObjects, which are persistent in memory. This means that each time this happened, the ScriptableObject would keep the reference to the UI element during the whole lifetime of the game.

This problem was solved by fixing the encapsulation and unregistering the UI element each time a source element was replaced or removed.

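using System.Collections.Generic;

public class UIElement {
    // By having this List public, an external component can
    // remove items without unregistering the OnChanged event!
    public List<Source> sources = new List<Source>();

    public void Add(Source source) {
        sources.Add(source);
        source.OnChanged += OnChanged;
    }

    public void Remove(Source source) {
        sources.Remove(source);
        source.OnChanged -= OnChanged;
    }

    // Called whenever one of the sources changes its text.
    private void OnChanged() {
        // Rebuild the formatted text here.
    }
}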
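One way to close this hole is sketched below in plain C#, so it can be tried outside Unity. The Source class here is only a stand-in for our ScriptableObject, and members such as HandlerCount, RefreshCount and Clear are added purely for this illustration. The point is that the list is private and every path that removes a source also unregisters the handler.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the persistent ScriptableObject from the article.
public class Source {
    public event Action OnChanged;

    public void RaiseChanged() { OnChanged?.Invoke(); }

    // Number of registered handlers, exposed only for this demo.
    public int HandlerCount => OnChanged?.GetInvocationList().Length ?? 0;
}

public class UIElement {
    // Private list: outside code cannot bypass Add and Remove.
    private readonly List<Source> sources = new List<Source>();

    // Read-only view for code that only needs to look at the sources.
    public IReadOnlyList<Source> Sources => sources;

    public int RefreshCount { get; private set; }

    public void Add(Source source) {
        sources.Add(source);
        source.OnChanged += HandleChanged;
    }

    public bool Remove(Source source) {
        if (!sources.Remove(source)) return false;
        source.OnChanged -= HandleChanged;
        return true;
    }

    // Unregisters from every source, for example when the element is destroyed.
    public void Clear() {
        foreach (Source source in sources) source.OnChanged -= HandleChanged;
        sources.Clear();
    }

    private void HandleChanged() { RefreshCount++; }
}
```

After Remove or Clear the persistent source no longer holds a delegate that points at the UI element, so the element can be collected.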

Register Events twice

Even when encapsulation is done correctly, in this case by calling delegates through an event, things can go wrong. From memory profiling we saw that certain objects were being kept in memory. The only plausible reference to these objects was kept through an event. Through debugging we verified that we removed the delegate from the event, but the object was still kept in memory. It turned out that, through a mistake in refactoring, the delegate was added to the event on two different occasions, but only removed once.

using UnityEngine;

public class Observer : MonoBehaviour {
    public Subject subject;

    private void Awake() {
        subject.onChanged += Foo;
    }

    private void Start() {
        // Second registration of the same handler.
        subject.onChanged += Foo;
    }

    private void OnDestroy() {
        // The handler is removed once, but it
        // was registered twice. One reference remains!
        subject.onChanged -= Foo;
    }

    private void Foo() {
        // React to the change here.
    }
}
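A common defensive idiom against this kind of mistake, shown here as a plain C# sketch rather than as the fix we shipped, is to remove the handler right before adding it. Removing a handler that is not registered does nothing, so the handler ends up in the invocation list at most once. Subject.HandlerCount and the method names on Observer are only there to make the effect visible.

```csharp
using System;

public class Subject {
    public event Action onChanged;

    // Number of registered handlers, exposed only for this demo.
    public int HandlerCount => onChanged?.GetInvocationList().Length ?? 0;
}

public class Observer {
    private readonly Subject subject;

    public Observer(Subject subject) { this.subject = subject; }

    // Every call adds another entry to the invocation list.
    public void SubscribeNaively() { subject.onChanged += Foo; }

    // Remove first, then add: at most one entry remains,
    // no matter how often this is called.
    public void Subscribe() {
        subject.onChanged -= Foo;
        subject.onChanged += Foo;
    }

    // Removes a single entry per call.
    public void Unsubscribe() { subject.onChanged -= Foo; }

    private void Foo() { }
}
```

With SubscribeNaively called twice, one Unsubscribe still leaves a handler behind. With Subscribe called twice, one Unsubscribe is enough.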


DontDestroyOnLoad classes

We have a collection of classes that are persistent between scenes. These classes present a risk, because they can keep alive references to GameObjects in scenes that are already unloaded. Most of the problems described above were also linked to persistent classes. For now, we don’t have a solution for this. To mitigate the risk, we keep the number of persistent classes as low as possible and have a more rigorous review process for these classes.

Garbage Collection Quirks

For game development, garbage collection is a mixed blessing. If too much garbage is generated during gameplay, it will cause lag spikes. The advantage is that you do not have to keep track of memory yourself. In this case, however, it seemed the Garbage Collector didn’t keep track of the memory either.

We used a quad tree structure, with nodes referring back to the quad tree. For some reason, when we unreferenced the quad tree, the Garbage Collector didn’t pick this up. The tree kept the nodes alive and vice versa. Our best guess at this moment is that extensive use of generics in this structure led to a failure of the Garbage Collector. Implementing a method which releases all references between these two structures solved the problem. Unfortunately, we are still not sure why this occurred in the first place.
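Our actual quad tree is not shown here. The sketch below only illustrates what such a release method can look like on a simplified, hypothetical QuadTree<T> and QuadNode<T>: it walks the tree and cuts the links in both directions.

```csharp
using System.Collections.Generic;

public class QuadTree<T> {
    public QuadNode<T> Root { get; private set; }

    public QuadTree() { Root = new QuadNode<T>(this); }

    // Cuts the tree -> node links after the nodes have cut theirs.
    public void Release() {
        if (Root != null) {
            Root.Release();
            Root = null;
        }
    }
}

public class QuadNode<T> {
    // Back-reference from the node to the tree that owns it.
    public QuadTree<T> Tree { get; private set; }
    public List<QuadNode<T>> Children { get; } = new List<QuadNode<T>>();
    public List<T> Items { get; } = new List<T>();

    public QuadNode(QuadTree<T> tree) { Tree = tree; }

    public QuadNode<T> AddChild() {
        var child = new QuadNode<T>(Tree);
        Children.Add(child);
        return child;
    }

    // Releases the whole subtree, then drops the node -> tree link.
    public void Release() {
        foreach (var child in Children) child.Release();
        Children.Clear();
        Items.Clear();
        Tree = null;
    }
}
```

Calling Release() on the tree before dropping the last reference to it leaves no links between the tree and its nodes.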

Conclusion

When we started this process, we hoped that we would find a single problem with a single solution. We have released several games before, and when we had memory leaks they were easily traceable. This case was different. Memory leaks were present in different systems that were introduced at multiple stages of the development process, although the most common cause was linked to the use of events.

To be able to track these kinds of leaks more proactively, we have created a tool which continuously checks the memory in the background during the development process. When returning to a previously visited scene, it checks whether the memory use has grown. If so, it gives the user a message. This way we get alerted to the problem the moment we introduce a leak.
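The tool itself is not part of this article. To give an idea of the principle, here is a small sketch in plain C#. SceneMemoryWatcher and all of its members are invented names. How memory is read and how the user is warned are passed in from outside, and the tolerance is an assumption to avoid warnings about small fluctuations.

```csharp
using System;
using System.Collections.Generic;

public class SceneMemoryWatcher {
    // Memory use recorded at the most recent visit of each scene.
    private readonly Dictionary<string, long> lastSeen = new Dictionary<string, long>();
    private readonly Func<long> readMemory;
    private readonly long toleranceBytes;
    private readonly Action<string> warn;

    public SceneMemoryWatcher(Func<long> readMemory, long toleranceBytes, Action<string> warn) {
        this.readMemory = readMemory;
        this.toleranceBytes = toleranceBytes;
        this.warn = warn;
    }

    // Call at the same point of every scene switch.
    // Returns true when memory grew beyond the tolerance since the last visit.
    public bool OnSceneEntered(string sceneName) {
        long current = readMemory();
        bool grew = false;
        long previous;
        if (lastSeen.TryGetValue(sceneName, out previous)
                && current > previous + toleranceBytes) {
            grew = true;
            warn(sceneName + ": memory grew by " + (current - previous)
                + " bytes since the last visit");
        }
        lastSeen[sceneName] = current;
        return grew;
    }
}
```

In a Unity project the reader could be something like () => System.GC.GetTotalMemory(false) and the warning could go to Debug.LogWarning. The measurement is most useful when it is taken right after the cleanup stage of the scene loading process.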
