Finding Memory Leaks in Ruby

November 27th, 2006

If you suspect you have a memory leak in a Ruby on Rails app, you’re probably going to have a hard time:

  1. Proving that you do have a leak
  2. Finding it

(The “fixing it” part is probably easy.) I had to do both because of a memory leak I suspected in wishlisting, and found very little help on the web on how to go about it. A Google search turned up two possibly helpful hints. One is this blog post by Scott Laird, which offers a script that can dump all of your in-memory strings to a file every 10 seconds. I spent a good amount of time playing with this, and generally concluded (as many of the comments suggest) that I couldn’t make much sense of the output. There were thousands of differences, and I didn’t know which were “ok” and which were signs of problems. Another possibility is this commercial tool for watching Ruby’s memory. I didn’t try it at all because it was Windows-only, which meant I’d be installing and testing my app on a non-production platform, and also because the screenshots made me think I’d be in much the same place as I was with Scott’s tool. The tools simply aren’t as refined as those you’ll find in the Java or C++ worlds.

So… I set off to solve this problem my own way. Step 1 was proving that I had a leak. To do that, I wanted my app to run in as close to a production environment as possible. This meant that it had to run in Mongrel (not some test container), on Linux, and in production mode. Like Java, Ruby manages memory with an automatic garbage collector, and memory gets collected whenever the runtime decides it should be. This level of mystery hardly lends itself to a repeatable test, so I took a cue from the Java profiling tools and decided that explicitly running the garbage collector after each request would be the simplest way to identify what’s not being collected. Using httperf I could trivially execute repeated requests against any of my actions in a very black-box way, and using the output from ps I could see how much memory was in use by the ruby process.
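As a rough illustration of the "watch the process with ps" idea, here is a minimal sketch in Ruby. The helper names (parse_rss_kb, rss_kb) are made up for this example, and it assumes a Unix-like system whose ps accepts `-o rss=` and `-p PID`; it is not the script used for the tests described here.

```ruby
# Hypothetical helpers, not from the original script.

# Turns the raw text printed by `ps -o rss= -p PID` into an Integer (KB).
def parse_rss_kb(ps_output)
  Integer(ps_output.strip)
end

# Asks ps for the resident set size of a process, in kilobytes.
# Assumes `ps` is on the PATH and supports the -o and -p options.
def rss_kb(pid = Process.pid)
  parse_rss_kb(IO.popen(["ps", "-o", "rss=", "-p", pid.to_s], &:read))
end

# Example use inside the process being measured:
#   GC.start                      # collect first, so garbage is not counted
#   puts "RSS after GC: #{rss_kb} KB"
```

One caveat worth keeping in mind: forcing a collection frees Ruby objects, but the interpreter does not necessarily hand that memory back to the operating system, so RSS is best read as a trend over many requests and not as an exact count of live objects.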

So I appended GC.start to the end of the action I wanted tested, fired up mongrel, and started sending HTTP requests at it. This was all scripted (a script I may eventually post once I really get it working the way I’d like) so that mongrel would start on a non-standard port, httperf would send one request to get a baseline, and then httperf would send repeated requests to the same action to monitor growth. The process’s RSS (resident set size) was measured before and after the test via ps.
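The flow just described (one baseline request, an RSS reading, a batch of requests, a second RSS reading) might be sketched like this. It is only an approximation of the idea, not the actual script: the method names, the port, the path and the pid file are all hypothetical, and the RSS reader and request sender are passed in so the flow can be tried without a live server.

```ruby
# Builds an httperf command line that opens `requests` connections to one URI.
# --server, --port, --uri and --num-conns are standard httperf options.
def httperf_command(port:, uri:, requests:, server: "localhost")
  ["httperf", "--server", server, "--port", port.to_s,
   "--uri", uri, "--num-conns", requests.to_s]
end

# Sends one warm-up request, then `requests` more, and returns RSS growth in KB.
# read_rss:      callable returning the server's current RSS in KB
# send_requests: callable taking the number of requests to send
def measure_growth(requests:, read_rss:, send_requests:)
  send_requests.call(1)        # baseline: let the app load classes and caches
  before = read_rss.call
  send_requests.call(requests) # the batch whose growth we care about
  read_rss.call - before
end

# Example wiring (port, path and pid file are assumptions for illustration):
#   pid = Integer(File.read("log/mongrel.pid"))
#   growth = measure_growth(
#     requests: 15,
#     read_rss: -> { Integer(`ps -o rss= -p #{pid}`.strip) },
#     send_requests: ->(n) {
#       system(*httperf_command(port: 3001, uri: "/search?q=ipod+nano", requests: n),
#              out: File::NULL)
#     }
#   )
```

Taking the first reading only after the warm-up request matters: the first hit on a Rails action loads code and fills caches, and that one-time growth would otherwise look like a leak.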

I was pleased to see that in the basic case (requesting our home page) the difference was consistently 0KB of memory leaked. It gave me some comfort that 1) we weren’t leaking everywhere, and 2) the test is capable of producing a zero result. I moved on to the search page, which can have a lot of stuff on it. I hammered on the search phrase “ipod nano” in the same test framework. The RSS grew with every batch, at roughly 46KB per request (given a sample size of roughly 15 requests per batch of tests). That number seemed quite high, and it was a red flag that something was not right. I considered this proof enough that something was leaking.
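The arithmetic behind a per-request figure is simple enough to write down. The sketch below uses invented method names and invented batch readings (the raw RSS values from the real tests are not given here); it only shows how a before/after pair and a request count turn into a KB-per-request number, and why several batches are more convincing than one.

```ruby
# Growth per request for one batch, in KB.
def growth_per_request(before_kb, after_kb, requests)
  (after_kb - before_kb).to_f / requests
end

# True only when there is at least one batch and every batch grew.
# A single noisy batch is weak evidence; repeated growth is a red flag.
def consistently_growing?(batches)
  !batches.empty? &&
    batches.all? { |before_kb, after_kb, requests| growth_per_request(before_kb, after_kb, requests) > 0 }
end

# Illustrative readings only: [rss_before_kb, rss_after_kb, requests]
batches = [
  [50_000, 50_690, 15],
  [50_690, 51_380, 15],
]
batches.each do |before_kb, after_kb, requests|
  puts "#{growth_per_request(before_kb, after_kb, requests)} KB per request"
end
puts "leak suspected: #{consistently_growing?(batches)}"
```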
The next step was to find out what was causing the leak. I did this the brute-force way: I yanked out code until the leak went away, and added code back until the leak came back. Because this involves a lot of trial and error, it is crucial that you have a test framework set up so that you can fire off a battery of tests with minimal effort and see if the results change. In the end, I found that this bit of code in a helper method was causing the leak:

newdimensions = Hash[:width=>constrainW, :height=>constrainH]

It was a local variable which was created inside the helper to store the height and width of an image. Since it always had only two entries, it probably didn’t need to be a hashtable - I just prefer the look of code that has words for indexes instead of numbers. So, I switched the code over to an Array:

newdimensions = [constrainW, constrainH]

And the leak went away. I don’t fully understand why this was causing a problem, although there is some talk of bugs in Ruby’s implementation of constructing Hashes.
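Once a suspect line has been found, one way to poke at it outside the app is to run it in a loop and count how many objects of that type survive a forced collection. The sketch below is an assumption-laden illustration, not something from the original investigation: surviving_hashes is a made-up name, the values stand in for constrainW and constrainH, and ObjectSpace.count_objects only exists in Ruby versions much newer than the one discussed here. Whether the leak reproduces at all will depend on the Ruby version, so no result is claimed.

```ruby
# Runs the block, forces a collection before and after, and returns the change
# in the number of live Hash objects. A small number is expected noise; a number
# close to the loop count would suggest the hashes are being retained.
def surviving_hashes
  GC.start
  before = ObjectSpace.count_objects[:T_HASH]
  yield
  GC.start
  ObjectSpace.count_objects[:T_HASH] - before
end

growth = surviving_hashes do
  10_000.times do |i|
    # Stand-in values for constrainW and constrainH.
    Hash[:width => i, :height => i * 2]
  end
end
puts "Hashes still alive after GC: #{growth}"
```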

So, that’s my story. I’d like to beef up the script so that it can explore more parts of an application and test everything for memory leaks, but I haven’t gotten there yet. At that point, I will probably release the script for public consumption. Until then, consider this post one resource among the very few that seem to be out there on how to detect memory leaks in Ruby.

2 Responses to “Finding Memory Leaks in Ruby”

  1. dave Says:

    Have you tried writing better code?

  2. cunheise Says:

    thx, your article is very good.
