me: Your regularly scheduled reminder that Database Cleaner’s
deletionstrategy is so much faster than the
my coworker: You should write a blog post about it!
— from Twitter
One of the Rails apps I maintain at work has always had an amazingly slow Cucumber test suite. And when I say amazingly slow, I mean really amazingly slow:
91 scenarios (91 passed) 545 steps (545 passed) 14m50.612s
It’s so bad that I don’t run the Cucumber tests, ever. If it’s going to fail, our CI server will pick it up - I won’t. No-one has ever known why they were slow - they were nearly all written several years ago, when Cucumber was just becoming popular, and it seemed to be taken for granted that it was just ‘one of those things’.
When doing some major refactoring lately I needed to run the Cucumber tests repeatedly, while rewriting some of them and fixing others. I thought I was going to lose my mind while waiting for them, so I decided to take the time out to look at why they were so slow.
I started with the
env.rb file, which holds the configuration for the test suite. Abridged, it looked like this:
It looked fairly innocent (apart from “those pesky tests that don’t work with capybara-webkit”), but my eyes zeroed in on the Database Cleaner configuration.
I have a love-hate relationship with Database Cleaner, especially in the last few months. I’ve done a lot of reading about why it works sometimes and not others, and what’s the best way to configure it, including this post by Avdi Grimm. One article I read (I thought it was that one, but apparently not…) mentioned that some people get better results with the
deletion strategy, while some people swear by
truncation. So I thought I’d try the
deletion strategy instead - what harm could it do, except perhaps waste another 15min of my time?
Changing that one configuration option had an astonishing effect:
91 scenarios (91 passed) 545 steps (545 passed) 5m10.963s
That’s a saving of nearly 10 minutes on a 15min test suite. But why is it such a radical difference???
This app is using PostgreSQL as its data store, and I suspect that the two commands run by the two different strategies (
DELETE FROM products vs
TRUNCATE products) don’t do the same thing in PostgreSQL. This post on StackOverflow (and the subsequent discussion on the PostgreSQL performance mailing list) is a great explanation of exactly what the two commands do, and how they differ.
The tl;dr version of the threads are as follows:
TRUNCATEwill give you back your tables in near-pristine state. The table gets emptied, the index is reset, the foreign keys are updated, a vacuum is run, and everything is back to as if you had just run a
CREATE TABLEcommand. It has virtually a fixed cost per table, no matter how much data is in it.
DELETEis quicker and dirtier - it just deletes the data, nothing more. Indexes aren’t reset, foreign keys arent updated, vacuuming isn’t run, and you’re left with your table in a non-pristine state. The speed of running a
DELETEis also very dependent on the amount of data in it.
TRUNCATE would be the better approach, for database consistency, efficiency and cleanliness, but for our testing purposes, do we really need it? We don’t.
- Each of our tests will only create a couple of rows in a couple of tables, meaning the
DELETEto empty them will be very very fast, much faster than a
- The indexes and vaccuuming are irrelevant - PostgreSQL will run its automated vaccuums often enough that this will never cause an issue.
- The foreign keys would be an issue if we weren’t emptying the entire database at the same time, after every single test. Because we are emptying the entire database, we won’t have any dangling foreign key references to cause problems.
So for all uses of Database Cleaner, I believe that the preferred cleaning strategy in all of your Rails apps should be
truncation. See how much time you can save in your test suite runs!