I've been working on developing a new database CI tool suite, and was talking with a friend about his DB woes. We talked about what problems he was having, where they occurred, and how long they were taking to resolve. His database was really simple. The scripts were pretty much always correct. But some scripts might be forgotten, and his problems usually revolved around tracking down missing changes.
I realized my solution didn't fit his problems at all. I had never worked in an environment that was all that simple. There were always mistakes in the scripts. And it was always painful to correct them. When things went wrong, it was like the DB became the black hole of engineering hours, sucking away everyone's time. We'd try to repair the mistake by hand. Or if we couldn't figure out how to repair it, we'd have to restore from a snapshot of production. And while all these repairs were going on, the engineering department was pretty much down.
Making mistakes was so painful. And trying to use any of the database migration tools out there didn't seem to solve my problems. In some cases, it even made them worse. I always ended up resorting to rolling my own custom database tools.
So that got me thinking... how do you decide what kind of solution you need?
While there are many complex factors that drive DB development costs, there are two that seem to characterize the problem space quite well: frequency of mistakes, and cost of recovery.
When mistakes are rare, and recovery costs are cheap, existing migration tools are the perfect solution. Most of the effort is spent in keeping databases up to date, and making sure all the scripts are deployed as the changes evolve. Migrations are excellent at solving that problem, and handle it beautifully.
When mistakes are costly, migration tools can still be very helpful, but depending on how rarely mistakes occur, and how costly they are, migrations might not be enough. Augmenting a simple migration tool might be a good option.
When mistakes are frequent, there's an entirely different kind of problem going on. Developers aren't struggling with deploying the scripts, they're struggling to create correctly working scripts. Migration tools tend to be intolerant of deploying and recovering from broken scripts. And the tools can make recovery even more cumbersome by imposing additional constraints.
When frequent mistakes are also expensive to resolve... well, life is pain. There's really only two strategies to a less painful life--figure out how to make fewer mistakes, or figure out how to make mistakes easier to recover from.
I haven't found much help in this space in either the open source or commercial market, which is why I set out to try to fill that gap. And I've built custom tools for solving similar problems several times over now, and had the opportunity to make a lot of mistakes. Now I get to do something with all that learning, and hopefully help reduce some of the pain out there. :)