Wednesday, September 14, 2016

Recovery Full or Simple

Once you've got your databases identified and organized into a spreadsheet and accomplished the preliminary cleanup, the next step to getting your act together is to determine the recovery model on each database. Well, maybe some basic instance tuning, but we can leave that for another blog post.

You only get two choices for the recovery model, full or simple, but I still can't believe how much difficulty folks have when choosing between the two. One thing you'll notice right off the bat is software developers are your worst resource for intelligently making this choice. You need to ping them for critical information to measure certain volatility features, but after that the choice really should be up to you. You're the one that will be managing disk space, backups, and restores.

There are only two key questions you need to ask yourself to make a final determination on what recovery model to choose for each database: 1) would it kill me if I was asked to restore this database to a precisely specific point in time and could only get it to the closest day? 2) Am I willing to manage log file backups and shrinks on this puppy?

If your answer to both questions is yes, then you need the full recovery model. Otherwise, use the simple model. But getting to the answer of these two questions is really the bulk of what I'd like to chat about.

Despite all the confusing things you will read about the recovery model, the gist of the difference between full and simple is the ability to perform "point in time" recovery. On the simple model you can restore whatever backup you want to your database, then any differential you ran, and you'll get the state of the database naturally from when you made that last differential backup. On the full model, you can additionally restore all or portions of the log file backup(s) to reflect the transactions that happened after your last differential.

Therein lies the rub: on simple recovery your log files don't grow much, but on full they can get quite large. On databases set to full recovery you will need to periodically shrink the log files, and of course you need to add them to your maintenance plan to back them up as well.

This is where that spreadsheet of databases is helpful. If a production database exhibits a highly combined level of Write Coupling and Software Volatility then you'll likely need to bite the bullet and set it on full recovery model. Very seldom will I set a development, QA, or staging database to this model though. A better way to manage high software volatility on a development database is to make sure your developers are using good quality source control. If a production database is written to at random times by all sorts of batch and OLTP systems or your developers are constantly rolling out production upgrades, then you'd best stick with full recovery.