
6 Letters for SQL Disaster Emergencies : RPO and RTO

 


As I discussed previously, the most important part of any database system is the data itself. To secure the data properly, you need to back it up regularly with a proper backup plan.

Let’s say you hit a disaster and the last good backup is 10 hours old. You are going to lose 10 hours’ worth of data even if you can restore it instantly. How much data that is depends on which 10 hours you are dealing with: your busiest business hours, or a relatively quiet period? The maximum window of data loss the business agrees to tolerate is what we call the Recovery Point Objective (RPO).

Now consider restoring the 10-hour-old backup that you have in hand. How much time do you need to get hold of the last good backup? How long does the restore take? How long does it take to change the app connection string and other configuration? In short, how long are the end users going to wait before they can access the database again? The target for that wait is what we call the Recovery Time Objective (RTO).

RPO and RTO are two very important topics for any Database Management System. Setting RPO and RTO poorly can lead to devastating business situations. Today we will try to understand RPO and RTO by answering some questions.

  • What are RPO and RTO?
  • How can we set RPO and RTO for our business?
  • What are the technologies and tools involved in achieving the RPO and RTO we have set?
  • How much cost is involved here?
  • How can we automate our backups and more importantly, restores properly?

Hang on tight, we are going to cover them one by one.

What are RPO and RTO?

RPO

The recovery point objective is how much data, at most, you can afford to lose when you hit a disaster. We often think it's measured in megabytes. No. It's measured in time. If your last good backup is 10 hours old when disaster hits, you lose 10 hours of data, so backups taken 10 hours apart can support an RPO of 10 hours at best. The actual volume of data lost depends on when the disaster takes place.
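To make the "measured in time" point concrete, here is a minimal sketch in Python that finds the worst-case data-loss window from a list of backup completion times. The schedule and the function names (`worst_case_data_loss`, `meets_rpo`) are made up for illustration and do not come from any SQL Server tool.

```python
from datetime import datetime, timedelta


def worst_case_data_loss(backup_times):
    """Return the longest gap between consecutive backups.

    A disaster just before the next backup loses everything written
    since the previous one, so the longest gap is the worst case.
    """
    ordered = sorted(backup_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return max(gaps, default=timedelta(0))


def meets_rpo(backup_times, rpo):
    """True if no gap between backups exceeds the RPO target."""
    return worst_case_data_loss(backup_times) <= rpo


# Hypothetical backup schedule for one day (assumption, not real data).
backups = [
    datetime(2024, 1, 1, 0, 0),
    datetime(2024, 1, 1, 6, 0),
    datetime(2024, 1, 1, 16, 0),
    datetime(2024, 1, 1, 22, 0),
]

print(worst_case_data_loss(backups))             # 10:00:00
print(meets_rpo(backups, timedelta(hours=4)))    # False
```

The 06:00 to 16:00 gap is what decides the result here: the other backups are only 6 hours apart, but a schedule is only as good as its longest gap.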

RTO

The recovery time objective is how much time you are allowed to take to restore the last good backup and bring your database back online. Like RPO, it's measured in time.

How can we set RPO and RTO for our business?

RPO

You need to ask yourself how much data your business is comfortable losing in case of a disaster. It's really a matter to discuss with the business stakeholders, but you need to work out a figure yourself before discussing it with them. I am warning you: your business stakeholders are going to say that they don't want to lose any data at all. Hold on tight, I am going to show you how to manage your business stakeholders so that they either agree to lose some data (which means declining to spend more) or agree to spend more (which means declining to lose data). More on that in a later section.

RTO

You can estimate RTO by answering the following questions and summing up the results:

  • How much time to get access to the last good backup?
  • How much time to move the backup to the destination server?
  • How much time is needed to restore the database?
  • How much time to make app configuration changes?

Sum all these times and allow some buffer, since you never know what other challenges you are going to face while bringing the database back online. It's hugely stressful, trust me, since the end users are waiting for you.
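The sum above can be sketched in a few lines of Python. The step durations and the 25% buffer are placeholder assumptions; measure your own steps during a test restore and pick a buffer you are comfortable with.

```python
from datetime import timedelta


def estimate_rto(steps, buffer_ratio=0.25):
    """Sum the recovery steps and add a safety buffer on top.

    `steps` maps a step description to its expected duration.
    `buffer_ratio` is the extra share of time reserved for surprises
    (0.25 means 25%); it is an assumed default, not a standard.
    """
    total = sum(steps.values(), timedelta(0))
    return total + total * buffer_ratio


# Hypothetical durations (assumptions for illustration only).
recovery_steps = {
    "get access to the last good backup": timedelta(minutes=15),
    "move the backup to the destination server": timedelta(minutes=30),
    "restore the database": timedelta(minutes=60),
    "make app configuration changes": timedelta(minutes=15),
}

print(estimate_rto(recovery_steps))  # 2:30:00
```

The four steps add up to 2 hours, and the buffer adds another 30 minutes, so this example lands on an RTO of 2 hours 30 minutes.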

What are the tools and technologies involved in achieving the RPO and RTO we have set?

To reach your target RPO, you can use any one or a combination of the following.

If your business data is highly valuable and you need a very low RPO,

  • SQL Server Always On availability groups
  • Asynchronous database mirroring
  • Move to the cloud, such as Azure or Amazon AWS

If you are comfortable with losing some data,

  • Find your current RPO with sp_BlitzBackups, an open source DBA script
  • Follow my backup plan article, Backup SQL Database like Batman, to plan a proper backup strategy
  • Maintain a backup maintenance script; you can try Ola Hallengren's automated scripts
  • Test your backup script, and I am damn serious about it.
  • Implement proper notification for backup failures. You can set this up in an SSMS maintenance plan so that the responsible people are notified whenever a backup fails.
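Besides alerting on failed backup jobs, it can help to alert on backups that are simply too old for your RPO. Here is a minimal sketch of that check in Python. It assumes you have already collected the last backup finish time per database (for example from your monitoring data); the database names, timestamps, and the `stale_backups` function are hypothetical.

```python
from datetime import datetime, timedelta


def stale_backups(last_backup_by_db, now, rpo):
    """Return the databases whose newest backup is older than the RPO.

    `last_backup_by_db` maps a database name to the finish time of its
    most recent backup, or None if it has never been backed up.
    The result maps each offending database to its backup age
    (None when no backup exists at all).
    """
    stale = {}
    for db, finished in last_backup_by_db.items():
        age = None if finished is None else now - finished
        if age is None or age > rpo:
            stale[db] = age
    return stale


# Hypothetical inventory (assumption, not real data).
now = datetime(2024, 1, 1, 12, 0)
last_backups = {
    "Sales": datetime(2024, 1, 1, 11, 30),
    "Inventory": datetime(2024, 1, 1, 2, 0),
    "Archive": None,
}

for db, age in stale_backups(last_backups, now, timedelta(hours=1)).items():
    print(f"ALERT: {db} backup age is {age if age is not None else 'unknown'}")
```

With a 1-hour RPO, only Sales passes; Inventory is 10 hours behind and Archive has no backup at all, so both would trigger a notification.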

How much cost is involved?

When you aim for near-zero RPO and RTO, it’s going to cost you more. Always On availability groups can run in synchronous-commit mode, so whenever the primary node fails, you can fail over to a secondary node right away. SQL Server 2019 supports up to 5 synchronous replicas, which is cool, since SQL Server 2017 supported 3. With Always On, you need to point your app connection string at the Always On listener; see how easy it is to set up a connection string for an Availability Group.

My point is, show your management how much the tools are going to cost if you want to achieve a good RPO and RTO, so that they can decide whether to spend more money or accept losing more data.
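One way to frame that conversation is to list your options with the RPO and RTO each can deliver and its cost, then show which is the cheapest one that meets the targets. The sketch below does that in Python. Every number in it is a made-up placeholder, not a real price or a measured recovery time, and the `Option` class and `cheapest_option` function are hypothetical helpers.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass
class Option:
    name: str
    rpo: timedelta       # worst-case data loss this option allows
    rto: timedelta       # expected time to get back online
    yearly_cost: float   # placeholder figure, in whatever currency you use


def cheapest_option(options, rpo_target, rto_target) -> Optional[Option]:
    """Return the cheapest option meeting both targets, or None."""
    candidates = [
        o for o in options
        if o.rpo <= rpo_target and o.rto <= rto_target
    ]
    return min(candidates, key=lambda o: o.yearly_cost, default=None)


# Placeholder numbers for illustration only (assumptions, not quotes).
options = [
    Option("Nightly full backups", timedelta(hours=24), timedelta(hours=4), 100),
    Option("Full + 15-minute log backups", timedelta(minutes=15), timedelta(hours=2), 300),
    Option("Synchronous availability group", timedelta(0), timedelta(minutes=1), 2000),
]

choice = cheapest_option(options, timedelta(hours=1), timedelta(hours=3))
print(choice.name if choice else "Nothing meets the targets")
# Full + 15-minute log backups
```

When stakeholders insist on zero data loss, rerun it with an RPO of `timedelta(0)` and show them the price of the only option left.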

Automate Backup and Restores

  • First, use sp_BlitzBackups to see where you currently stand
  • Use Ola Hallengren's maintenance scripts to automate your backups
  • Use dbatools.io to script backups and restores, including test restores, if you are comfortable with PowerShell.
