
6 Letters for SQL Disaster Emergencies: RPO and RTO

 


As I discussed previously, the most important thing for any database is the database itself. To secure the data properly, you need to back it up regularly with a proper backup plan.

Let’s say you hit a disaster and the last good backup is 10 hours old. You are going to lose 10 hours’ worth of data, even if you can restore it instantly. How much data that actually is depends on which 10 hours you are dealing with: your busiest business hours, or a relatively quiet period? The maximum loss the business is willing to accept, measured in time, is what we call the Recovery Point Objective (RPO).

Now consider restoring that 10-hour-old backup. How much time do you need to get hold of the last good backup? How long does the restore take? How long does it take to change the app connection string and other configuration? In short, how long will the end users wait before they can access the database again? The target for that total is what we call the Recovery Time Objective (RTO).

RPO and RTO are two very important topics for any Database Management System. Setting RPO and RTO poorly can lead to devastating business situations. Today we will try to understand RPO and RTO by answering some questions.

  • What are RPO and RTO?
  • How can we set RPO and RTO for our business?
  • What are the technologies and tools involved in achieving the RPO and RTO we have set?
  • How much cost is involved here?
  • How can we automate our backups and more importantly, restores properly?

Hang on tight; we are going to cover them one by one.

What are RPO and RTO?

RPO

The recovery point objective is how much data, at most, you can afford to lose if you hit a disaster. We often think it's measured in megabytes. No. It's measured in time. If your last good backup is 10 hours old, you stand to lose 10 hours of data, so an RPO of 10 hours is the best you can currently meet. The actual volume of data lost depends on when the disaster takes place.

RTO

The recovery time objective is how much time you can afford to spend restoring the last good backup and bringing your database back online. It's also measured in time.
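To make the "measured in time" point concrete, here is a minimal sketch that computes the data-loss window between the last good backup and the disaster and compares it to an RPO target. The function names, timestamps, and the 4-hour target are all made up for illustration.

```python
from datetime import datetime, timedelta


def data_loss_window(last_good_backup: datetime, disaster_time: datetime) -> timedelta:
    """Return the span of changes that cannot be recovered from the last good backup."""
    if disaster_time < last_good_backup:
        raise ValueError("disaster_time must not be earlier than last_good_backup")
    return disaster_time - last_good_backup


def meets_rpo(last_good_backup: datetime, disaster_time: datetime,
              rpo_target: timedelta) -> bool:
    """True when the data-loss window stays within the agreed RPO target."""
    return data_loss_window(last_good_backup, disaster_time) <= rpo_target


# Hypothetical timestamps: backup finished at 02:00, disaster hits at 12:00.
last_backup = datetime(2024, 1, 15, 2, 0)
disaster = datetime(2024, 1, 15, 12, 0)

print(data_loss_window(last_backup, disaster))                # 10:00:00
print(meets_rpo(last_backup, disaster, timedelta(hours=4)))   # False
```

Notice that nothing in the calculation knows about megabytes. The same 10-hour window can hold very different amounts of data depending on which hours it covers.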

How can we set RPO and RTO for our business?

RPO

You need to ask yourself how much data your business is comfortable losing in case of a disaster. This is really something to discuss with the business stakeholders, but you should settle on a figure yourself before discussing it with them. I am warning you: your business stakeholders are going to say that they don't want to lose any data at all. Hold on tight; I am going to show you how to manage your stakeholders so that they either agree to lose some data (which means declining to spend more) or agree to spend more (which means declining to lose data). More on that in a later section.

RTO

You can calculate RTO by answering the following questions and summing the results:

  • How much time to get access to the last good backup?
  • How much time to move the backup to the destination server?
  • How much time to restore the database?
  • How much time to make app configuration changes?

Sum all these times and allow some buffer, since you never know what other challenges you will face while bringing the database back online. It's hugely stressful, trust me, since the end users are waiting for you.
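The sum-plus-buffer calculation above can be sketched in a few lines. The step durations and the 25% buffer are hypothetical numbers, not measurements; plug in the timings from your own restore drills.

```python
from datetime import timedelta
from typing import Mapping


def estimate_rto(step_durations: Mapping[str, timedelta],
                 buffer_ratio: float = 0.25) -> timedelta:
    """Sum the recovery steps and add a proportional safety buffer."""
    if buffer_ratio < 0:
        raise ValueError("buffer_ratio must be non-negative")
    total = sum(step_durations.values(), timedelta())
    return total + total * buffer_ratio


# Hypothetical timings for the four questions in the list above.
steps = {
    "get access to the last good backup": timedelta(minutes=15),
    "move the backup to the destination server": timedelta(minutes=30),
    "restore the database": timedelta(minutes=45),
    "make app configuration changes": timedelta(minutes=10),
}

print(estimate_rto(steps))  # 2:05:00
```

Here the steps add up to 100 minutes and the buffer adds another 25, giving an RTO estimate of 2 hours and 5 minutes.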

What are the tools and technologies involved in achieving the RPO and RTO we have set?

For optimal RPO, you can use any one or a combination of the following.

If your business data is highly valuable and you need a very low RPO:

  • SQL Server Always On availability groups
  • Asynchronous database mirroring (low but not zero data loss; note that Microsoft has deprecated database mirroring in favor of availability groups)
  • Move to the cloud, such as Azure or Amazon AWS

If you are comfortable with losing some data:

  • Find your current RPO with sp_BlitzBackups, an open-source DBA script
  • Follow my backup plan article, Backup SQL Database like Batman, to plan a proper backup strategy
  • Maintain a backup maintenance script; you can try Ola Hallengren's automation scripts
  • Test your backup script, and I am damn serious about it.
  • Implement proper notification for backup failures. You can set this up in an SSMS maintenance plan so that the responsible people are notified whenever a backup fails.
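As a rough illustration of the notification idea, the sketch below flags databases whose last backup is older than the RPO target, or that have no backup at all. It assumes you have already collected the last backup finish time per database (for example from the backup history in msdb); the database names, timestamps, and one-hour target are hypothetical, and actually sending the alert is left out.

```python
from datetime import datetime, timedelta
from typing import List, Mapping, Optional


def databases_breaching_rpo(last_backups: Mapping[str, Optional[datetime]],
                            now: datetime,
                            rpo_target: timedelta) -> List[str]:
    """Return the names of databases whose newest backup is missing or too old."""
    breaching = []
    for name, finished_at in last_backups.items():
        # No backup at all is the worst case, so always report it.
        if finished_at is None or now - finished_at > rpo_target:
            breaching.append(name)
    return sorted(breaching)


# Hypothetical inventory: database name -> last backup finish time.
inventory = {
    "Sales": datetime(2024, 1, 15, 11, 30),
    "HR": datetime(2024, 1, 15, 2, 0),
    "Archive": None,
}

alerts = databases_breaching_rpo(inventory,
                                 now=datetime(2024, 1, 15, 12, 0),
                                 rpo_target=timedelta(hours=1))
print(alerts)  # ['Archive', 'HR']
```

A check like this catches the silent case too: a backup job that never ran raises no failure notification, but the backup age keeps growing.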

How much cost is involved?

When you aim for near-zero RPO and RTO, it’s going to cost you more. Always On availability groups can run in synchronous-commit mode, so that whenever the primary node fails, you fail over to a synchronized secondary node almost instantly. SQL Server 2019 launched with support for 5 synchronous replicas, which is cool, since SQL Server 2017 supported 3. With Always On, you need to point your app connection string at the availability group listener; see how easy it is to set up a connection string on an Availability Group.

My point is, show your management how much the tools are going to cost if you want to achieve a good RPO and RTO, so that they can decide either to spend more money or lose more data.

Automate Backup and Restores

  • First, use sp_BlitzBackups to see where you currently stand
  • Use Ola Hallengren's maintenance scripts to schedule and maintain backups
  • Use dbatools.io to do the same things, and to automate restores, if you are comfortable with PowerShell.
