Over the last few weeks I received a lot of questions from colleagues and customers about backup and disaster recovery in the new Exchange Server 2013. These questions showed that organizations still have a rather dated understanding of backup and recovery. Customers still want item-level backup, even though their data usage keeps growing.
So I thought this was a good opportunity to write an article about backup and disaster recovery (DR) with Exchange Server 2013 (Exchange).
Introduction
First of all, you can divide backup into two main concerns:
- You'll probably need backup to perform a point-in-time restore of a single item or a complete mailbox.
- In any enterprise production environment you'll need a solution that lets you recover your data in case of an emergency.
In the old days, the answer to the first concern in Exchange was to buy and implement a backup solution that provided single item backup and recovery. This feature enabled the IT organization within a company to restore one or more items into a user's mailbox in case the user accidentally deleted them.
The demand for this solution was high, so everybody implemented it, and it performed well within the requirements. However, a few years ago the volume of mail data began to grow while backup windows began to shrink because of hypes like "The new way to work" and/or "Work/Life integration". These hypes created more flexible working hours and therefore shorter backup windows. Users also kept their e-mail in their mailbox until the end of time.
These developments made it a challenge for IT organizations to back up mail data within the time available.
As the years went by, Microsoft optimized its database structure and implemented new features in Exchange to cope with these problems. This resulted in even bigger mailbox databases, but the mindset of organizations concerning the backup of mail data did not change. Even today customers want single item backup in their Exchange environment. And when you ask them how many times they used this functionality in the past year, they can't give you a real answer.
The second concern is how to cope with an outage or emergency and get your data back (Disaster Recovery or Emergency Recovery). To describe this concern I'll give you a short explanation of DR.
DR can best be divided into two objectives:
- RPO (Recovery Point Objective) and
- RTO (Recovery Time Objective).
RPO
RPO is the maximum tolerable period in which data might be lost from an IT service due to a major incident. In other words, how much data (measured in time) is an acceptable loss in case of an emergency?
RTO
RTO is the duration of time in which a business process must be restored after a disaster (or disruption) in order to avoid unacceptable consequences associated with a break in business continuity. In other words, how quickly do the services need to be restored in case of an emergency?
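To make the two objectives concrete, here is a minimal sketch that checks a single incident against an RPO and an RTO. The function name, the timestamps and the four-hour targets are made up for illustration; they are not taken from any Exchange tool.

```python
from datetime import datetime, timedelta

def meets_objectives(last_good_copy, incident, service_restored, rpo, rto):
    """Return (rpo_met, rto_met) for one incident (hypothetical helper)."""
    data_loss = incident - last_good_copy    # data lost, measured in time
    downtime = service_restored - incident   # how long the service was down
    return data_loss <= rpo, downtime <= rto

# Hypothetical incident: last backup at 02:00, failure at 11:30, service back at 15:00.
result = meets_objectives(
    last_good_copy=datetime(2013, 7, 1, 2, 0),
    incident=datetime(2013, 7, 1, 11, 30),
    service_restored=datetime(2013, 7, 1, 15, 0),
    rpo=timedelta(hours=4),
    rto=timedelta(hours=4),
)
print(result)  # (False, True): 9.5 hours of data lost, 3.5 hours of downtime
```

In this example a nightly backup alone cannot meet a four-hour RPO, even though the service itself is restored in time.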
So how does this all relate to Exchange Server 2013? I will try to explain this in the following paragraphs.
Backing up Exchange Server 2013
Third party backup solutions
At the time of writing, third-party backup solutions offer only limited support for backing up Exchange Server 2013. The following table lists the most common ("enterprise ready") backup solutions and their support for Exchange Server 2013.
Note: According to Microsoft, all backup solutions need to use the Volume Shadow Copy Service (VSS) in order to create a successful and consistent backup. For more information about these requirements, see Microsoft's documentation.
| # | Solution | Supported? | Level |
|---|---|---|---|
| 1 | Symantec NetBackup | Support from version 7.5.0.6 | Database |
| 2 | Symantec Backup Exec | Support from version 2012 Service Pack 2 | N/A |
| 3 | NetApp SnapManager | Supported in version 7 and higher | Database |
| 4 | CommVault | Supported in version 9 and higher | Database |
| 5 | Veeam | No support. Support is planned for version 7. Release date unknown | N/A |
| 6 | HP Data Protector | Support from version 8 | Database |
| 7 | EMC Avamar/NetWorker | No support | N/A |
| 8 | IBM Tivoli Storage Manager | No support | N/A |
As you can see, there isn't much support from third-party products for Exchange Server 2013 yet. Why suppliers of backup software don't have a solution yet is unclear. But the question is: is this a potential problem when you want your organization to move forward with implementing Exchange Server 2013? Personally, I think not. Better said, I don't think you'll need a third-party backup solution at all! And why is that, you say?
Well, the explanation is pretty simple. Provided you design it properly, all the features needed to address both backup concerns are built into Exchange. In the next paragraphs I will go deeper into this, so keep on reading ;).
Exchange Item Restore
When you ask your customers or the management of your organization whether it is really necessary to get single items back from backup after a user error, they will probably say yes. But if you ask them how far back in time, most of the time they don't have a direct answer. If you then ask whether they would be comfortable with a restore period of, let's say, one month for recoverable items, they will probably say that is fine. Keep in mind that restoring single items from backup has its limitations: item-level backup (not yet possible in combination with Exchange Server 2013) brings long backup times and probably a loss of performance.
Exchange, however, has the ability to keep deleted items for a specific period of time. This is called deleted item retention. By default, all deleted items (that is, items that have been removed from the user's "Deleted Items" folder) are kept for 14 days. This means that users are able to restore them themselves from within Outlook during those 14 days.
So to fulfill the need to restore single items, you can simply use or extend the retention period for recoverable items. This is configured on the mailbox database. Keep in mind, however, that you'll need to account for this in your mailbox storage design.
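As a rough illustration of that storage calculation, here is a minimal sketch that estimates the space taken up by recoverable items. It assumes a simple linear model (every mailbox deletes the same amount of mail per day) and the input numbers are hypothetical; a real design should rely on measured values and proper sizing tools.

```python
def recoverable_items_overhead_gb(mailboxes, deleted_mb_per_mailbox_per_day, retention_days):
    """Estimate storage (in GB) held by recoverable items.

    Assumption: deletions are spread evenly, so the overhead grows
    linearly with the retention period.
    """
    total_mb = mailboxes * deleted_mb_per_mailbox_per_day * retention_days
    return total_mb / 1024

# Hypothetical environment: 1000 mailboxes, 10 MB of deleted mail per mailbox per day.
default_14_days = recoverable_items_overhead_gb(1000, 10, 14)
extended_30_days = recoverable_items_overhead_gb(1000, 10, 30)
print(f"14 days: {default_14_days:.1f} GB, 30 days: {extended_30_days:.1f} GB")
print(f"extra storage needed: {extended_30_days - default_14_days:.1f} GB")
```

Remember that this overhead applies to every copy of the database, so multiply it by the number of copies you keep.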
The advantages of this approach are numerous:
- It saves you the time needed to back up single items with backup software;
- It saves you storage in case of snapshot backups at storage level;
- It saves you storage on your backup tapes;
- It saves your IT helpdesk the burden of answering calls about restoring single items;
- And last but probably most important, users don't have to call the IT department anymore. They can do it themselves! And that means one step forward in not pissing off users ;).
Exchange HA and Site Resiliency
Great! And what about Disaster Recovery, I hear you say? Well, Exchange has a built-in solution for that too. It requires careful thought about your design, so I will only describe the features and technologies needed to achieve the goal.
Since Exchange Server 2010 there is a new thing called Database Availability Groups, or DAGs. DAGs are the successor of the pain-in-the-ass Cluster Continuous Replication (CCR), which was available in Exchange Server 2007. In Exchange Server 2013 the use of DAGs is continued and improved. With a DAG you can create highly available passive copies of your mailbox databases across up to 16 Exchange Mailbox servers. The advantage of a DAG is that (although Microsoft Cluster Services is still used in the background) the configuration is relatively simple. You will, however, need extra storage for every copy of a database. It is also possible to spread a DAG over separate data centers to ensure services remain available and data loss is kept to a minimum. This tackles your direct HA requirement.
But what if, for whatever reason, your active database gets corrupted? Are your passive copies then affected as well? Uhhh, yes, they probably are. The reason is that each copy of an active database in a DAG is kept up to date by transaction log shipping. If corruption is written to a database, the logs that carry it are simply replayed into the copies too.
But don't worry, there is a solution for this, and it is called "lagged copies". In every DAG you can create a lagged copy next to the regular HA copies. A lagged copy simply means that you tell Exchange to insert a lag (a delay in time) before it commits changes to that copy of the database. If data gets corrupted in a database, the lag ensures that the corruption does not immediately end up in the lagged copy.
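The effect of the lag can be sketched with a small simulation. This is only a model of the idea, not how Exchange is implemented: it assumes each transaction log has a creation time and that a copy replays only the logs that are older than its lag. The function name and the times are made up.

```python
from datetime import datetime, timedelta

def logs_to_replay(log_times, now, replay_lag):
    """Return the logs that are old enough to be replayed into a copy."""
    cutoff = now - replay_lag
    return [t for t in log_times if t <= cutoff]

now = datetime(2013, 7, 1, 12, 0)
# Four logs, created 30, 20, 10 and 1 hour(s) ago.
logs = [now - timedelta(hours=h) for h in (30, 20, 10, 1)]

ha_copy = logs_to_replay(logs, now, timedelta(0))           # regular copy, no lag
lagged_copy = logs_to_replay(logs, now, timedelta(days=1))  # copy with a one-day lag

print(len(ha_copy), len(lagged_copy))  # 4 1
```

If the corruption was written ten hours ago, the regular copy has already replayed it, while the lagged copy still only contains the log from 30 hours ago.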
Lagged copies have been around since Exchange Server 2007, and therefore also exist in Exchange Server 2010. However, lagged copies were a bit hard to handle in Exchange Server 2010. Also, if the organization needed a 0 day RPO, this was simply not possible: if all "normal" copies of a database were gone, the logs were gone too, and the mail queue was already empty.
In Exchange Server 2013 this issue is solved by a feature called Safety Net. Safety Net is the successor of the Transport Dumpster and is a layer that is not part of the databases or the DAG. Safety Net holds on to each message (incoming or outgoing mail, for example) until the message has been delivered to all copies of the database in the DAG, including the lagged copy.
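The interplay between Safety Net and a lagged copy can be illustrated with a small model. Note the assumptions: the sketch treats Safety Net as keeping every message for a fixed hold time, and it treats the lagged copy as missing everything delivered within its lag. The function name and all values are hypothetical.

```python
from datetime import datetime, timedelta

def unrecoverable_messages(delivered_at, now, replay_lag, hold_time):
    """Messages missing from the lagged copy that Safety Net no longer holds."""
    # Delivered within the lag window, so not yet replayed into the lagged copy.
    not_in_lagged_copy = [t for t in delivered_at if t > now - replay_lag]
    # Of those, the ones older than the hold time cannot be resubmitted.
    return [t for t in not_in_lagged_copy if t <= now - hold_time]

now = datetime(2013, 7, 1, 12, 0)
# Messages delivered 1, 12, 36 and 100 hours ago.
delivered_at = [now - timedelta(hours=h) for h in (1, 12, 36, 100)]

covered = unrecoverable_messages(delivered_at, now, timedelta(days=1), timedelta(days=2))
gap = unrecoverable_messages(delivered_at, now, timedelta(days=7), timedelta(days=2))

print(len(covered), len(gap))  # 0 1
```

Under these assumptions, nothing is lost as long as the hold time is at least as long as the lag. Once the lag is longer than the hold time, the oldest messages in the lag window can no longer be resubmitted.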
All of this basically means that, without any backup software, you can tackle item-level restore and reach a 0 day RTO and RPO at the same time. Of course your design needs to be right, and you'll need enough data centers and servers to do the job for you.
Accreditations
A special thanks to Martijn Moret (Data Management Consultant at PQR, @MMMoret) for providing the table of backup providers and their support for Exchange Server 2013.
Updates
09-07-2013: Updated support matrix for Symantec NetBackup and HP Data Protector
09-08-2013: Updated support matrix for Symantec Backup Exec