DISASTER RECOVERY
When did you last check the DR Plan?
Business Continuity Planning
By Steve Harcourt, Senior Information Security Consultant of Redstor
Introduction
Steve Harcourt
reflects on events that
may require you to
review your DR plan.
Some of us will get around to it
tomorrow. Some of us did it a while
back and think it’s probably still
up-to-date. A small number of us
checked it today and know that it’s
up-to-date. The DR Plan so often ends
up being the poor relation in the family.
We all know it’s something we should be
reviewing regularly, but that’s not always
what happens in practice.
There are numerous events that
are obvious and clearly require you to
review the DR Plan, such as a building
or data centre move, application and
system migrations or changes to key
third-party service providers. But what
about some of the less obvious events?
Even something that’s apparently quite
trivial may need to be reviewed to
determine if any changes are required
to the DR Plan.
A Technical Scenario
That ageing old Windows Server 2003
machine has finally been replaced with
a shiny new Windows Server 2012. The
critical data was all migrated and the
application still runs, although a new
version of the software was required
for compatibility with Server 2012. So,
what are the implications for DR, and
what questions might you need to answer?
Is the IP address still the same? Is the
backup method still the same? Did you
have some kit earmarked for DR and
will that still be able to cope with the
much larger O/S data size and system
resources such as CPU and RAM?
Hopefully this was something that had
already been considered and factored in
during the commissioning of the kit, but
although you are confident everything is
OK, have you actually been able to find
time to test out your theory since the
new kit was brought online?
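As a rough illustration of the capacity question, the sketch below compares what the new server needs against the kit earmarked for DR. The Spec fields, the dr_shortfalls helper and all of the figures are hypothetical, chosen only to show the shape of the check; real sizing would also need to cover things like storage performance and licensing.

```python
from dataclasses import dataclass


@dataclass
class Spec:
    """Headline resources of a server (hypothetical fields for illustration)."""
    cpu_cores: int
    ram_gb: int
    disk_gb: int


def dr_shortfalls(required: Spec, standby: Spec) -> dict:
    """Return each resource where the standby kit falls short, with the gap."""
    gaps = {}
    for field in ("cpu_cores", "ram_gb", "disk_gb"):
        need = getattr(required, field)
        have = getattr(standby, field)
        if have < need:
            gaps[field] = need - have
    return gaps


# Assumed figures, not taken from any real environment.
new_server = Spec(cpu_cores=8, ram_gb=32, disk_gb=500)
dr_kit = Spec(cpu_cores=4, ram_gb=32, disk_gb=250)

shortfalls = dr_shortfalls(new_server, dr_kit)
print(shortfalls)  # an empty dict would mean the DR kit covers these three resources
```

A check like this is no substitute for an actual test restore onto the DR kit, but it can flag an obvious mismatch as soon as a server is replaced.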
Other environmental issues seem
insignificant at first, but when did
you last check out the emergency
contact numbers for the water, gas and
electricity supplies to the buildings? And
that phone number for the company
who provides mobile generators, is it still
correct? Do you have a copy of the staff
home and mobile phone numbers, and
is that still up-to-date?
Regarding walk-through DR tests,
it is probably fairly easy to think of
scenarios such as the domain controller
dies, or the critical application becomes
corrupted, and then play through the
steps needed to recover those systems,
but what about non-technical scenarios
that impair your ability to support the
business?
A Gas Leak
14 NETCOMMS europe Volume V Issue 5 2015
The local fire department discovers
a gas leak near your office and, as a
precaution, turns off all power to the
area and closes the roads until the
environment is safe.
So, for an hour or two that probably
isn’t going to be a problem, but what
if that was for 3 days? Some of your
systems might be accessible from home
via secure links, or maybe your email
system is already in the cloud. What
about the hardware that your staff uses
to do their job? Do you have staff that
deal with payments and use secure
handheld devices to confirm online
banking transactions? How might your
cash flow be affected if those devices
are unavailable, and have you got a
backup plan in place with the bank?
Do any of your staff rely on specific
software installed on their work
machines, without which they cannot
function? Can you remotely redirect the phone
system to other numbers? And what
about notifying your customers that
there is a problem? Can you get to the
website from an external connection?
Or do you even want it to be public
knowledge that you have a problem?
Recovery Point Objective
You perform a full backup of every
critical server at least once per week and
have incremental backups usually once
per day. You are confident that each
individual server can be restored from
the last backup, but you don’t have a full
DR site in place and have never tested
restoring every server at the same time.
This is not an unusual scenario, but
potentially unanswered questions relate
to RPO and RTO. RPO, your Recovery
Point Objective, may require that you
restore separate, but dependent systems
and servers to the same point in time.
So is it OK that your critical application
server is restored to Tuesday night at
3am, the web server front end is restored
to 12 noon, and the domain controller is
restored from the 6pm backup? Maybe
that’s OK, but has anything changed
in those systems since they were
implemented 5 years ago?
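To put a number on that question, here is a minimal sketch that measures how far apart the restore points of dependent systems are. The times follow the scenario above (3am, 12 noon and 6pm); the calendar date, the assumption that all three fall on the same day, the four-hour tolerance and the function names are all hypothetical.

```python
from datetime import datetime, timedelta


def restore_point_spread(restore_points: dict) -> timedelta:
    """Gap between the oldest and newest restore point in the group."""
    times = restore_points.values()
    return max(times) - min(times)


def is_consistent(restore_points: dict, tolerance: timedelta) -> bool:
    """True if all systems in the group restore to within the tolerance."""
    return restore_point_spread(restore_points) <= tolerance


# Assumed: an arbitrary date, with all three backups taken on the same day.
restore_points = {
    "application server": datetime(2015, 6, 2, 3, 0),
    "web front end": datetime(2015, 6, 2, 12, 0),
    "domain controller": datetime(2015, 6, 2, 18, 0),
}

spread = restore_point_spread(restore_points)
consistent = is_consistent(restore_points, tolerance=timedelta(hours=4))
print(spread, consistent)
```

Whether a given spread is acceptable is a business decision rather than a technical one; the point of the sketch is only that the figure can be calculated and reviewed.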
Have you reviewed your RTO
(Recovery Time Objective) recently?
How quickly does the business need to
get the systems back up and running?
So, maybe you know you can restore
each server in a couple of hours,
but what if you need to restore all
15 servers? And have you reviewed
the priority of each restore and the
dependencies between servers?
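The arithmetic and the ordering problem can both be sketched in a few lines. The example below treats "a couple of hours" as two hours per server (an assumption), estimates the total for 15 servers with a given number of parallel restore streams, and derives a restore order from a dependency map. The server names, the dependency map and the number of streams are hypothetical; the ordering uses Python's standard graphlib module (Python 3.9 or later).

```python
import math
from graphlib import TopologicalSorter


def estimate_hours(servers: int, hours_each: float, streams: int = 1) -> float:
    """Rough total restore time, ignoring dependencies and contention."""
    return math.ceil(servers / streams) * hours_each


def restore_order(depends_on: dict) -> list:
    """Order servers so that each one is restored after its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical map: server -> servers that must be restored before it.
depends_on = {
    "domain controller": set(),
    "database server": {"domain controller"},
    "application server": {"database server"},
    "web front end": {"application server"},
}

order = restore_order(depends_on)
one_at_a_time = estimate_hours(servers=15, hours_each=2)
three_streams = estimate_hours(servers=15, hours_each=2, streams=3)
print(order)
print(one_at_a_time, three_streams)
```

Restoring one server at a time, 15 servers at two hours each comes to 30 hours, which may well exceed the RTO the business expects. Real figures will depend on backup media, network throughput and how many restores can genuinely run in parallel.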
Summary
The message really is that if you don’t
have a dedicated person looking after
disaster and business continuity planning,
then try to set some time aside to
check the details. And if you only
perform tests on a subset of your
environment, think about some
non-technical issues too.
www.netcommseurope.com