High availability and disaster recovery

Business continuity management is concerned with everything from the safety of staff to the functioning of the business at the highest level during and after a disaster. Within that hierarchy, disaster recovery supports business continuity management by providing the core technology systems.
As such, disaster recovery deals with actual technology platforms and issues such as the speed of recovery.
Cayman IT consulting and professional services company MCS provides high availability solutions, which ensure minimum system downtime, and disaster recovery solutions. The two can be seen as complementary, explains Chris Eaton, the firm’s vice president of sales and marketing.
“Disaster recovery from an IT infrastructure perspective has to be tied in closely with high availability, because if you build a high availability infrastructure in your live environment, something that is inherently quite resilient, in the event that something happened, it is almost always easier to then recover it in the DR scenario.”
The high availability architecture that a company should have in place already can then help mitigate some of the impact in the event of a disaster.
Providers in the disaster recovery arena divide into practices that offer consultancy and advice and those that deliver solutions. MCS does both, says Eaton, from drawing up a business continuity management plan and identifying the disaster recovery technologies that suit it to executing and providing high availability and disaster recovery solutions.
“So not only do we deliver you with weighty tomes of paper if that is what you need for compliance because there are some regulatory requirements, but we can help you execute the architecture that you would need to effect a proper recovery through our core technology practice.”

Recovery time and recovery point objectives
In a typical business continuity and disaster recovery assignment MCS would first determine a client’s disaster recovery objectives according to two criteria: the recovery time objective, i.e. how quickly systems and data need to be recovered, and the recovery point objective, i.e. to what point in the past data needs to be recovered. In other words, the recovery point objective defines the acceptable amount of data loss, measured in time.
All of a client’s systems are then categorised against these two questions. For email, for example: how long can the business operate without it, and how old can the recovered emails be for staff still to be able to work? Some systems may not tolerate any interruption at all.
The high availability and disaster recovery architecture that needs to be put in place is then determined by those two factors, as well as the available budget and the cost of the technology required.
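The categorisation described above can be pictured as a short list of systems, each with its two objectives, ordered from the least to the most tolerant of interruption. The following Python sketch is illustrative only: the system names and hour values are hypothetical and are not taken from any MCS assignment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemObjectives:
    name: str
    rto_hours: float  # recovery time objective: how long the business can run without the system
    rpo_hours: float  # recovery point objective: acceptable data loss, measured in time


# Hypothetical figures for illustration only.
SYSTEMS = [
    SystemObjectives("email", rto_hours=4, rpo_hours=1),
    SystemObjectives("trading platform", rto_hours=0, rpo_hours=0),
    SystemObjectives("document archive", rto_hours=72, rpo_hours=24),
]


def rank_by_criticality(systems):
    """Order systems from least to most tolerant of downtime, then of data loss."""
    return sorted(systems, key=lambda s: (s.rto_hours, s.rpo_hours))


if __name__ == "__main__":
    for s in rank_by_criticality(SYSTEMS):
        print(f"{s.name}: RTO {s.rto_hours}h, RPO {s.rpo_hours}h")
```

A system with both objectives at zero, such as the hypothetical trading platform here, is one that cannot tolerate any interruption at all and therefore drives the most demanding part of the architecture.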

Technology solutions
MCS has experience with a wide range of best practice solutions that are then selected and combined to fit the client’s individual needs and objectives.
“It is unlikely that someone is going to come to us and ask us for something we have never done before,” says Eaton.
“You start to build up best practice technologies and they are not cookie cutter solutions, but we are not reinventing the wheel for every customer.”
When selecting a supplier of disaster recovery solutions, says Eaton, it is important for clients to ensure that they work with someone who is authorised by the industry leader, as is the case with MCS.
He mentions a solution by NSI called Double-Take, which replicates the changed data from servers in the live environment to the DR environment without taking up too much bandwidth.
“The key is that bandwidth is expensive where we live, so anything that keeps the amount of bandwidth usage down, but enables you to synchronise data replication so that the information that is in the live environment is pretty much the same as the data that is in the DR environment is good to have.”
What companies are really interested in is protecting data. Often storage area networking (SAN), an architecture that links remote computer storage devices to servers, can be a simple solution, but one that used to come at a cost that was out of reach for many companies. However, the cost and complexity of SAN have decreased considerably in recent years.
Depending on the existing high availability live infrastructure, which can range from less efficient but cheaper direct attached storage to the more mature technology of network attached storage and storage area networking, SAN may be the right solution in a DR environment.
MCS is working with Compellent, whose solution is very scalable, says Eaton, because it can work with both iSCSI, an IP-based storage networking standard for linking data storage facilities, and Fibre Channel based storage networking. Another vendor MCS has good experience with is Hewlett-Packard.
“But we are trying to be vendor agnostic,” says Eaton. “We will work with whomever to provide the advice that people need. We do act as a trusted advisor for clients and if they come to us with a problem statement we will give them vendor agnostic recommendation on what is the best fit for them, accepting that we know what best practice is in the industry.”

Another factor influencing technology selection is bandwidth. “It might be a good goal to have storage to storage replication, buckets of data replicating to each other, rather than complicated servers,” says Eaton, “but if the cost of bandwidth is enormously high to support the volumes of data that you are producing, then that is going to become a crippling solution from a budgetary perspective.”
The solution for this could be bandwidth optimisation as provided by Riverbed, for whom MCS is an accredited regional partner. Riverbed offers a bandwidth compression solution which could make a real difference in terms of bandwidth cost.
“If you can get a 3:1 compression on a 4 Meg pipe you might be able to get a 12 Meg performance out of it. If you are paying what is the going rate for a 4 Meg pipe that can actually look quite attractive.”
It may also influence whether a company chooses a warm or hot environment rather than a cold site option.
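Eaton’s compression arithmetic can be written out as a rough sketch. It assumes that “Meg” means megabits per second and that the compression ratio holds for all traffic, which in practice depends on the data. The function names and the 10 GB daily change volume are hypothetical and chosen only for illustration.

```python
def effective_bandwidth_mbps(link_mbps: float, compression_ratio: float) -> float:
    """Apparent throughput of a link when traffic compresses at the given ratio."""
    if link_mbps <= 0 or compression_ratio < 1:
        raise ValueError("link speed must be positive and ratio at least 1")
    return link_mbps * compression_ratio


def transfer_hours(data_gb: float, link_mbps: float, compression_ratio: float = 1.0) -> float:
    """Hours needed to replicate data_gb (decimal gigabytes) over the link."""
    megabits = data_gb * 8 * 1000  # 1 GB = 1000 MB = 8000 megabits
    seconds = megabits / effective_bandwidth_mbps(link_mbps, compression_ratio)
    return seconds / 3600


if __name__ == "__main__":
    # The example from the quote: 3:1 compression on a 4 Meg pipe.
    print(effective_bandwidth_mbps(4, 3))
    # Hypothetical 10 GB of changed data per day, without and with compression.
    print(round(transfer_hours(10, 4), 2))
    print(round(transfer_hours(10, 4, 3), 2))
```

Under these assumptions the same volume of changed data replicates in a third of the time, or, put the other way round, a company can stay on the cheaper link rather than pay for a faster one.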

Cold, warm, hot
In simple terms, a disaster recovery environment used to be, from an infrastructure point of view, just a set of servers ready to be turned on, explains Eaton. The company would then turn up at the DR site with back-up tapes that would, it was hoped, contain a reasonable copy of the data from the live environment. The problem with this approach is that a relatively high percentage of back-up tapes fail.
“And if we are dealing with mission critical data, trying to recover a business, the last thing you want is a tape failure to trip you up,” says Eaton.
In contrast to this cold environment, a hot DR environment is fully synchronised with the live environment: its servers are up and running and fully accessible, but housed in a safe location.
Anything that is written in the live environment is directly updated in the DR environment.
The so-called warm DR environment might consist of stand-by servers with yesterday’s or last week’s data on them. It can also mean servers that are turned on now and again. The type of recovery scenario again depends entirely on the requirements of the client and the available budget.
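The relationship between the three site types and the acceptable data loss can be sketched as a simple rule of thumb. This is a minimal illustration and not MCS’s method: the thresholds of one hour and one week are hypothetical, and a real decision would also weigh the recovery time objective and the budget.

```python
from enum import Enum


class SiteType(Enum):
    COLD = "cold"  # servers switched on only after a disaster, restored from back-ups
    WARM = "warm"  # stand-by servers holding data that is a day or a week old
    HOT = "hot"    # running servers kept synchronised with the live environment


def suggest_site_type(
    rpo_hours: float,
    hot_max_rpo_hours: float = 1.0,     # hypothetical threshold
    warm_max_rpo_hours: float = 168.0,  # hypothetical threshold: one week
) -> SiteType:
    """Suggest a site type from the acceptable data loss, measured in hours."""
    if rpo_hours < 0:
        raise ValueError("rpo_hours must not be negative")
    if rpo_hours <= hot_max_rpo_hours:
        return SiteType.HOT
    if rpo_hours <= warm_max_rpo_hours:
        return SiteType.WARM
    return SiteType.COLD


if __name__ == "__main__":
    for hours in (0, 24, 720):
        print(hours, suggest_site_type(hours).value)
```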

Location of the DR site
Where this cold, warm or hot recovery site is located is also something MCS can advise on.
There may be legal reasons why a company would not want to locate a recovery site in a third-party data centre in the US, as the bar that has to be crossed in terms of the legal process to get access to the data is not particularly high there. A number of MCS clients have therefore opted for Canada, explains Eaton.
“We advise clients to perform their own due diligence, we are not lawyers, but we can say in our experience other organisations with similar requirements have made these choices.”
MCS works actively with a number of data centres in Europe, particularly in Ireland, the Channel Islands and Switzerland, as well as in Halifax, Nova Scotia, and Curaçao. Eaton emphasises that the firm does not earn any income from its recommendations, so people can be confident that they are getting consultative advice.
“We would only recommend data centres that we have physically seen and inspected and that comply with industry standards,” he adds.
Whether a data centre should be based on or off island depends on the scenario and the risk that a company is mitigating. If a company is mitigating the risk of a local fire, the DR site does not need to be based off island, but some organisations operate in an environment where they recognise that they must have a recovery scenario in place that takes their data outside of the auspices of Cayman, says Eaton.
However, to base a data centre off island, a financial services company should also take appropriate advice from CIMA to ensure that it complies with any local legislation, he advises.

Client access
Finally, the disaster recovery environment must provide for client access. There are remote access solutions by Microsoft that MCS has experience with, but the IT consultancy is also certified to deploy Citrix based solutions, which provide low bandwidth access to what effectively looks and feels like the live office environment.
“At MCS one of the things that we are very fortunate with is that we have had really low staff turnover and so we still have a number of consultants that are with us today, who were with us when Ivan struck in 2004,” says Eaton. “One of the take-aways from that experience is that there is nothing like actually having actioned real recoveries in a real disaster.”
Hurricane Ivan has also brought home that one has to architect for high availability and DR, he concludes. It is no longer just a tick in the compliance box.