Tag Archives: contingency

Cyber and terrorist attacks currently appear to dominate Business Continuity (BC) thinking, but over the weekend we had a classic example of a good old-fashioned failure of a critical IT system causing major disruption, with poor incident management then compounding the problem. The company involved was British Airways (BA), and I say poor incident management because this is what the public has perceived and what BA customers experienced. No doubt there will be an internal BA investigation into what went wrong, but as a BC professional I’d love to know about three aspects of the incident and BA’s response:

  1. How long did it take from the initial failure of the system for the IT support technicians to realise that they were dealing with a major incident, who did they escalate the incident to (if anyone), were the people designated to handle major incidents contactable, and was the problem compounded by the fact that BA’s IT had been outsourced to India?
  2. The system that failed is so critical to BA’s operations that it must have had a Recovery Time Objective (RTO) of minutes, or at worst, a couple of hours. To achieve this, BA should have put in place a duplicate live version of the system (Active/Active). Either BA did not have such a recovery option in place (I’m guessing that they had a replica – Active/Passive), which implies that they failed to understand the need to have a very short downtime on the system, or it had not been properly tested and failed when required.
  3. Why were the communications with customers (people who were booked on BA flights) handled so badly? BA must have a plan to communicate with passengers, but was this dependent on the very system that failed?

For me, even before the inquest takes place, the major lesson to be learned is that the effectiveness of an organisation’s BC and incident response plans can only be assured by actually using the plans and responding to incidents. If you don’t want to find this out in response to a real incident, then you need to run realistic and regular exercises so that every aspect of your response is tested and the people involved know what to do. It doesn’t matter how good your Business Continuity Management (BCM) process is, how closely aligned to ISO 22301 it is, how good the result of the latest BC audit, or how much documentation you have. It’s your ability to respond effectively and recover in time that matters.

BA have suffered damage to their reputation; how much is yet to be seen. They will have suffered financial damage, and when the London Stock Market opens for trading we’ll see how much it has affected their share price. Maybe BA do run realistic and regular exercises. If they do, they should have identified the issues with the systems and incident response that were encountered over the weekend and acted on the lessons learned.

As most people are only too well aware, the way that we find and use information is going through a radical and fundamental change, which is being driven by the Internet. What doesn’t seem to have permeated the world of Business Continuity though, is that this change is revolutionising the Business Continuity Plan.

Not too many years ago, in our house, we used to keep a telephone directory and combined bus and train timetable near our front door, close to where we had our telephone. Today, we have neither of those things, and if we want to find a telephone number or the time of a bus or train we’ll simply use the Internet, and rapidly find what we’re looking for without wading through pages and pages of small print trying to decipher how the directory or timetable is organised before getting to the information that we want. We also had the depressing problem of finding out later on that we’d looked up the information in a document that was out of date, and that one of the family had inadvertently thrown away the new version and kept the old one.

Telephone directories and timetables are just two examples of documents that are being used by fewer and fewer people, and most of those are older people who find it hard to change a lifetime’s habits. Using printed documents to find information is becoming a thing of the past, as anyone who mixes with youngsters will confirm. Why, then, do we persist with documents in the world of Business Continuity? What’s wrong with just finding the information that we need from the Internet?

The problems of document based Business Continuity Plans are only too well known. Unfortunately, more often than not, they are difficult to use in a crisis, contain unnecessary information, and are out of date. What we really need is something that is simple to use, delivers exactly what is required, and provides the latest information. That is an App.

An App is short for an Application, and is quite simply a piece of software designed to fulfil a particular purpose, and is downloaded by a user to a computing device from which it can be used. Apps can be used to obtain information, and when designed to provide the information required to respond to an incident, they are an ideal and powerful tool.

Don’t make the mistake of thinking that holding a Business Continuity Plan as a PDF document and making it available on the Internet via an App is the same thing as an App designed to enable someone to respond to an incident; it’s not. You don’t look up the time of a train on the Internet by opening up a PDF document and searching through it, do you?

A Business Continuity App can provide responders with clear, action-orientated, and time-based direction, while allowing quick access to relevant and up-to-date support information. Exactly what we want to achieve.

This revolution has profound consequences for the world of Business Continuity, and if you’d like to find out what these are, then come and listen to me present at the BCI World Conference and Exhibition in November. The Business Continuity Plan, as a document, is dead; long live the Business Continuity App.

Despite my best efforts, I’m still unable to kill off the myth about “80% of companies without recovery plans failing within 18 months of having a disaster”. The myth comes in many statistical guises, and the latest example appears in a white paper from AVG, the online security company, which contains the quote from Touche Ross “The survival rate for companies without a disaster recovery plan is less than 10%”.

Depressingly, this quote is used by a large number of organisations that should know better, and is usually stated in the format “A Touche Ross study found that the survival rate for companies without a disaster recovery plan is less than 10%”. I have tried very hard to find this Touche Ross study, but to no avail. Touche Ross has not existed as a separate company since 1989 when it became Deloitte Touche, so this is hardly a recent study, even if it actually exists.

I have searched the Deloitte web site and cannot find any reference to the study in question, and have now made contact with Deloitte to ask if they can try and find the study, and whether or not they stand by the quote. Watch this space!

Finally, there is real concrete evidence that an organisation’s ability to recover is central to its immediate survival. Not its ability to recover after an incident, but its ability to demonstrate its recovery capability as perceived by others before any incident occurs. Business Continuity is now firmly centre stage.

According to The Times, senior UK government officials “want the Co-operative Bank to be sold to a bigger player that could stabilise its IT system, which is feared to be so precarious that the bank could not cope with a serious problem.” For years I’ve been telling senior executives that not being able to demonstrate the existence of credible and tested Business Continuity arrangements could mean the difference between survival and failure, and now I can point to a real example. Business Continuity is not just for use in response to an incident – it must be demonstrable to interested parties well before any incident takes place.

Apparently, in the risk factors disclosed in its annual report, the Co-operative Bank has stated that “whilst a basic level of resilience to a significant data outage is in place, the bank does not currently have a proven end-to-end disaster recovery capability”. How many organisations can really, hand on heart, state that they have a proven end-to-end disaster recovery capability? Not that many.

Business Continuity has been practised in the banking industry for more than 25 years, and many of today’s accepted Business Continuity ideas and practices started in banking. Where banking leads in Business Continuity, other industries follow.

How long will it be before organisations in other industries are put at risk because they do not have a proven end-to-end disaster recovery capability?

Another day, another politician who thinks that contingency plans shouldn’t be developed. This time it’s the head of the European Commission, Jean-Claude Juncker, who has told his officials not to work on contingency plans for Greece’s possible exit from the euro. Why? Apparently it’s because the plans could be leaked and cause turmoil in financial markets.

In other words, Europe’s top politician has effectively told everyone that he believes that Business Continuity planning is a dangerous discipline and that Business Continuity Plans should not be developed just in case they are leaked to the media.

Trying to sell the benefits of investing in Business Continuity is hard at the best of times, but now we have Jean-Claude Juncker and his helpful ideas. It’s not as bad as the person who once told me that he didn’t want to develop a Business Continuity Plan as it was tempting fate, but it’s getting close.

The Bank of England has just been heavily criticised in a report by Deloitte into the unprecedented day-long collapse of its Real-Time Gross Settlements system last October. Deloitte found that the Bank’s officials had never rehearsed what would happen in the event of the platform going down for any length of time, and to compound the problem, Deloitte also discovered that the three Bank of England executives with responsibility for the system were all out of the country on the day the outage happened. Not only did the system fail, but the Bank had virtually no crisis management plans in place to deal with the incident.

Unfortunately, in my experience of providing Business Continuity services to a wide variety of organisations over many years, one of the constant themes that I come across is the failure to exercise recovery plans. It’s not a point-blank refusal to run an exercise that’s the problem; instead, it’s the constant postponement that eventually results in the failure to exercise a recovery plan.

All sorts of good reasons are given for postponing an exercise, from the understandable fact that everyone is just too busy at the present time to the ludicrous idea that the recovery shouldn’t be exercised until it is known to work (which came first, the chicken or the egg?). And so it goes on, month after month, year after year, with everyone saying that they intend to run an exercise, but with nobody committing to a date or time.

Don’t get me wrong, I do have clients that do exercise their recovery plans, but they are in a minority and they don’t exercise every plan as often as they should. I’ve tried all sorts of ideas to overcome this problem, but none of them seems to have worked. Is this just a fact of life, or can something really be done to make sure that recovery plans are exercised on a regular basis?

The prevailing view of the Business Continuity (BC) community is that the only benefits of not having a Business Continuity Plan (BCP) are that you’ll be saving a small amount of time and money, but with huge downsides if you ever suffer from an incident that causes major disruption to your operations. But this may have to be revised as a result of a fire at a Dogs’ Home in Manchester in the UK last Thursday evening.

The fire, which was tackled by more than 30 firefighters, was a tragic event that killed about 60 animals. Some 150 dogs were saved, and from all the reports it looks as if the staff did not have a pre-prepared BCP. However, the public rallied round after the Dogs’ Home asked for people to provide temporary foster care for the rescued dogs. Large numbers of people turned up to help, volunteers at the site began collecting dog food, bedding and other items donated by the public, and a JustGiving account set up by the Manchester Evening News raised more than £1.2m. In fact so many people tried to turn up to help that the Cheshire Police tweeted: “High Volume of Vehicles at Cheshire Dogs Home to adopt dogs following the recent tragic fire. Avoid area if travelling.”

Volunteers are saying they have been overwhelmed by the response and that they now have rooms full of dog food, blankets, crates and baskets, and although many members of staff say they’re devastated by the fire, there’s a sense of optimism and comradeship as fosterers turn up to take dogs home.

The net result seems to be that the Dogs’ Home is far better off than if they had had a BCP that clicked seamlessly into operation and hadn’t had to ask for help. So, before you decide to spend time and money on developing a BCP, ask yourself if you should just wait until an incident happens and hope that help and assistance will be provided by the public. Maybe this would only happen in the UK and to a Dogs’ Home. I wouldn’t recommend that a bank tries it!

As a Business Continuity professional, I was very disappointed to learn the other day that a major international organisation has publicly denied that it has a Business Continuity Plan (BCP) for the only product that it provides. Every other major international organisation that I come across is very proud of the fact that they have put in place measures to protect their product and services, and hence the interests of their stakeholders, by developing and maintaining effective BCPs.

And who is this organisation? None other than the International Olympic Committee (IOC). The IOC’s vice president John Coates described Rio’s planning as “the worst I have experienced”, and although the IOC has formed an emergency task force in a bid to bring Rio up to speed, he has denied reports in the London Evening Standard that London organisers had been contacted to see if the facilities built for the successful 2012 Games could be used again in two years’ time should the Brazilian city fail to reach its construction deadlines. “There’s absolutely no plan B,” he said. “There’s just absolutely no alternative of going back to another city. We’ll work through this and we’ll get to Brazil.”

Who needs Business Continuity, eh? Just tell everyone that it won’t happen, and if it does, just work through the problems as they arise and carry on regardless.

I was attending a local Business Continuity Institute (BCI) forum the other day when someone mentioned the fact that there had been a ‘flu pandemic the other year. From a technical world health view this is correct, but from a Business Continuity (BC) perspective in the UK, I believe that this is dangerously misleading. As a consequence, I stated the view that as far as BC professionals are concerned, there was no ‘flu pandemic.

Why do I hold this view? Well, quite simply, the ‘flu pandemic did not cause any more disruption to UK organisations than the ‘flu normally does in any year. In other words, it was a “business as usual” type of disruption, which could be treated by local management as just one of those day to day issues that need to be handled. Yes, I know that lots of organisations, particularly in the public sector, convened weekly meetings of managers to monitor the situation, just in case they needed to invoke their Business Continuity plans (or special ‘Flu Pandemic plans), but the impact of the incident was very small.

It’s a bit like saying that an organisation suffered from a fire just because someone burnt the toast. Yes, technically there was a fire, but it would have been quickly put out, there would have been very little business disruption, and no Business Continuity plans would be invoked. It would be dealt with as a “business as usual” type of disruption.

Does this matter? Well, yes, I think that it does. To talk about ‘flu pandemic in the way that it was being talked about at the BCI meeting implies that there had been a business disruption and that Business Continuity plans had been successfully invoked. There was no significant business disruption, and although ‘flu pandemic teams met, no Business Continuity plans were invoked. In other words, the threat of the ‘flu pandemic was not realised, even though there was, technically, a ‘flu pandemic.

My message is simple. Don’t fool yourself into thinking that your plans dealt with the threat. It didn’t happen.

I have recently been contacted by an organisation that I did some work for more than 5 years ago, when I helped them in the development of an IT Service Continuity Plan. They would like some help in updating the plan.

This all seems very reasonable, until you realise that what they mean is that they would like some help in updating the original plan, in that they have not made any changes to it since I worked on it more than 5 years ago. Either they are an incredibly stable organisation that rarely changes or they simply put the plan on the shelf where it has been gathering dust. I know of at least one change that’s required though, and that is the name of the person responsible for maintaining the plan. Apparently, they left the organisation some time ago, which probably explains why the plan hasn’t been updated.

The interesting thing, though, is why they have suddenly decided to update the plan. I’ll find out soon.