
Monthly Archives: June 2012

How often have you wished that you could undertake a Business Impact Analysis (BIA) quickly, with all the relevant people and decision makers available in one place? For most people, undertaking a BIA is a bit like having a tooth removed very slowly over a period of weeks or months. Very painful.

I’ve been reminded this week that it doesn’t need to be like this, particularly for a small or medium-sized organisation. One of my latest clients is a small manufacturing and service company, and I was able to run a BIA workshop for the top management in an afternoon. This didn’t just cover the strategic level, but went all the way down to the operational detail of which activities needed to be recovered by when, and what resources were needed to recover each activity. We even spent time discussing strategic and tactical options for recovery.

What made this possible was not just the size of the company, but that the senior management were willing to put in the time and effort, and that they knew the operational detail in their areas of responsibility. Such a refreshing change. It would be wonderful if all BIAs were like this, but they’re not. Obviously this is not possible in a large organisation, but the main reasons it isn’t possible in most small to medium-sized organisations are that the senior team won’t spare the time and, more importantly, they don’t seem to know the operational details! Personally, I call that a failure of management.

I’m very pleased that I’ve managed to get my latest client to think about the maximum scale of incident that it wants to plan to survive. The client is a small electronics company that actually decided by itself to implement Business Continuity Management (BCM) rather than being told to. Many organisations shy away from this issue, which makes it difficult to advise on safe separation distances for backups and recovery sites, but my client’s management team understands the issues and will be coming up with an answer.

I think that the factor that will determine the answer is the geographic spread of their staff. If there is some kind of natural or man-made disaster that affects the homes and families of most of the staff, then it is unlikely that they will want to come to work to help out their employer, particularly if their employer is asking them to work a significant distance from their families, who may have been evacuated.

If this is the case then we’re probably talking about surviving an incident with an effective radius of about 30km. Such an incident would require quite a catastrophic and unlikely event, given that the client is nowhere near a nuclear or chemical facility, is well away from the coast, and is not in an earthquake zone or near an active volcano. The most likely widespread event is a river flood, but that doesn’t usually last more than a few weeks in the UK.

The RBS systems failure should become a case study in Business Continuity, but I doubt that it will, as the bank won’t want to advertise not only how it managed to get something so seriously wrong, but also how long it took to fix and what it really cost. Every Business Continuity professional should be interested in this so that they can learn from any mistakes that were made, and see how Business Continuity Plans were used in response to a real disruption.

The first thing that I’m interested in though, is whether or not RBS activated its strategic level Business Continuity Plan, which may be known as an Incident or Crisis Management Plan. Presuming that RBS has such a plan, was it used, or did a group of senior executives just get together and decide what to do without reference to the plan?

Secondly, did the person who first identified that a software upgrade had gone wrong just try and fix it, or did they also escalate the issue up the management chain of command? If so, did it get to the top quickly, or did it stay hidden until the effect of the problem became widely known?

Being a UK taxpayer, I’m a shareholder in RBS. As a shareholder, I’d like RBS to undertake a thorough post incident review and publish the results so that we can all learn from what went wrong.


I was in the process of undertaking a risk assessment exercise for a client of mine when RBS suffered their systems failure the other day. By an amazing coincidence, I was working on the risks to the most urgent activities undertaken by their Finance department, and one of those activities is making payments via BACS. I had identified that this activity was dependent on their bank, RBS, and so was confronted with the very real problem of how to assess the risk of RBS being unable to process payments because of a failure of their systems.

The likelihood of RBS not being able to process payments because of a failure of their systems would normally be rated as very low, but today it’s a racing certainty! Had I undertaken the risk assessment a week ago I would have rated the risk as very low and not worth taking any mitigating measures. Now the risk is very high (certainty multiplied by the impact on my client), so they should consider mitigating action – such as having an alternative bank.
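For illustration only, here’s a minimal sketch of the kind of likelihood-times-impact scoring involved; the scales, ratings, and mitigation threshold below are my own assumptions, not my client’s actual figures:

```python
# Minimal sketch of a qualitative likelihood x impact risk score.
# The scales and the mitigation threshold are illustrative assumptions.

LIKELIHOOD = {"very low": 1, "low": 2, "medium": 3, "high": 4, "certain": 5}
IMPACT = {"minor": 1, "moderate": 2, "serious": 3, "severe": 4}

def risk_score(likelihood: str, impact: str) -> int:
    """Risk is scored as the likelihood rating multiplied by the impact rating."""
    return LIKELIHOOD[likelihood] * IMPACT[impact]

MITIGATION_THRESHOLD = 8  # scores at or above this warrant mitigating action

# A week ago: a bank systems failure looked very unlikely, though the impact would be severe.
before = risk_score("very low", "severe")  # 1 * 4 = 4  -> below threshold, accept the risk
# Today: the failure has actually happened, so the likelihood is a certainty.
after = risk_score("certain", "severe")    # 5 * 4 = 20 -> mitigate, e.g. hold an alternative bank account

print(f"before: {before}, after: {after}")
```

On a scale like that, the very same risk jumps from a score nobody would bother mitigating to one at the top of the register overnight, which is exactly the problem.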

The question is, what does this say about the value of undertaking such risk assessments?

Here’s another new idea for business continuity that’s come from a business continuity course that I’m giving. Instead of just having control and escalation procedures to enable communication up and down the chain of command, why not have a special group sitting alongside the strategic, tactical, and operational teams that has the sole job of handling information up and down the command chain?

This group doesn’t replace the chain of command, it supplements it, making sure that information is getting to the right people in a timely manner. I’ve never thought of this before, but it might be something that’s common practice elsewhere. If it is, does it work, or does it actually start to replace the chain of command and put all communication in the hands of a small group of people? The danger then, of course, is that they become a bottleneck and slow down the flow of vital information. There is also the danger that they could start to censor information, taking on themselves the role of deciding who should get which information.

One of the more interesting ideas that has come out of the training course that I’m giving is to reduce the number of Business Continuity Plans, thereby making the whole process of developing and maintaining plans much simpler.

This idea was put forward by a large UK retailer that has realised that the current situation is just not in keeping with the company’s culture, is out of control, and will never deliver what was originally intended. The idea is to dramatically cut back on the number of plans and to get rid of the bureaucracy surrounding their development, maintenance, and the review process.

The sheer number of plans that need to be kept up to date is one of the many problems that Business Continuity Managers in large organisations face. If this can be overcome it will make their jobs much easier, and will probably result in having plans that are both up to date and known to work. Can it be achieved though? That’s the big question.

The motto of the scouting movement is “Be Prepared”, which could just as well be adopted by the Business Continuity industry. I have, for some time now, been looking for a good “party” definition of Business Continuity, and now I think I’ve found one.

A “party” definition is for when someone you meet in a social setting asks, “What is Business Continuity?” You then have a split second in which to think of an interesting and engaging answer. The best one that I had previously was “Making sure that the product gets to the customer”; now I have “Being Prepared”.

Has anyone else got any pithy definitions?

This week I’m in the Cotswolds, in the UK, giving the Business Continuity Institute’s five-day Good Practice Guidelines (GPG) course. As usual, the delegates all complain that the GPG is a difficult book to read and that, if anything, it’s a great cure for insomnia. This is all a bit embarrassing for me as my name appears in the GPG as a contributor and one of the chief reviewers, but I can explain that it was written by committee and that it’s difficult to make such a document an exciting read.

I went on from this to explain that a new version is being written, and again, I am a contributor. On hearing this, one of the delegates asked me if the new version was going to be any better in terms of being easier to read. A very good question. I would hope so, but I couldn’t give a definitive answer as I’m just one of a number of contributors and the end document will be reviewed by a QA group. However, I will try and make it my business to see that we get a more readable version, if only for the sake of avoiding the inevitable criticism that I’ll be subjected to if it isn’t.


I’ve finally found time to update my company’s Business Continuity Plan. The current version is dated 24th February 2012, which means that it’s less than six months old. Congratulations are due: I’m updating it before the next scheduled review date. On second thoughts, though, I should have updated it about a month ago, when the company changed its bankers.

Now, if a company as small as Merrycon finds that its Business Continuity Plan is out of date within three months, how much more likely is it that larger companies need to update their plans more frequently? In my experience, updating a plan once every three months is very unusual. Even every six months is not all that common, with once a year being more like the norm. An awful lot changes in a year, and this probably explains why so many people ignore their plan when they need to respond to an incident – they know that it’s out of date.

One of the things that I always look for when helping a client to implement Business Continuity is single points of failure. My latest client has managed to provide me with the best example of one yet, and the name of the single point of failure is Malcolm.

The client will remain nameless, to protect the innocent, but quite by chance it was revealed that one of their most critical and urgent activities is totally dependent on a single person working for an outsource supplier, and his name is Malcolm. If Malcolm is not available to do an activity that is on the critical path to delivering one of the client’s most important services, the client’s reputation will be destroyed. This service is delivered just once a year, and is vital to thousands of my client’s stakeholders.

I’ve never met Malcolm, but apparently he’s been undertaking this activity for many years, and it’s never failed. I can’t help wondering how old Malcolm is, or whether or not he’s in good health. I know who he works for, but again, I must protect the innocent.

I nearly missed this single point of failure, so from now on I’m going to redouble my efforts to find the Malcolms of this world.
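For what it’s worth, here’s a minimal sketch of how such a check against an activity and dependency register might look; the register structure, activity names, and supplier names are purely illustrative assumptions, not real client data:

```python
# Minimal sketch: scan an activity register for dependencies that rest on a
# single person or a single supplier. The register below is an illustrative
# assumption, not real client data.

activities = {
    "Annual service delivery": {"people": ["Malcolm"], "suppliers": ["Outsource supplier A"]},
    "BACS payments": {"people": ["AP clerk", "Finance manager"], "suppliers": ["RBS"]},
    "Order processing": {"people": ["Sales team", "Warehouse team"], "suppliers": []},
}

def single_points_of_failure(register):
    """Return (activity, dependency type, name) for every dependency held by exactly one party."""
    findings = []
    for activity, deps in register.items():
        for kind in ("people", "suppliers"):
            if len(deps[kind]) == 1:
                findings.append((activity, kind, deps[kind][0]))
    return findings

for activity, kind, name in single_points_of_failure(activities):
    print(f"{activity}: single dependency ({kind}) on {name}")
```

Even a crude listing like this would have surfaced Malcolm straight away, which is rather the point: the hard part is getting the dependencies written down in the first place.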