The troublesome case of the red-faced bank


This article is by Mark Toomey, the founder and managing director of IT consultancy Infonomics. For more about Toomey and his book, Waltzing with the Elephant – a comprehensive guide to Corporate Governance of Information Technology, see his bio at the end of this article, or visit his site. This article was part of the regular Infonomics Letter newsletter and is published here with Toomey’s consent.

The message to customers from British Airways’ chief executive Willie Walsh was succinct: “Since Thursday 27 March (2008), when Terminal 5 opened, we accept that the service we have provided our customers has not been good enough. We are extremely sorry for this.”

Equally succinct was the message from NAB chief executive Cameron Clyne in his nationwide apology (PDF) to customers published on 29 November 2010. Clyne appropriately acknowledged the potentially significant impact of a problem with processing transactions on the night of November 24th.

He acknowledged his customers and staff, and gave assurances that all financial losses incurred because of the problem would be made good. Finally, he confirmed that the fault had been resolved, though some correction to accounts was still required.

NAB’s problem came at a time when Australia’s major banks have been the target of considerable market resentment as they post substantial profits while simultaneously elevating interest rates. The press was ready to tear to pieces any bank that gave its customers grief, and by the time the problem was 24 hours old, the bank was not only dealing with an IT problem — it was also in the midst of a public relations crisis of significant proportions.

Not only did newspapers around the nation post story after story on the problem, they also opened their comment channels to let the public vent. Discussion ranged widely, from bland status reports by individuals who did or did not have any money, to speculation on the likely causes, the perceived arrogance of the banks in general, the performance (good and bad) of the bank staff and so on. More than a few pointed to the issue of modern society’s dependence on reliable IT.

One comment, posted by ‘jim’ at 3:37 PM November 26, 2010 said:

“hahaha, one IT mistake, tens hundreds of thousands impacted (virgin blue, queensland health payroll, nab, etc etc)… company dumbos had better wake up to how important IT is and start treating it with respect”.

Indeed, Jim has a point, however irreverently he makes it. In modern society, we have become conditioned to IT that mostly works and when it goes wrong, it can have a seriously debilitating effect that takes hold very quickly.

On the surface, banking for most people is a very simple business: Put money in, take money out, pay bills, borrow, pay back and so on. But underneath the covers, banking is an extraordinarily complex business that even with today’s advanced technologies still requires unimaginably vast volumes of computer software to deal with every nuance of banking products and their associated legislative obligations — let alone the programs needed to just run the business. This complexity alone poses significant risk — but for NAB and many other banks, the complexity of the software required is greatly exacerbated by age.

Many of the banking systems of today are between 20 and 40 years old. They were designed in days when computing equipment was orders of magnitude more expensive than it is today, and when the boundaries of capability were much tighter. Banking systems from the 1960s, 1970s and 1980s had to be designed to spread the workload so that an eight-hour business day could, for very straightforward logistical and economic reasons, be spread out over 24 hours.

The design of banking systems in that era included a great deal of attention to controls, but it wasn’t foolproof. From time to time events could occur that resulted in a problem that could interrupt processing. Bad transactions were almost routine, and continuity of the business depended not just on controls to detect them, but also on highly skilled operational personnel who knew exactly how the systems worked and what to do to ensure that individual incidents did not present an impassable barrier to keeping the majority of account records up to date. Further controls provided strong audit trails of every action taken to resolve a problem transaction and ensure that it was correctly processed as soon as possible.
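The pattern described above can be illustrated with a small Python sketch. It is not drawn from NAB's or any bank's real systems: the names `Transaction`, `BatchResult` and `run_overnight_batch`, and the two validation rules, are assumptions made purely for illustration. The point it shows is that a bad transaction is set aside for a specialist to resolve while the rest of the batch carries on, and that every decision leaves an audit record.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Transaction:
    txn_id: str
    account: str
    amount_cents: int  # positive = credit, negative = debit


@dataclass
class BatchResult:
    balances: dict
    suspended: list = field(default_factory=list)  # held for manual resolution
    audit_log: list = field(default_factory=list)  # one entry per decision


def _audit(action, txn, detail):
    # Hypothetical audit record; a real system would also capture the operator.
    return {
        "at": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "txn_id": txn.txn_id,
        "detail": detail,
    }


def run_overnight_batch(balances, transactions):
    """Post each valid transaction; suspend bad ones without halting the run."""
    result = BatchResult(balances=dict(balances))  # work on a copy
    for txn in transactions:
        reason = None
        if txn.account not in result.balances:
            reason = "unknown account"
        elif txn.amount_cents == 0:
            reason = "zero amount"

        if reason is not None:
            # Control: detect the bad item, park it, and keep going.
            result.suspended.append(txn)
            result.audit_log.append(_audit("SUSPENDED", txn, reason))
            continue

        result.balances[txn.account] += txn.amount_cents
        result.audit_log.append(_audit("POSTED", txn, "ok"))
    return result
```

Calling `run_overnight_batch({"A-1": 10_000}, [...])` with one valid debit and one transaction for an unknown account would post the first, park the second in `suspended`, and write an audit entry for each. The skilled operators the article describes would then work through the suspended list.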

During the time when many of the still-current banking systems were being built, most of us lived in the world of cash and cheques. We would visit the bank once a week to deposit our pay, spreading it across our cheque, savings and loan accounts. Then along came credit cards which, like cheques, involved a paper transaction record that would become part of that 24-hour extended banking day, processed not individually, but overnight along with thousands of like transactions. Gradually we advanced to using an ATM for some transactions and then, a little more than twenty-five years ago, we extended our demand for banking services with the now ubiquitous EFTPOS machine. In the late 1980s and early 1990s, telephone banking gave bank customers a much longer day in which they could transact, and the increasingly stressed accounting systems, still basically oriented to an eight-hour day, had to cope with an extended and ever more diverse workload.

As the century drew to a close and a new one dawned, we extended our demand further to include internet access to banking from anywhere in the world, thus hammering the final nails into the death notices for banking systems everywhere. Systems designed for an eight hour banking day in an era of extremely expensive technology cannot be expected to cope with the rigours of our 24 hour society, even when the cost of IT equipment is relatively far lower.

Put simply, as the business of banking has changed beyond recognition during the past 30 years, many banking systems have not kept up with the pace of change and now, as so comprehensively demonstrated by NAB in the past few days, the limitations of those systems are becoming a significant business issue.

The comment referenced earlier, from ‘Jim’, is therefore quite apposite. Nowadays, when IT is not fit for purpose, and fails to do its job properly, the consequence is not merely a delay in some back office functions. Rather, IT failures today often have a significant debilitating impact on the business and its stakeholders. Among other things, treating IT with respect includes understanding its limitations and guarding against the consequences of failure. It also means maintaining investment to keep systems in good order, and replacing them when they no longer suit the evolving business model. Further, it means maintaining a level of expertise at close call so that when a problem does occur, it can be assessed, isolated and resolved without impeding other business activity.

As always, NAB’s experience can be constructively discussed in the context of the six principles for good governance of IT set out in ISO 38500. The principles also provide the backdrop for questions for CEOs and directors to ask about their organisation’s dependence on IT:

Responsibility: While most large organisations have a CIO and formal arrangements for supply of IT services, the consequences of an IT failure are now, unequivocally, business consequences. Therefore it is essential that business leaders and managers are responsible for ensuring that there is sufficient investment to keep the primary IT business systems in good working order aligned to the contemporary business, and that there are suitable arrangements in place for promptly resolving any problems that may arise. Who should make the key decisions about investment in generational change in your organisation’s IT? Do those responsible step up to the challenge?

Strategy: IT that underpins the business should be aligned to the needs of the business, just as the business should take appropriate advantage of new IT that enables the business to deliver new products, services, channels and relationships. There should be ongoing attention to ensuring that IT evolves to suit the current business and enable its future growth. Are your organisation’s IT systems well suited to your current and future business model, or do they pose constraint and risk that may be, especially in the eyes of your customers, unacceptable?

Acquisition: Decisions to spend on IT should not be taken lightly, especially in the context of the systems that support and enable core business activity. However, there comes a time in the life of every business asset where its capabilities and familiarity are outweighed by its costs and constraints, and a shift to a newer generation is appropriate. As banking has become ever more complex, the tendency in most organisations has been to graft on layers of functionality that provide a desired capability, but at the expense of additional complexity and overhead, and with magnified risk of things going wrong.

Decisions about expenditure on IT should take a long-term view on costs, benefits, opportunities and risks – but in too many cases are overridden by urgency and complacency. Is your organisation’s IT budget too heavily biased to sustaining systems that should be replaced? Are the spending decisions you make today increasing the cost and complexity of your systems in the future and locking you further out of the opportunity for generational change to more relevant technologies?

Performance: In many industries, but especially now in banks, customers expect continuous uninterrupted service. As the NAB experience shows, it’s no longer really a business choice; it’s a customer imperative.

That means that the window of opportunity for resolving problems that do arise is shrinking, and the importance of avoiding problems is increasing. Do you have a comprehensive and robust testing facility in which you can actively eliminate most problems before they arise in production? Do your testing systems learn from operational experience, with each new operational problem being incorporated into the test suite? Do you have timely and durable access to experts who have the in-depth knowledge, skill and discipline required to analyse, isolate and resolve a problem that is debilitating your business? Do you have controls in place to block inappropriate action from non-experts when something does go wrong?

Conformance: The NAB incident has raised a number of questions about the stability of Australia’s payments system and opens the door to new questions regarding whether and to what extent there should be regulated requirements relating to the transaction processing aspects of banking systems. But looking deeper, it also appears that NAB has been highly disciplined internally as it dealt with the problem. NAB’s communication to stakeholders has been reported as of a high standard and from the right level, though it failed to use all relevant means of contacting customers.

There seems to have been little by way of leaks to confuse the picture of what actually went wrong. Action to help customers left without money may not have been extensive enough, but it was also not woefully inadequate. It might just be that NAB had thought about what to do in these circumstances, and acted accordingly. Do your executives, managers and staff know exactly what to do, what to say, and who to talk to when something goes wrong? Are there conditions in your business license (in whatever form it may take) that may expose your business to consequences if any of your key IT systems fail?

Human Behaviour: Until the birth of the internet and highly consumer-oriented IT, many people were prepared to make allowances for the unreliability of information technology. Nowadays, the average consumer, and indeed many people working even in quite high levels of business, see information technology through their experiences with home computers, game machines and so on. They have little patience and quite advanced expectations about the performance and capability of IT, and those expectations are not moderated by appreciation of the complexity imposed by scale, regulation, security and so on.

On the other hand, people who work in IT roles have a track record of “trying to help”, acting outside their boundaries in order to overcome minor problems and to generally be helpful. Unfortunately, without appropriate training, these actions can result in compounding of problems. Do you have a clear understanding of how your stakeholders will behave if you experience a significant IT based business disruption? Do you have proven strategies and protocols for dealing with stakeholder reactions in the event of such problems? Do your operational personnel have a clear understanding of what they can and cannot do in response to a problem arising with your main IT systems?

Mark Toomey is a man on a mission: he aspires to change the way the world’s business and government leaders deal with information technology, to greatly increase the economic value and operational reliability of information technology as a key enabler of modern society.

Toomey is recognised internationally as a leading expert in top level governance of information technology and ISO/IEC 38500. He is a past chair of the committee responsible for Australian standards on governance and management of IT, and is Australia’s lead representative to the corresponding international committee.

Mark writes and speaks extensively about how business leaders can govern IT. His publications include The Infonomics Letter (monthly), Waltzing with the Elephant – a comprehensive guide to Corporate Governance of Information Technology and The Director’s IT Compass. Through his company, Infonomics, he helps leaders understand and improve their organisation’s Governance of IT, and expands the skills of consultants who help their clients improve governance of IT.

Image credit: Delimiter

2 COMMENTS

  1. What’s the option for banks now? Sure systems were designed within tight constraints, but they were also designed when developers were predominantly experts and not the lowest-bidder commodity of today. The crucial aspect you touched was that of highly skilled operators; and was not the NAB issue precipitated by operator error?

    Operators were behind the fault, and operators were responsible for rollback. The issue lies with the operators and from experience, this is the case with modern systems.

    The existing batch model has proven to be at least highly reliable and auditable over decades, bar the odd mishap. Only a madman would rip it out to “modernize” it.

    • The batch model may have seemed reliable and auditable over decades – but what happened behind the scenes does not necessarily justify that impression. I’ve seen numerous cases of haggard faces on specialists who have worked 24, 48 and more hours straight to fix something that went horribly wrong before it came to the notice of customers and regulators.

      Changing business conditions demand that we periodically update our business systems. Would we advocate the cars of 1930 and the road rules of that era as suitable for today? Certainly not – and the same applies for the business systems of any organisation – we can maintain them for a while then we have to replace them in generational changeover.

      But while replacing their technology systems is a huge challenge for the banks, perhaps the bigger challenge is realising that technology, however advanced, is not omnipotent. Technology is a big part of the picture, but the picture also includes the people, the work they do, and the organisation and policy structures in which they work. The technology just supports and enables the system. At the end of the day, the technology, whether hardware or software, is just a machine. It’s how we use the machine that makes the difference and those organisations that design their entire business in the knowledge of how the machine works will get best value from it.

      Understanding that machines can and do go wrong from time to time is a key part of working with machines and having the skills on hand to deal with such problems is a vital part of organising the business. What the bankers have to work out, as does any other organisation that depends on its technology, is what skills does it need to keep close by and what skills can it afford to have at arm’s length.

      As they learned with branch closures, perhaps they will also learn with systems experts and operational personnel, that cutting too deeply can hurt a great deal.
