COLUMN – OLAF RANSOME | Resilience starts with checklists and procedures, but it’s not just about ticking boxes to satisfy regulatory requirements. Aim to do more than the bare minimum, and it could lead to a healthier, more robust organisation.
In a series of six column contributions with PostTrade 360° throughout 2024, banking operations veteran Olaf Ransome digs into the topic of operational resilience – to help us understand its meaning under changing rules, and get adequately prepared. Find his articles listed here.
Resilience is about more than business continuity, more than a document about what a firm does if something goes wrong. The world of financial services (FS) is complex; customers have high volumes and high expectations. Cash machines going down, or payments not being able to be made, make headlines. Quite rightly, regulators have focused attention on our industry’s ability to recover quickly and easily from shocks, system failures and other outages.
Regulation in the FS world works on a trickle-down model. Guidance is often developed centrally, by one or other arm of the Bank for International Settlements (BIS) or the G20, and is then adopted by national regulators.
For resilience, the relevant guidance is from the Basel Committee on Banking Supervision (BCBS), which has laid out its Principles for Operational Resilience (POR) and the Revised Principles for the Sound Management of Operational Risk (PSMOR). These are from 2021. The BCBS guidance is what is ultimately applied to individual banks. There is also guidance for our financial market infrastructures (FMIs), which is driven by the Committee on Payments and Market Infrastructures (CPMI).
Operational resilience is defined there as “the ability […] to deliver critical operations through disruption”. In other words, do you have the capabilities to know what would be a critical issue, to know when you have one and to be able to react across your whole firm?
Why bother?
Survival, reputation, profitability and compliance. First to survival. I’d expect any firm will say it has good operations and faith in its team to deal with issues as they arise. In FS, when your firm does have that “bad day”, there will always be two negative forces at work: the problem itself, together with your ability to recover from it, and the perception the rest of the market has of your situation. Maybe it is: “They are not making payments to us, so we won’t make payments to them.” There is an old saying in FS: “If the market thinks you are illiquid, then you are illiquid.”
Second to reputation. Just having a disruption will not lose you customers, but handling it badly will. Our expectations around on-line banking are that it is always there and that there is clear communication from the bank when there is an issue. The UK’s NatWest fell foul of both things in May 2024: “NatWest outage leaves customers fuming”.
Profitability is the third impact. Recovering from operational incidents takes a lot of organisational bandwidth, added to which you might be losing clients and not gaining new ones. Compliance is last but not least.
Resilience is a topic for regulators; they are now expecting FS firms to demonstrate capabilities. They are going beyond PowerPoint to: “Show me capabilities which demonstrate you can recover.”
That all gets me to what I would dub “the smart trade”: study the requirements, understand where you are now and make a plan to be pro-active.
What do we have to do, who’s doing what and how well?
The BCBS’ latest report on progress in matters of resilience is worth a read; two minutes is all it takes. Here’s my view on what it is telling us, followed by some suggestions on what you might do next.
• The banks have done the work to know what the threats and vulnerabilities are but do not have clear response plans to address them.
• The regulators see banks as having incomplete pictures of their end-to-end operations; their view of all the parts of each organisation involved in delivering critical services is not sufficiently complete and granular. Included in that would be outsourced parts of the process. We must remind ourselves that we can outsource responsibility for something, i.e. the doing, but not accountability, i.e. doing the right thing and doing it properly. You need to understand exactly what your third-party providers do, how they do it and how they would recover.
• Board members’ roles, responsibilities and capabilities for operational resilience are still under development.
The last point is telling. It is as simple as “what gets measured, gets managed.” If the board is not clearly measuring and managing, then the “tone from the top” is wrong, or at least unclear, and that will dilute any focus on the topic.
After you have the board responsibility clear, my view is that the best place to start with resilience is to ask yourself three questions:
• What do I need to do to stop things going wrong in the first place?
• Do I have a view of the 10 things most likely to go wrong in my area / team / department?
• Do I have a plan and the capabilities to monitor for those 10 situations and to react if something does happen?
For me, this all starts with checklists and procedures. Operations work can be grouped into two flavours. The first is the repetitive daily kind: multiple steps spread throughout the day, which you cannot do all at once. Some of those things can be done any time, some need to be done by a certain time or at a fixed time. Some are done periodically, on Fridays or at month-end. The second is the “on-demand” kind: a new issue, a corporate action, month-end or quarterly reporting.
Each discrete function needs one or more checklists to run its business. There might be a policy, but today, at the coalface, that is not at all helpful. There might even be a procedure. If that is a full text document, it is only a bit helpful. It might help if it tells you exactly how to do something manually or who to call if there is an issue; it does not help if the 27 things you need to do each day are buried somewhere in a 30-page document. And you need the right tools for the job to make sure that your team is ticking off those tasks as they are completed, and recording the evidence of the work, or the status. Note, “ticking off as you go” and not a cursory end-of-day “Did we do everything? Yes, I think so”.
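To make “ticking off as you go” concrete, here is a minimal Python sketch of a team checklist. It is an illustration only, not a recommendation for any particular tool: the class names, task names, times and dates are all hypothetical. Each task may carry a deadline, completion can only be recorded together with some evidence, and the list can be asked at any moment which deadlines have already been missed.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Optional


@dataclass
class Task:
    name: str
    due_by: Optional[time] = None        # None means "any time today"
    done_at: Optional[datetime] = None   # set when the task is ticked off
    evidence: str = ""                   # e.g. a report reference or status note


class Checklist:
    def __init__(self, tasks):
        self.tasks = {t.name: t for t in tasks}

    def tick_off(self, name, evidence, when):
        """Record completion as it happens, together with the evidence."""
        if not evidence:
            raise ValueError("evidence is required when ticking off a task")
        task = self.tasks[name]
        task.done_at = when
        task.evidence = evidence

    def overdue(self, now):
        """Names of tasks whose deadline has passed and which are still open."""
        return [
            t.name for t in self.tasks.values()
            if t.done_at is None and t.due_by is not None and now.time() > t.due_by
        ]

    def open_tasks(self):
        """Names of all tasks not yet ticked off."""
        return [t.name for t in self.tasks.values() if t.done_at is None]


# Hypothetical daily list for one team (names and times are made up).
checklist = Checklist([
    Task("Confirm nostro balances", due_by=time(9, 0)),
    Task("Release payment file", due_by=time(15, 30)),
    Task("Review open breaks"),
])
checklist.tick_off(
    "Confirm nostro balances",
    evidence="balances agreed, report saved",
    when=datetime(2024, 6, 3, 8, 45),
)
print(checklist.overdue(datetime(2024, 6, 3, 16, 0)))
# -> ['Release payment file']
```

The point of the design is that the overdue question can be asked during the day, while there is still time to react, rather than at a cursory end-of-day review.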
On most days, things go according to plan and the experts in your team are on top of things. The team needs easy access to a procedure: what to do at step three if you need the detail. Something which points you straight to what you need to do, not to a long document leaving you to search for what you need. The procedure might also include a back-up plan.
“Understanding normal” should also be integrated into the checklist tasks. You should have a sense of what normal volumes, values and timings are in your business. Having no trades booked at 0700 might not be a surprise, but if you are in FX, then at 0900 UK time, you’d expect quite a few trades. The checks need to be smart enough that the “abnormal” does not just happen and go unnoticed.
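A check of this kind can be very simple. The Python sketch below assumes you have agreed, per checkpoint, a range for the number of trades booked so far that day; the checkpoints and thresholds shown are invented for illustration and would need to come from your own history.

```python
from datetime import time

# Hypothetical "normal" ranges: (minimum, maximum) trades booked so far
# today at each checkpoint. Real values would come from your own data.
NORMAL_TRADE_COUNTS = {
    time(7, 0): (0, 50),
    time(9, 0): (200, 2000),
}


def check_volume(checkpoint, trades_booked, normal=NORMAL_TRADE_COUNTS):
    """Return "OK", or an alert message if the count is outside the normal range."""
    low, high = normal[checkpoint]
    if trades_booked < low:
        return (f"ALERT: {trades_booked} trades at {checkpoint:%H%M}, "
                f"expected at least {low}")
    if trades_booked > high:
        return (f"ALERT: {trades_booked} trades at {checkpoint:%H%M}, "
                f"expected at most {high}")
    return "OK"


print(check_volume(time(7, 0), 0))   # -> OK
print(check_volume(time(9, 0), 0))   # -> ALERT: 0 trades at 0900, expected at least 200
```

The same count, zero trades, is fine at 0700 and an alert at 0900, which is exactly the distinction a plain tick-box cannot make.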
Importantly, the tools are needed at the team level. Managing operations is a bottom-up task. Like an SAS unit in the field, the team needs the skills and the tools; they can’t rely on some headquarters function miles away.
If you have a great checklist and procedure set-up, then you can ask the team: “Are there things which go wrong regularly and consume lots of your time?” Ideally, you start to fix those things, because you take the view that if you have those better controls, then you have more capacity.
Then you move on to questions two and three above. What is likely to go wrong, how would you know if it did and how would the whole firm need to react? Trades might not be booked; there might be an extraordinary number of breaks, or differences, at some point in the process; your sanctions screening might crash, preventing payment instructions being sent out.
Thinking through the possibilities should lead you to be able to map out an end-to-end view of all the departments and systems involved in your transactions, all the way from a client order through to settlement and the balance sheet. In answering, at the team level you need to think end-to-end: upstream and downstream. What might happen either side of you which affects you, and what effects would an issue in your area have? Think of the NatWest example; on-line banking is down, where do we communicate and how often?
In summary, you are going to be accountable for having operational resilience. Yes, that is yet another regulatory burden. You need to decide whether you want to just get a “tick in the box” or to use this as an opportunity to fine-tune your organisation.