29 July 2018

Old is Good, Unless You are a Computer System

Old is, by itself, I am happy to say, not bad. And the process of getting older is also not too bad. We build up knowledge and understanding, and sometimes we see wisdom in some (older) friends. But “old” is not good in computer systems. That accumulated “knowledge” is actually decades of bugs and bug fixes, new functionality that does not always work with the old, and ancient security holes that either have never been found or have been too difficult to fix without breaking the rest of the system.

New-build systems, while potentially having a limited functionality set, are easier to manage, faster to build, scale more easily, and consume fewer resources to run and maintain. The “systems shop” full of geeks is a thing of the past, unless you are running a large legacy system.

Agility in the face of threats and opportunities is magnified in newer systems, while legacy systems can be overwhelmed in the face of new threats.

This is not meant to settle the Buy-vs-Build argument, but it does argue for the replacement of legacy systems with newer systems built with current technology and for modern infrastructure. After all, who speaks COBOL any longer?

One example of how to overwhelm a legacy system: regulatory reporting. FATCA created a nightmare for financial institutions having to deal with new fields and new reporting requirements. Older systems required new code, new reporting systems built, or new extracts to feed reporting platforms. Meanwhile, newer systems, built with regulatory reporting as a core design requirement, found the delivery of FATCA reporting much easier.

Newer financial institutions and those with newer systems may still refuse to open accounts for US citizens, but that is being driven by an expectation of future US Legal Imperialism.

But these new systems are able to support CRS, the “rest of the world’s” response to FATCA. Pity that the US of Amerika refuses to engage with the rest of the world and implement CRS (Common Reporting Standard). Even countries like Panama are implementing CRS, and computer systems are having to cope with the new regulatory reporting requirements.

I enjoy being older. I’m smarter, I think more deeply, and my opinions are based on decades of experience and knowledge. At least, I flatter myself with these thoughts, even though I may be hard-pressed to find much support for those assertions. But I do envy the young. I cannot run as fast any longer, or run at all for that matter. I’m not as agile, and new music simply baffles me. 

Another area where youth seems to have an advantage is in fraud and cyber-security. I’m back to talking about computer applications and banking systems, of course.
Remember the good old days, when a dial-up network with a 48kb connection was enough? Back in those days, hacking was on a different scale, and individual hackers were or became known to officials. They weren’t always caught. But sometimes a special person like Clifford Stoll would "stalk the wily hacker", ultimately leading to an arrest.

From "Stalking the Wily Hacker", Clifford Stoll, 1988

Today stalking the wily hacker is almost impossible, and the number of attack vectors continues to increase exponentially. Building information security in from the beginning is key to a successful financial systems application. I do not know what applications Monzo, the UK challenger bank, runs, but they certainly are talking about their agility in the face of cyber-attacks and fraud. Imagine, with old systems, being able to respond to apparent fraudulent activity within four-and-a-half hours.

“Within four-and-a-half hours, the team rolled out updates to our fraud systems to block suspicious transactions on other customers’ cards. That evening, we reached out to other banks and the US Secret Service (which is responsible for credit card fraud in the US) to ask if they had seen anything similar. At the time, they hadn’t.” Try doing that with a legacy system.

Yet for all that, and perhaps as a victim of the “Sunk Cost Fallacy”, I will happily continue to hold this particular legacy system (myself) dear and will continue to attempt upgrades.

17 July 2018

It's all about the Target (risk assessment)

In my previous post I commented on the importance of adding a “Target” risk position to the traditional "Inherent" and "Residual" risk assessments, and the linkage to the Risk Appetite. More importantly, the “Target” level for any risk provides a focus on the future.

Let me explain.

Inherent to Residual: Inherent risk is the level of risk before remediation. This is important to ensure that we are focusing on the areas of risk that represent the greatest threat or opportunity for the entity. Inherent risk scoring is subjective, but then so is almost all risk scoring. Yet an assessment, subjective or quantified, of the risk before controls or other remediation ensures that we invest our limited resources in the areas that pose the greatest threat to the achievement of the entity's objectives.

So we’ve assessed the Inherent risk, and we have applied controls for remediation, mitigation, etc. Now we have our “Residual” or “Net” risk position. Again, this is by nature subjective, and fraught with assumptions. But it does provide an assessment of our current state of risk and the level of risk that is being taken by or accepted by the entity. But this is subjective. And it will be wrong.

There will be too many missing controls, controls that are functioning ineffectively, and mitigation that is unfocused or not in place. The assessment of the Residual risk position provides a snapshot of the current situation, with no insight into the level of risk that is acceptable, the actual level of risk being taken, or the level and type of risks that the entity wants to take.

Target: And so we get to the “Target” risk level or assessment. What level of risk does the entity want to take, and what level of risk is acceptable? This is fundamentally an assessment of the desired future state of the risk environment that the entity wishes to work within. And yes, this too shall be subjective. It will also probably be achievable.

As Risk Managers we need to consider and advise on the level of acceptable or desirable risk to be taken or accepted by an entity, across the spectrum of risks. This means that we need to assess the raw (Inherent) risk environment and support the allocation of resources to the highest or least desirable risk areas, and of course to those areas where we want to take risk.

We also need to work with management to objectively assess the desired, or acceptable level of risk to take; the Risk Appetite. This should be done globally, and should be done at the level of each identified and recorded (and managed) risk. This is our Target risk level for each risk.

Now, and only now, can we meaningfully assess our “Residual” level of risk, and determine if that level of risk is appropriate.

If our current (“Residual” or “Net”) risk position equals our Target, then we are running at our desired level of risk. And if it does not, then we know that our desired future state does not equal our current risk-managed state for this risk, and for all other risks where Residual does not equal Target.

Example of Residual to Target tracking

In the example above, a number of interesting observations can be made, including that the Residual and Target risk assessments change as risks are reviewed by the Risk Owners. It is also clear that Residual and Target are not the same. Within the data there will probably be a number of individual risks where the Residual assessment equals the Target assessment; the current risk situation for those risks equals the entity's Risk Appetite for those specific risks.

So why does Residual not equal Target? There are three possibilities:

  1. Our control environment is ineffective, and/or does not include all the controls that are already in place to manage the risks (and these then need to be identified).
  2. Our aspirational level of risk management for this risk is too high.
  3. Conversely, we are over-controlled (where Residual is lower than Target for specific risks) and we are potentially stifling the business through excess controls.
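These three possibilities can be sketched as a simple classification over a risk register. A minimal, hypothetical illustration in Python (the risk names and ordinal scores, 1 = Low to 5 = Very High, are invented for the example, not drawn from any real register):

```python
# Hypothetical sketch: compare each risk's Residual score to its Target
# (Risk Appetite) score on an ordinal scale, 1 = Low ... 5 = Very High.
# All names and scores below are invented for illustration.

RISKS = {
    "System failure":       {"residual": 3, "target": 2},
    "Regulatory reporting": {"residual": 2, "target": 2},
    "Vendor concentration": {"residual": 1, "target": 3},
}

def classify(residual: int, target: int) -> str:
    """Map the Residual-vs-Target delta onto the three possibilities."""
    if residual > target:
        return "under-controlled"  # possibilities 1 and 2: weak controls, or aspiration too high
    if residual < target:
        return "over-controlled"   # possibility 3: excess controls may be stifling the business
    return "within appetite"       # current risk equals the Risk Appetite for this risk

for name, scores in RISKS.items():
    print(f"{name}: {classify(scores['residual'], scores['target'])}")
```

Note that the scores alone cannot distinguish the first two possibilities; whether the controls are weak or the aspiration is too high remains a judgment for the Risk Owner.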

And when we determine that our Residual risk position does not equal our Target risk position, we have four options:

  1. Accept that it will not be possible to achieve the desired Risk Appetite for this risk, and, through a Delegation of Risk Authority process, change our Risk Appetite and therefore our Target level for this risk to equate to the current Residual risk level.
  2. Subtly different, but we may determine that the Risk Appetite is not right, and that we do want to accept, or take, more of this risk, and therefore change the Target.
  3. Identify the controls that are not effective and implement improvement programmes or introduce new controls.
  4. Confirm that we are over-controlled and look at which controls are not actually required, or are burdensome and should be replaced with monitoring controls.

What is the role of Internal Audit in this?

The Internal Audit function provides some assurance that the system of internal controls is effective. This requires Internal Audit to determine what areas of business activity they will review. This selection should be risk-based, which means starting with the risk register and considering a balance between the highest “Inherent” risk areas and the highest “Residual” risk areas.

As part of each Internal Audit, the assessed level of Target risk should be considered, and Internal Audit should then perform an audit programme designed to confirm (or otherwise) that management’s assessment of the effectiveness of controls is accurate. If the controls are effective, and these controls have been determined to bring the entity to within Risk Appetite, then Internal Audit’s role is limited to questioning the appropriateness of the Risk Appetite. (Note I say question, not set, as setting it is the role of senior executives and the Board, or others within their Delegation of Risk Authority.)

Where the Residual risk level does not meet the Target, Internal Audit should be determining if this is because the controls are ineffective, or because the control environment provides inadequate coverage of the risk, in which case new controls may be appropriate.

In all cases, Internal Audit should be determining if Management's assessment of the effectiveness of the control environment matches the evidence provided to Internal Audit. If management's assessment is correct, a delta remains between the Residual and Target, and senior management and/or the Board are aware, then there is no Internal Audit finding beyond noting that senior management and/or the Board are aware of the difference, and are aware of and support management's plans for remediation.

It's all about the Future

The core message, however, is that the Inherent risk position represents a “past” with no controls, the Residual risk position represents the present (as assessed by management), while the Target risk represents the future, or desirable, control and risk management state, and is one enunciation of the entity's Risk Appetite.

The question we ask of Risk Owners is: What are you doing to get from the Residual risk position to the Target risk position, and when will you get there?

11 July 2018

Why Inherent and Residual Risk are Inadequate: What is the Appetite?

Too often the practice of internal auditing, when performing risk assessments, looks at Inherent Risk (the level of risk before any remediation) and Residual Risk (the level of risk after remediation). This is inadequate and forgets one of the most important aspects of Risk: the Risk Appetite.

Risk Appetite provides management with a view of the level and type of risk that the entity is willing to take, and the risks that the entity will pursue. Missing from the IIA’s (and others) assessment of risk is the Target Risk level. This represents the level of risk acceptable for any individual risk based on the Risk Appetite of the entity.

The delta between Inherent Risk and Residual Risk measures only the current assessed level of control or risk. It does not provide a link to what is the acceptable level of risk (and control) for the entity.  This means that Internal Audit could, in theory, report that the entity is well controlled as the Residual Risk level is accurately stated and the controls to enable that level of Residual Risk are functioning effectively. 

Equally, in theory, the Residual Risk level could actually be fully in-line with the Risk Appetite, and in such a case there would be no Internal Audit findings other than “(Auditable area) appears to be well controlled with the current Residual Risk being within the Risk Appetite”.

I do say “in theory” because I have only seen one Internal Audit report in the past 35 years that did not contain findings and recommendations, even when reporting that the audited area is effectively controlled. Internal Auditors simply, almost pathologically, count the number of findings, and too few findings are seen (by the Internal Auditors) to indicate a poorly performed audit or an ineffective Internal Auditor. For a candid discussion of the “7 deadly Internal Audit sins” I would only point you to the video from Richard Chambers, IIA President and CEO.

The concept of limiting risk assessment to Inherent and Residual is sound – IF that remediation reduces risk to within the Risk Appetite.

From the IIA

The reality is that Inherent and Residual Risk scores do not cater for the situation in which the level of residual risk is inconsistent with the entity’s Risk Appetite. This is left to the Internal Auditor, who must attempt to determine what the control environment should include to bring it within the Risk Appetite, sometimes in the absence of a defined Risk Appetite.
In this case, we need to know what the Target Risk score is, in terms of the Risk Appetite. The most important delta then is between the Residual Risk level and the Target Risk level, not between Inherent and Residual.

Of course there is the common problem that many (most?) entities do not have a well-defined Risk Appetite, and therefore it is almost impossible to confirm that a Residual Risk position actually is within the Risk Appetite. This makes development and communication of the Risk Appetite a critical step for an entity in its journey to becoming “well controlled”.

Therefore, as the Risk Appetite frequently is either non-existent or not well communicated and understood, the probability is that the Residual Risk position will not be in line with what would be the Risk Appetite. What is needed then is to determine what management considers the “Target” risk position should be for any risk, thus creating the de-facto Risk Appetite at that particular risk level.

Then, with a Target Risk score, it is possible to clearly communicate the difference between the Residual and the Target. That difference is the Internal Audit finding, and can be used to demonstrate the need for improved or additional controls, or to demonstrate that existing controls are not operating effectively.

In an ideal world the entity will have a defined Risk Appetite statement, or Target risk scores for each identified risk, therefore having a de-facto Risk Appetite at the risk level. And in such an entity, all Internal Audit findings and recommendations should demonstrate how those recommendations will enable achievement of the Target, and therefore the Risk Appetite. This will also allow management to petition an adequately senior authority to “accept” the risk or authorise resources to plug the gap.

Such “acceptance” should of course be in line with the Delegations of Risk Acceptance, but that is a topic for a different article.

05 July 2018

Risk Quantification and Physics Envy

There is no question that the quantification of the potential negative (and positive) impact of the actualisation of a risk can help to clarify thinking, and support activities required to manage the risk. Yet quantification can also be a waste of time, and can be used to create rubbish results that masquerade as science. Poor quantification and simulation are easily as bad as, or worse than, poor subjective assessments.

My own view is that, in anything other than major engineering companies and financial services behemoths, we fool ourselves into thinking that we can actually quantify most risks. We cannot. My rationale is that the vast majority of risks are subjective in nature, and while we can agree set(s) of assumptions and scenarios, these are limited by our imaginations and by what is considered "realistic" by others within a group.

Once the assumptions have been agreed and documented, then it is possible to perform Monte-Carlo or other simulations against those assumptions. We then take the results and call it quantitative and "scientific" when it was not; it was a set of potential outcomes based on subjectively assessed and probably flawed assumptions. And in too many cases, if the results are not acceptable in terms of an organisation's internal appetite for change, the assumptions are tweaked until the resulting MC simulations provide results within the bounds of the appetite (stated or unstated).
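As a concrete illustration of how much the output moves with the inputs, here is a minimal Monte-Carlo sketch using only Python's standard library. The frequency and severity assumptions are entirely invented; the point is that the 95th-percentile "result" shifts when the assumptions are tweaked, with no change in the underlying risk:

```python
import random

def simulate_annual_loss(freq_mean, sev_low, sev_high, runs=10_000, seed=42):
    """Crude Monte-Carlo estimate of a 95th-percentile annual loss.
    Event count and severity draws are deliberately simplistic; the
    whole result rests on the (subjective) input assumptions."""
    rng = random.Random(seed)
    losses = []
    for _ in range(runs):
        n_events = rng.randint(0, 2 * freq_mean)  # crude frequency assumption
        losses.append(sum(rng.uniform(sev_low, sev_high) for _ in range(n_events)))
    losses.sort()
    return losses[int(0.95 * runs)]               # 95th-percentile loss

# Same model, "tweaked" assumptions: the headline number moves with the
# inputs, not with any underlying fact about the risk.
cautious = simulate_annual_loss(freq_mean=4, sev_low=10_000, sev_high=100_000)
relaxed  = simulate_annual_loss(freq_mean=2, sev_low=10_000, sev_high=50_000)
print(f"95th percentile loss, cautious assumptions: {cautious:,.0f}")
print(f"95th percentile loss, relaxed assumptions:  {relaxed:,.0f}")
```

Halving the assumed frequency and severity drops the "scientific" result sharply, which is exactly the tweak-until-acceptable behaviour described above.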

This is not to suggest that there are no cases for, or no value from, such modelling. The ability to provide and discuss potential outcomes is a major step forward, which supports decision making.

The danger is in the assumptions and scenarios, not the maths.

Bad assumptions will deliver poor results from scenarios. And in almost all cases, assumptions are not backed up by “facts”, but are subjective in nature.

Going back many years, I worked as a Mainframe Systems Capacity Planner for a very large (three letter name) computer manufacturer. We built logical models of customers’ computer systems, to project future performance and to help the sales team prepare proposals. The ability to accurately project future degradation in performance based on growth scenarios was critical to convincing the customers that “now” was the time to either purchase upgrades or at least budget for those upgrades.

The models would provide estimates of when in the future the existing platform would no longer support the workload. We could quite closely model expected system response times, network latency, and ultimately the number of users and the impact those users would have on utilisation across the system.

Almost all systems can cope with a steady increase in utilisation, until they cannot, and they hit a “J-curve” or “hockey-stick” event. As utilisation grows, system load increases, and response times begin to creep out. This seems linear, with sub-second response time slowly growing to 1 second, then 1.1 seconds, and then to 1.2 seconds. But when the “J-curve” is hit, the very quick response times jump to 10 seconds, and then 30 seconds. The point at which the system “fell over” depended on many factors, from memory to processor speed, to disk capacity and paging time, caching in memory versus caching on disk, etc.
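The “J-curve” described here is the classic queueing behaviour: under a simple M/M/1 approximation, response time grows as service time divided by (1 − utilisation). A small sketch in Python (the half-second service time is illustrative, not from the author's actual models):

```python
def response_time(service_time_s: float, utilisation: float) -> float:
    """M/M/1 approximation: response time = service time / (1 - utilisation).
    Near-linear at low load, then the 'J-curve' as utilisation nears 1."""
    if not 0 <= utilisation < 1:
        raise ValueError("utilisation must be in [0, 1)")
    return service_time_s / (1.0 - utilisation)

# Illustrative 0.5-second service time: response creeps, then explodes.
for u in (0.50, 0.80, 0.90, 0.95, 0.99):
    print(f"utilisation {u:.0%}: response about {response_time(0.5, u):.1f} s")
```

At 50% utilisation the response is a comfortable 1 second; at 99% it is fifty times worse. Real mainframe models are far richer (memory, paging, caching), but the shape of the curve is the same.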

With one client I spent almost three days calibrating the model to match their system, close to the second decimal point for each "CICS Region" (an application or subset of an application today). When completed, and we agreed that the model reflected their mainframe, I asked the client "okay, what are the scenarios that we will run?"

"I don't have any now, but I'll go home and make some up, and we can model them tomorrow."

Who gives a damn about the second decimal point when the scenario is nothing other than a wet-finger in the air? If you have no idea where you think the business will be going, or what external factors are going to impact the scenarios, then the results are going to be rubbish. 

Conversely, I had clients who brought with them detailed scenarios with estimated numbers of additional staff and customers, loose estimates of the potential increase in computing power required to support additional functionality in existing applications, and new applications that were in the pipeline, with estimates of the number of users and computations.

It was remarkable, as a general observation, that the more detailed the future scenarios, the less clients obsessed over the perfection of the model; and while 2 decimal places was the target, a model of their system to 1 decimal place was usually adequate, as it gave them more time to run their scenarios, and variations on their scenarios. Their interest was in the estimated break-points in the future, in terms of their mainframe’s and network’s ability to cope with the scenarios.

Quantification is an important tool for planning and scenario modelling. Quantification of potential costs is critical when dealing with big numbers, and with quantifiable numbers. Monte-Carlo simulations provide one tool for estimating ranges of potential impacts. Such modelling of an investment portfolio is critical for enterprises that rely on solvency capital and investments. Scenarios in these cases provide internal comfort that investment and asset risk is being managed, and provide comfort to regulators that the business is solvent and should, under any but far-fetched scenarios, remain solvent.

But quantification of some operational risks and activities is meaningless without a detailed understanding of the risk, real data and “big numbers”, and without realistic and well-considered scenarios. The first half of the equation is the “model” – how well do we actually understand the risk and potential impacts. The second half is the “scenarios” – how well have we thought out future scenarios and potential events, and key internal and external influencers.

If the model is perfect but the scenarios are facile, the outputs will be useless. If the model is shaky and not well understood, it does not matter how good the scenarios are, the results will be equally useless.

01 July 2018

Risk Acceptance - the need for a Delegation of Risk Authority (DRA)

Over too many years, when pointing out a risk or situation, I have heard management or those below them respond with "it's okay, we've accepted that risk".

Really? Who accepted that risk, and did they have the authority to accept that much risk on behalf of the business? In too many cases the risk identified was significant, and if presented to senior management or the Board, that risk would not have been "accepted", at least not without consideration of the implications and costs of remediation or reduction of the risk.

What actually happened is that the person or people dealing with the risk have been unable to quantify or otherwise clarify the risk and potential impact, or develop a costed and realistic plan to mitigate the risk. Because of this, they have failed to convince themselves of the severity of the risk, and therefore are unable to communicate that exposure to senior management. Having failed to effectively communicate, they fall back on "we've accepted that risk". 

Too often what was missing was an actual assessment of the risk, either subjective or, where possible, quantitative. Included in such an assessment should be a definition of the existing controls and an assessment of the effectiveness of those controls.

Controls exist to provide confidence that risks are being managed. As such, on a quarterly, six-monthly or, for some, annual basis, management owners of controls should confirm that the controls associated with risks are functioning and are effective. Evidence should then be provided that demonstrates that the controls are functioning.

Rarely is there a formal confirmation that the person responsible for the control actually has the authority to accept the associated risk.

Risk acceptance can be split into two parts:

  1. First, is the Risk Appetite appropriate for this risk? It may well be that the entity's Risk Appetite is too cautious for this type of risk, and therefore the reduction of the risk to tolerable levels will be too expensive and result in a situation of "over-control". 
  2. The second factor is the authority of the person accepting the risk. While companies generally have Delegations of Financial Authority (DFAs), rarely is there a formal Delegation of Risk Authority (DRA). 

To put that into a concrete example, a manager may have a financial delegation of up to $/€/£10,000. That is the level of expenditure that has been determined to be appropriate for that level or individual, without the need for additional authority. The next level up may have a delegation of $/€/£50,000. Finally, for major decisions, a Director or Board authority might be required, say for investment or programmes with a value above $/€/£1,000,000.

But how much Risk can a manager accept? 

What is missing from the picture is the Delegated Risk Authority to accept a residual risk position. All risks have an inherent level of risk and potential impact. We implement controls to reduce or manage the risks resulting in our residual or "net" risk position. Yet our residual risk position may not represent a level of risk that is acceptable to the entity within the bounds of the entity's Risk Appetite. 

Where the residual risk is above the acceptable level, either additional controls or mitigation needs to be put in place, or the residual level of risk needs to be "accepted" (which logically would alter the Risk Appetite for that particular risk). 

The question is: who has the authority to accept that residual level of risk?

My recommendation is that companies put in place a Delegation of Risk Acceptance (DRA) that mirrors their Risk Assessment levels. As most companies use, for better or worse, a Likelihood x Impact grid, that grid provides us with an example for the Delegation of Risk Acceptance.

When a ‘risk’ is accepted, this indicates that there is agreement that no additional actions or controls will be put in place to further reduce either the impact or the likelihood of the risk.

For example, an entity may have assessed the risk of a System Failure as "High Likelihood / High Impact" pre-remediation of any kind. Controls in the form of effective governance over IT systems may have brought the assessed residual level of risk down to "Medium / Medium". However, the Risk Appetite may have been stated by the Board to be "Medium (Likelihood) / Low (Impact)".

In this case, there is a disconnect between the residual risk position and the Risk Appetite, and either the residual risk must be "accepted" or additional controls must be put in place.

The "solution" is the Delegations of Risk Acceptance.

For each risk (as per the Risk Appetite and/or grid) there should be an identified level of authority to accept a residual risk position. For example, a residual risk level of High/High should only be "accepted" by the Board, while a Low/Low residual risk position may be "accepted" by a manager.

In this case, the DRA may state that residual risk positions that are "Medium" (in likelihood or impact) require acceptance at the Cxx level. In which case, for this example, the CIO should be required to "accept" the residual "Medium / Medium" position, based on an assessment of the cost and effort to bring the residual risk to the Risk Appetite level of "Medium / Low".
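A Delegation of Risk Acceptance table of this kind can be expressed as a simple lookup. A hypothetical sketch in Python; the grid levels and role names mirror the worked example above but are otherwise invented:

```python
# Hypothetical Delegation of Risk Acceptance (DRA) table: which role may
# "accept" a residual risk position, keyed by the Likelihood x Impact grid.
# Role names and the grid itself are illustrative only.

LEVELS = ("Low", "Medium", "High")

DRA = {
    "High":   "Board",
    "Medium": "CxO",
    "Low":    "Manager",
}

def required_acceptor(likelihood: str, impact: str) -> str:
    """The higher of the two scores drives the required level of authority."""
    worst = max(likelihood, impact, key=LEVELS.index)
    return DRA[worst]

# The worked example above: residual Medium/Medium, appetite Medium/Low.
residual = ("Medium", "Medium")
appetite = ("Medium", "Low")
if residual != appetite:
    print(f"Acceptance required by: {required_acceptor(*residual)}")
else:
    print("Residual equals appetite: no acceptance needed")
```

For the Medium/Medium residual position this sketch asks for CxO-level acceptance, matching the CIO example; a High score on either axis would escalate straight to the Board.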

The key to the Delegation of Risk Acceptance is that it is linked to the difference between the actual residual risk scoring and the Risk Appetite. Where there is no difference, and the residual risks score equals the Risk Appetite, there is no need to "accept" the risk.

Has this been implemented?

Yes, though with mixed success. As with all issues of Risk Management, the quality of Board, Director and Senior Management buy-in is critical. Communication is required, and an understanding of the risk and control environment, both internal and external.

When used effectively, the DRA can ensure that risk acceptance is being taken at the right levels, or additional investment is authorised to bring the residual risk situation into line with the Risk Appetite. I have seen this accomplished, and the risk environment has been demonstrably improved.

Likewise this provides Internal Audit with an effective tool to communicate and encourage the implementation of effective controls. On the one hand, IA "empowers" the auditee to perform their risk assessment and then gain the required investment or reallocation of resources to resolve the audit issue; on the other, management with sufficient DRA is able to confirm that the risk as identified by IA has been accepted at an appropriate level.