05 July 2018

Risk Quantification and Physics Envy

There is no question that the quantification of the potential negative (and positive) impact of the actualisation of a risk can help to clarify thinking, and support activities required to manage the risk. Yet quantification can also be a waste of time, and can be used to create rubbish results that masquerade as science. Poor quantification and simulation are easily as bad as or worse than poor subjective assessments.

My own view is that, in anything other than major engineering companies and financial services behemoths, we fool ourselves into thinking that we can actually quantify most risks. We cannot. My rationale is that the vast majority of risks are subjective in nature, and while we can agree sets of assumptions and scenarios, these are limited by our imaginations and by what others within a group consider "realistic".

Once the assumptions have been agreed and documented, then it is possible to perform Monte-Carlo or other simulations against those assumptions. We then take the results and call it quantitative and "scientific" when it was not; it was a set of potential outcomes based on subjectively assessed and probably flawed assumptions. And in too many cases, if the results are not acceptable in terms of an organisation's internal appetite for change, the assumptions are tweaked until the resulting MC simulations provide results within the bounds of the appetite (stated or unstated).
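To make the point concrete, here is a minimal Monte-Carlo sketch in Python. Everything in it is hypothetical and chosen purely for illustration: the function name `simulate_annual_loss`, the monthly event probability, and the cost figures are my assumptions, not data from any real assessment. It shows how the "quantitative" output moves when one subjective assumption is quietly changed.

```python
import random
import statistics


def simulate_annual_loss(prob_event, mean_cost, sd_cost, trials=20_000, seed=1):
    """Simulate yearly losses from a risk that may occur once in any month.

    prob_event, mean_cost and sd_cost are the subjective assumptions;
    the simulation only propagates them, it does not validate them.
    """
    rng = random.Random(seed)
    losses = []
    for _ in range(trials):
        loss = 0.0
        for _month in range(12):
            if rng.random() < prob_event:
                # Cost of one event, drawn from a normal distribution
                # and floored at zero.
                loss += max(0.0, rng.gauss(mean_cost, sd_cost))
        losses.append(loss)
    losses.sort()
    return {
        "mean": statistics.fmean(losses),
        "p95": losses[int(0.95 * trials)],
    }


# Hypothetical assumption agreed in the workshop: 10% chance per month.
original = simulate_annual_loss(prob_event=0.10, mean_cost=100_000, sd_cost=30_000)

# The same model after the assumption is "tweaked" to 5% per month.
tweaked = simulate_annual_loss(prob_event=0.05, mean_cost=100_000, sd_cost=30_000)

print(f"original: mean {original['mean']:,.0f}  95th percentile {original['p95']:,.0f}")
print(f"tweaked:  mean {tweaked['mean']:,.0f}  95th percentile {tweaked['p95']:,.0f}")
```

The maths is identical in both runs. Only the assumed probability changed, and nobody outside the room can tell which of the two figures was the honest one.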

This is not to suggest that there are no cases for such modelling, or that it has no value. The ability to provide and discuss potential outcomes is a major step forward, which supports decision making.

The danger is in the assumptions and scenarios, not the maths.

Bad assumptions will deliver poor results from scenarios. And in almost all cases, assumptions are not backed up by “facts”, but are subjective in nature.

Going back many years, I worked as a Mainframe Systems Capacity Planner for a very large (three letter name) computer manufacturer. We built logical models of customers’ computer systems, to project future performance and to help the sales team prepare proposals. The ability to accurately project future degradation in performance based on growth scenarios was critical to convincing the customers that “now” was the time to either purchase upgrades or at least budget for those upgrades.

The models would provide estimates of when in the future the existing platform would no longer support the workload. We could quite closely model expected system response times, network latency, and ultimately the number of users and the impact those users would have on utilisation across the system.

Almost all systems can cope with a steady increase in utilisation, until they cannot, and they hit a “J-curve” or “hockey-stick” event. As utilisation grows, system load increases, and response times begin to creep out. This seems linear, with sub-second response time slowly growing to 1 second, then 1.1 seconds, and then to 1.2 seconds. But when the “J-curve” is hit, the very quick response times jump to 10 seconds, and then 30 seconds. The point at which the system “fell over” depended on so many factors from memory to processor speed, to disk capacity and paging time, caching in memory versus caching on disk, etc.
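The shape of that curve can be sketched with the textbook single-server queue approximation, where response time is service time divided by (1 - utilisation). This is a deliberate simplification and not the model we used at the time, which accounted for memory, paging, caching and much more. The function name `response_time` and the 0.3 second service time are hypothetical.

```python
def response_time(service_time, utilisation):
    """Approximate response time for a single-server queue.

    Uses R = S / (1 - U): S is the time to serve one request on an
    idle system, U is the fraction of time the server is busy.
    """
    if not 0.0 <= utilisation < 1.0:
        raise ValueError("utilisation must be in the range [0, 1)")
    return service_time / (1.0 - utilisation)


# Hypothetical 0.3 second service time at increasing levels of load.
for u in (0.50, 0.70, 0.80, 0.90, 0.95, 0.99):
    print(f"{u:.0%} utilisation -> {response_time(0.3, u):.2f} s")
```

Response time grows gently through the middle of the range and then climbs steeply as utilisation approaches 100%, which is the "hockey-stick" in its simplest form.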

With one client I spent almost three days calibrating the model to match their system, close to the second decimal point for each "CICS Region" (application or subset of an application today). When it was completed, and we agreed that the model reflected their mainframe, I asked the client "okay, what are the scenarios that we will run?"

"I don't have any now, but I'll go home and make some up, and we can model them tomorrow."

Who gives a damn about the second decimal point when the scenario is nothing other than a wet-finger in the air? If you have no idea where you think the business will be going, or what external factors are going to impact the scenarios, then the results are going to be rubbish. 

Conversely, I had clients who brought with them detailed scenarios with estimated numbers of additional staff and customers, loose estimates of the potential increase in computing power required to support additional functionality in existing applications, and new applications that were in the pipeline, with estimates of the number of users and computations.

It was remarkable that, as a general observation, the more detailed the future scenarios, the less obsessed clients were with the perfection of the model. While 2 decimal places was the target, a model of their system to 1 decimal place was usually adequate, as it gave them more time to run their scenarios, and variations on their scenarios. Their interest was in the estimated break-points in the future, in terms of their mainframe’s and network’s ability to cope with the scenarios.

Quantification is an important tool for planning and scenario modelling. Quantification of potential costs is critical when dealing with big numbers, and with quantifiable numbers. Monte-Carlo simulations provide one tool for estimating ranges of potential impacts. Such modelling of an investment portfolio is critical for enterprises that rely on solvency capital and investments. Scenarios in these cases provide internal comfort that investment and asset risk is being managed, and provide comfort to regulators that the business is solvent and should, under any but far-fetched scenarios, remain solvent.

But quantification of some operational risks and activities is meaningless without a detailed understanding of the risk, real data and “big numbers”, and without realistic and well-considered scenarios. The first half of the equation is the “model” – how well do we actually understand the risk and potential impacts. The second half is the “scenarios” – how well have we thought out future scenarios and potential events, and key internal and external influencers.

If the model is perfect but the scenarios are facile, the outputs will be useless. If the model is shaky and not well understood, it does not matter how good the scenarios are; the results will be equally useless.
