Precision, effectiveness, and the compliance dilemma!

Attending the FinCrime World Forum (virtually) today and listening in to one of the panel sessions, I was reminded how often people confuse system precision with system effectiveness. The confusion is made worse in the world of anti-money laundering (AML) and compliance as the industry lacks a reliable way to measure the true effectiveness of systems.

Precision: Precision is a measure of how efficient a system is. In the world of compliance, it is usually discussed in terms of the false positive rate, although the two are complements: precision is the share of alerts that turn out to be true positives, and the industry's "false positive rate" is the share that do not. In simple terms, precision is measured as follows:

Precision = Bad Actor Alerts / Total Alerts
Precision = True Positives / (True Positives + False Positives)

As an example, if my bank’s AML solution generates 1000 alerts a month in total, and operational teams find that 100 of those alerts relate to bad actors (true positives) and escalate them for reporting or further investigation, then the precision of this system would be 100/1000 or 10%. The remaining 900 alerts, or 90%, are false positives. The system gets the right answer (finds a true positive) on average once for every ten alerts generated (1 in 10).
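The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the function name `precision` and the variable names are my own assumptions, and the figures are the ones from the example above.

```python
def precision(true_positives: int, total_alerts: int) -> float:
    """Share of generated alerts that relate to bad actors."""
    if total_alerts <= 0:
        raise ValueError("total_alerts must be positive")
    return true_positives / total_alerts


# Figures from the example: 1000 alerts a month, 100 of them true positives.
monthly_alerts = 1000
bad_actor_alerts = 100

p = precision(bad_actor_alerts, monthly_alerts)
false_positive_share = 1 - p  # share of alerts raised against legitimate customers

print(f"precision: {p:.0%}, false positive share: {false_positive_share:.0%}")
```

Note that the false positive share is simply one minus the precision, which is why the two terms are so often used interchangeably in compliance discussions.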

Improving precision: This, in theory, is easy! Remove the unwanted alerts generated against legitimate customers (the erroneous false positives) and maintain the same number of alerts generated against the bad guys (true positives). Many vendors are now offering artificial intelligence and machine learning methods that attempt to do this.

In the example, if we can reduce the total alerts generated each month to 500, but still capture the same 100 alerts on the bad guys, then the system precision becomes 1 in 5 or 20%. Great news, the precision of the system has improved!

The system is now more precise, and investigations can be performed more efficiently as there are fewer alerts to review, but the improved precision has done nothing to change the effectiveness of the system. Before optimization, the system generated 100 alerts against bad actors; after optimization, it still generates 100 alerts against these same characters, so the system effectiveness is unchanged. The system is more precise but no more effective.

A subtlety, and an error I have seen a number of times at institutions that should have known better, is that improving precision can often make effectiveness worse!

Improved precision can mean lower effectiveness: A naive team of data scientists might run an algorithm that reduces the alert volume to 250 alerts each month but now catches only 75 of the bad actor alerts (true positives). The precision is now 75/250 or 30%, which means even more efficiency and potential cost savings, but this comes at a penalty: the system is now less effective. Only 75 true positive alerts are detected, and the system is missing 25 bad actor alerts that it would previously have caught. So be careful!
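To see the three scenarios side by side, here is a minimal sketch that tabulates precision against the number of true positives caught. The `Scenario` class and the scenario labels are hypothetical names introduced for illustration; the numbers are the ones used in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    total_alerts: int
    true_positives: int

    @property
    def precision(self) -> float:
        return self.true_positives / self.total_alerts


scenarios = [
    Scenario("baseline", 1000, 100),
    Scenario("fewer false positives", 500, 100),
    Scenario("over-tuned", 250, 75),
]

baseline = scenarios[0]
for s in scenarios:
    # True positives the baseline caught that this configuration no longer does.
    missed = baseline.true_positives - s.true_positives
    print(f"{s.name:>22}: precision {s.precision:.0%}, "
          f"true positives {s.true_positives}, missed vs baseline {missed}")
```

The last row is the one to watch: precision is at its highest (30%) while 25 previously detected true positives have been lost.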

Now that we’ve discussed how system precision can be measured and what it means for efficiency and false positive rates, we can turn to the more difficult issue of measuring effectiveness. This is where it gets tricky!

Effectiveness: System effectiveness is the number of accurate bad actor alerts (true positives) generated by a system as a proportion of the complete set of bad actor alerts that should have been detected. Statisticians call this measure recall. The formula can be expressed as:

Effectiveness = Bad Actor Alerts Generated / All Bad Actor Alerts That Should Have Been Generated
Effectiveness = True Positives / (True Positives + False Negatives)

Here’s where we run into the big issue, the one that is at the crux of all compliance debates. People talk endlessly about the need to measure system effectiveness, but to know how effective a system is we also need to know how many bad actors are operating at our bank, so that we can see how many we need to detect! There is a circularity here: if we knew who these bad actors were, we would not need to detect them! It is only once we know the total number of bad actors that we can actually assess whether our AML system is 100% or 0.01% effective.

Returning to our example, if we find 100 bad actor alerts each month and there are only 100 bad actors active at our institution, then our system could be 100% effective. But if there are a million bad actors abusing the institution, our effectiveness rate would be only 0.01%.
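A short sketch makes the dependence on the unknown explicit. It assumes, as the example does, that one alert corresponds to one bad actor; the function name `effectiveness` and the list of assumed totals are illustrative choices, not measured figures.

```python
def effectiveness(true_positives: int, false_negatives: int) -> float:
    """True positives as a share of everything that should have been detected."""
    return true_positives / (true_positives + false_negatives)


detected = 100  # bad actor alerts found each month, from the example

# The totals below are assumptions: in practice this number is unknown.
for assumed_total in (100, 1_000, 1_000_000):
    missed = assumed_total - detected  # false negatives under this assumption
    print(f"assumed bad actors: {assumed_total:>9,} -> "
          f"effectiveness {effectiveness(detected, missed):.2%}")
```

The detected count never changes between rows. Only the assumed denominator does, and that single unknown moves the answer from 100% to 0.01%.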

“Without knowing the unknown it is impossible to accurately assess what we do know.”

The Compliance Officer’s Dilemma

Compliance officer’s dilemma: This leads us to the compliance officer’s dilemma which, to paraphrase Donald Rumsfeld, is that without knowing the unknown we cannot accurately assess what we do know. Or to put it another way, without knowing about all the bad actor cases that our systems should have detected it is impossible to get an accurate measure of overall system effectiveness.

In practice, you can use trade-off graphs and other styles of analysis to get estimates of system effectiveness. These work in the way a gold prospector would, and look at rates of return of detection as you dig deeper into the pile of potential alerts that could be generated. Even with these approaches, it is still impossible to know all the unknowns.
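One way to picture the prospector's approach is to rank potential alerts by risk score, review them in batches from the top down, and watch how the hit rate falls as you dig deeper. The sketch below is an assumption about how such an analysis might look, not a description of any particular vendor's method: the function `yield_by_depth` is hypothetical and the outcomes are invented for illustration.

```python
def yield_by_depth(outcomes, batch_size):
    """Summarise confirmed hits batch by batch.

    outcomes: list of bools ordered from highest to lowest risk score,
              where True means the review confirmed a bad actor.
    Returns a list of (alerts_reviewed, cumulative_true_positives, batch_hit_rate).
    """
    rows = []
    cumulative = 0
    for start in range(0, len(outcomes), batch_size):
        batch = outcomes[start:start + batch_size]
        hits = sum(batch)
        cumulative += hits
        rows.append((start + len(batch), cumulative, hits / len(batch)))
    return rows


# Invented review outcomes for 20 ranked alerts (not real data).
ranked_outcomes = [True, True, False, True, False,
                   True, False, False, False, True,
                   False, False, False, True, False,
                   False, False, False, False, False]

for reviewed, found, hit_rate in yield_by_depth(ranked_outcomes, batch_size=5):
    print(f"reviewed {reviewed:>2}: cumulative true positives {found}, "
          f"hit rate in this batch {hit_rate:.0%}")
```

A hit rate that declines towards zero suggests diminishing returns from digging further, but it says nothing about bad actors the scoring never ranked in the first place, which is why the unknowns remain unknown.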

Two takeaways …

First, next time you are asked how effective your AML transaction monitoring solution is, perhaps you should give the real answer, “it is impossible to know”, and then qualify it with the evidence you have for why your teams look at the number of alerts that they do and the trade-offs that this represents.

Second, as an industry, we should focus on relative measures of effectiveness and look towards the incremental improvement of these over time. You may not know the absolute end goal of the effectiveness of your systems and processes, but the incremental improvement over time means that wherever that goal is you will be moving in the right direction.

Finally, if you have found this interesting you might also like my article on the challenges of non-verifiable judgements and why fast feedback loops are essential to improve the performance of compliance (and other) systems.