Project Review: Detecting Fraudulent Transfers with ML in Real-time
Last updated:
- Figuring out what "entity" to model
- Figuring out how to connect this to the underlying infrastructure
- Dealing with the extra latency
- Non-obvious features
- 2nd order effects
The objective of this project was to upgrade the then-current anti-fraud rule with a real-time ML model.
Figuring out what "entity" to model
At first we looked for transfer-out entities in the databases and set out to build a training dataset from them.
It soon became obvious that this would introduce data leakage: if the model is supposed to intercept a transfer-out, it has to act before the transfer-out is created, so it can only rely on data that exists at that point.
Fortunately, the underlying system already had an entity representing something that may or may not become a transfer-out: a transfer-out-request.
This is an example of how Event Sourcing is critical to ML-enabled systems.
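To make the leakage point concrete, here is a minimal sketch of point-in-time feature building: only events recorded strictly before the transfer-out-request are counted. The `Event` type, the `kind` values, and the 7-day window are assumptions made for illustration, not the project's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Iterable


@dataclass(frozen=True)
class Event:
    # Hypothetical event record; field names are illustrative only.
    account_id: str
    kind: str
    at: datetime


def features_at_request(
    events: Iterable[Event],
    account_id: str,
    requested_at: datetime,
    window: timedelta = timedelta(days=7),
) -> Dict[str, int]:
    """Count an account's events per kind inside the window that ends
    at the request time. Events at or after the request are ignored,
    so nothing created by the transfer itself can leak in."""
    counts: Dict[str, int] = {}
    for event in events:
        if event.account_id != account_id:
            continue
        if requested_at - window <= event.at < requested_at:
            counts[event.kind] = counts.get(event.kind, 0) + 1
    return counts
```

With an event log, the same function can rebuild training rows for past requests exactly as the model would have seen them at scoring time.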
Figuring out how to connect this to the underlying infrastructure
We already had a "static" anti-fraud rule, but it was unfortunately being gamed by fraudsters.
We found the code where the old rule was executed and plugged in a model call alongside it. (It was important to log which of the two was responsible for a denial whenever one happened.)
This let us keep the same "side effect" as the previous rule. That was good because it saved implementation effort, but also because ill-intentioned customers would not notice that a new type of defense was in use.
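A minimal sketch of that integration point, assuming the old rule and the model are both available as plain callables. The names `authorize`, `static_rule`, `model_score`, and the 0.9 threshold are hypothetical; the real code path is not shown in this review.

```python
import logging
from typing import Callable, Optional, Tuple

logger = logging.getLogger("antifraud")


def authorize(
    request: dict,
    static_rule: Callable[[dict], bool],
    model_score: Callable[[dict], float],
    threshold: float = 0.9,  # assumed cut-off, for illustration only
) -> Tuple[bool, Optional[str]]:
    """Return (approved, denied_by). Both denial paths produce the same
    outcome for the caller; only the log records which one fired."""
    if static_rule(request):
        denied_by = "static_rule"
    elif model_score(request) >= threshold:
        denied_by = "model"
    else:
        return True, None
    logger.info(
        "transfer-out-request %s denied by %s", request.get("id"), denied_by
    )
    return False, denied_by
```

In this sketch the model is skipped when the static rule already denies the request, so a denial is attributed to the rule whenever both would have fired. Evaluating both and logging each result separately is an equally valid choice if that attribution matters.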
Dealing with the extra latency
The model, though fast, added some latency to the transfer authorization flow. Scoring every transfer with the model would not be feasible.
We considered only applying the model to transfers above a certain amount threshold. But this would surely be learned by fraudsters (they would just split high-ticket transfers into small amounts so as to fly under the radar).
In the end we mitigated the latency problem by:
- Aggressively timing out real-time features that took too long to be retrieved;
- Removing some features that took too long to be computed (e.g. behavior patterns from the last 180 days) and using shorter-term proxies instead;
- Using a cumulative threshold instead of a per-transfer threshold. This allowed us to reduce the number of model calls without increasing the adversarial attack surface area.
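Two of the mitigations above can be sketched as follows: a gate that triggers scoring on the rolling sum of an account's transfers rather than on a single amount, and a helper that falls back to a default when a feature lookup is too slow. `CumulativeGate`, `fetch_with_timeout`, the 24-hour window, and amounts expressed in integer cents are all assumptions for illustration.

```python
from collections import defaultdict, deque
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from datetime import datetime, timedelta


class CumulativeGate:
    """Decide whether to call the model based on the rolling sum of an
    account's requested amounts, not on any single transfer."""

    def __init__(self, threshold: int, window: timedelta):
        self.threshold = threshold
        self.window = window
        # account_id -> deque of (timestamp, amount), oldest first
        self._history = defaultdict(deque)

    def should_score(self, account_id: str, amount: int, at: datetime) -> bool:
        history = self._history[account_id]
        # Drop requests that have fallen out of the rolling window.
        while history and history[0][0] <= at - self.window:
            history.popleft()
        history.append((at, amount))
        return sum(a for _, a in history) >= self.threshold


_executor = ThreadPoolExecutor(max_workers=4)


def fetch_with_timeout(fetch, timeout_s: float, default=None):
    """Run a feature lookup, returning `default` if it exceeds the budget."""
    future = _executor.submit(fetch)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        return default
```

With a threshold of 1000 and a 24-hour window, three transfers of 400 made within an hour pass a per-transfer check of 1000 individually, but the third one pushes the rolling sum to 1200 and triggers scoring. Splitting a large transfer therefore no longer avoids the model.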
Non-obvious features
has_cents is a good feature in all things fraud where monetary amounts are involved.
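The review does not define has_cents. A plausible reading, assumed here, is a boolean flag that is true when the amount is not a whole number of currency units. A minimal sketch under that assumption:

```python
from decimal import Decimal


def has_cents(amount) -> bool:
    """True when the amount has a non-zero fractional part
    (e.g. 149.90), False for round amounts (e.g. 100 or 100.00)."""
    # Go through str() so that floats such as 149.9 are not expanded
    # into their long binary representation.
    value = Decimal(str(amount))
    return value % 1 != 0
```

For example, `has_cents("149.90")` is true and `has_cents(100)` is false. Which of the two values is associated with fraud is something to learn from the data, not something this sketch assumes.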
2nd order effects
This project created a feedback loop: once a transfer is blocked, its outcome is never observed, so we had no way of knowing whether it was a true positive or a false positive. (Using control groups in fraud problems is not always possible for regulatory reasons.)