The global framework for anti-money laundering (AML) is regulated by the Financial Action Task Force, which requires that banks monitor and report suspicious transactions [1]. In practice, monitoring is done with electronic AML systems. These often rely on simple business rules that raise alerts for investigation by human bank officers, who either (i) dismiss the alerts or (ii) report them to national authorities. Most authorities offer little guidance on AML systems, leaving banks to develop them on their own.

Complicating matters, there exist no real public data sets with AML bank data [2]. This makes it hard to compare systems and assess their effectiveness, efficiency, and robustness. It also severely limits academic research on open AML problems such as class imbalance, concept drift, and interpretability (see our "Usage Notes" section). The lack of public AML bank data sets is not without reason. Bank transactions are highly confidential and can reveal information about sexuality and religious and political affiliations. For financial institutions to publish real data, they would need absolute anonymization guarantees. Unfortunately, the broader scientific literature contains multiple examples of successful de-anonymization attacks [3, 4, 5, 6, 7].

In light of this, we argue that simulated or synthetic data is the best viable option for open AML research. The authors of [8] proposed PaySim, a multi-agent simulator designed to emulate mobile phone transfers. The authors of [9] further proposed AMLSim, augmenting and tailoring PaySim to a more classic bank setting where researchers, in addition to simulated normal transactions, can inject (hypothesized) money laundering patterns. This paper presents SynthAML [10], a synthetic data set to benchmark statistical and machine learning methods for AML.