STATISTICS FOR BIG DATA | Teaching Seminars | Presentations and Meetings
Fabio Forte - ING (NL)
Anomalies detection in credit risk data: an approach based on the Isolation Forest
Multimedia Room 12 (Building D1)
Fabio Forte, PhD
COORF - Management Reporting & Analytics
Product Owner - Regulatory Projects & Ad-hoc requests
Product Owner - Standardise & Innovate Reporting
ING, Amsterdam (NL)
Abstract
The starting point of the presentation is the definition of risk as the chance of an unexpected or negative outcome.
After a brief introduction to most of the risk categories considered by banks and regulators, the presentation focuses on credit risk models, an area in which the entire financial system is investing heavily to avoid another financial crisis.
Among the credit risk metrics, Risk-Weighted Assets (RWAs) can be considered an important measure in the current credit risk environment. Indeed, they represent an aggregate measure of the different risk factors affecting the evaluation of financial products.
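To make the idea of an aggregate measure concrete, the sketch below computes RWAs in their simplest form, as the sum of each exposure multiplied by its risk weight. The portfolio figures and the function name `risk_weighted_assets` are hypothetical and chosen only for illustration; in practice the risk weights themselves may be derived from further risk factors.

```python
# Hypothetical portfolio: (exposure in EUR, risk weight) per position.
# All figures are illustrative assumptions, not real data.
portfolio = [
    (1_000_000, 0.20),
    (500_000, 0.50),
    (250_000, 1.00),
]


def risk_weighted_assets(positions):
    """Return the sum of exposure times risk weight over all positions."""
    return sum(exposure * weight for exposure, weight in positions)


rwa = risk_weighted_assets(portfolio)
print(f"Total exposure: {sum(e for e, _ in portfolio):,.0f}")
print(f"RWA:            {rwa:,.0f}")
```

A data error in a single exposure or risk weight feeds directly into the total, which is one reason input data quality matters for this metric.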
The accuracy of a credit risk model, like that of any model, depends not only on the effectiveness, parametrization and complexity of the model, but also on the data used as input. This is often summarized as "garbage in, garbage out".
In this domain, several machine learning techniques for detecting data anomalies are introduced, with a focus on Local Outlier Factor (LOF) and Isolation Forest.
These algorithms were first tested on an artificial sample to show their statistical properties and were then applied to a real credit risk dataset, where anomalies in the RWA data were analyzed.
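As a rough illustration of the first testing step, the following is a minimal from-scratch sketch of an Isolation Forest scored on an artificial sample. It is not the speaker's implementation: the synthetic data, the parameter values (`n_trees`, `sample_size`, `seed`) and all function names are assumptions made for this example, and it assumes at least two data points.

```python
import math
import random


def avg_path_length(n):
    """Average path length of an unsuccessful binary search tree lookup on n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split points on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # no split possible on this feature
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, x, depth=0):
    """Depth at which x reaches a leaf, adjusted for the leaf's unsplit points."""
    if "split" not in tree:
        return depth + avg_path_length(tree["size"])
    branch = "left" if x[tree["dim"]] < tree["split"] else "right"
    return path_length(tree[branch], x, depth + 1)


def anomaly_scores(data, n_trees=100, sample_size=256, seed=0):
    """Score each point in (0, 1); values closer to 1 are more anomalous."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    norm = avg_path_length(sample_size)
    scores = []
    for x in data:
        mean_depth = sum(path_length(t, x) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / norm))
    return scores


# Artificial sample: 300 standard Gaussian points plus 3 planted anomalies.
gen = random.Random(42)
data = [(gen.gauss(0, 1), gen.gauss(0, 1)) for _ in range(300)]
data += [(8.0, 8.0), (-9.0, 7.0), (10.0, -10.0)]  # indices 300, 301, 302

scores = anomaly_scores(data)
top3 = sorted(range(len(data)), key=lambda i: scores[i], reverse=True)[:3]
print("Indices with the highest anomaly scores:", top3)
```

The design idea is that anomalous points are separated from the rest by fewer random splits, so a short average path length across many trees translates into a high score. For real datasets, an established library implementation would normally be preferable to this sketch.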