Chapter 3 Big Data Analytics for Credit Card Fraud Detection Using Supervised Machine Learning Models
Purpose: This chapter examines machine learning (ML) models for predicting credit card fraud (CCF).
Need for the study: As technology advances, the world increasingly relies on credit cards rather than cash in daily life. This creates many new opportunities for fraudsters to abuse these cards. As of December 2020, global card losses reached $28.65 billion, up 2.9% from $27.85 billion in 2018, according to the Nilson 2019 research. To protect credit card users, the card issuer should offer a service that shields customers from potential risks. CCF has become a severe threat as online shopping has grown. To this end, further studies in automatic and real-time fraud detection are required. The most recent studies employ a variety of ML algorithms and techniques, chosen for their advantageous properties, to construct a well-fitting model for detecting fraudulent transactions. Because credit card risk data are large and high-dimensional, feature selection (FS) is critical for improving classification accuracy and fraud detection.
Methodology/design/approach: This chapter constructs a new model for credit card fraud detection (CCFD) that uses principal component analysis (PCA) for FS and supervised ML techniques such as K-nearest neighbour (KNN), ridge classifier, gradient boosting, quadratic discriminant analysis, AdaBoost, and random forest to classify transactions as fraudulent or legitimate. Compared with earlier experiments, the proposed approach demonstrates a high capacity for detecting fraudulent transactions. More precisely, the model's robustness comes from using PCA to determine the most useful predictive features. The experimental analysis was performed on the German credit card and Taiwan credit card data sets.
Findings: KNN was the best-performing model on the German data set, with an accuracy of 96.29%, recall of 100%, and precision of 96.29%. The ridge classifier was the best-performing model on the Taiwan credit data, with an accuracy of 81.75%, recall of 34.89%, and precision of 66.61%.
Practical implications: The poor performance of the models on the Taiwan data indicated that it is an imbalanced credit card data set. A comparison of the proposed models with state-of-the-art credit card ML models showed that the results were competitive.
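The approach summarised in the abstract (PCA followed by several supervised classifiers, compared on accuracy, recall, and precision) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: it assumes scikit-learn, uses a synthetic imbalanced data set in place of the German and Taiwan data, and the scaling step, the number of components, and all hyperparameters are assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import RidgeClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in data (assumption: NOT the German or
# Taiwan credit data sets used in the chapter).
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=8,
    weights=[0.8, 0.2], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The six classifier families named in the abstract, with assumed settings.
classifiers = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Ridge classifier": RidgeClassifier(),
    "Gradient boosting": GradientBoostingClassifier(random_state=0),
    "QDA": QuadraticDiscriminantAnalysis(),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}


def evaluate(classifier, n_components=10):
    """Fit scaler -> PCA -> classifier and score it on the held-out split."""
    # n_components=10 is an arbitrary illustrative choice.
    model = make_pipeline(
        StandardScaler(), PCA(n_components=n_components), classifier
    )
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "precision": precision_score(y_test, y_pred, zero_division=0),
    }


results = {name: evaluate(clf) for name, clf in classifiers.items()}
for name, scores in results.items():
    print(f"{name:18s} " + "  ".join(f"{k}={v:.3f}" for k, v in scores.items()))
```

Reporting recall and precision next to accuracy matters on imbalanced data, because a model can score well on accuracy while still missing many minority-class cases. The numbers printed by this sketch come from synthetic data and are not comparable to the figures reported in the chapter.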
Year of publication: | 2022 |
---|---|
Authors: | Saheed, Yakub Kayode ; Baba, Usman Ahmad ; Raji, Mustafa Ayobami |
Published in: | Big data analytics in the insurance market. - Bingley, U.K. : Emerald Publishing Limited, ISBN 978-1-80262-639-1. - 2022, p. 31-56 |
Subject: | Kreditkarte ; Credit card ; Künstliche Intelligenz ; Artificial intelligence ; Data Mining ; Data mining ; Big Data ; Big data ; Betrug ; Fraud |
Similar items by subject
- Application and use of Big Data and Artificial Intelligence (AI) in fraud detection and avoidance
  Halbouni, Sawsan Saadi (2020)
- Approaches for identifying U.S. medicare fraud in provider claims data
  Herland, Matthew (2020)
- Efficient big data model selection with applications to fraud detection
  Vaughan, Gregory (2020)