The study investigates whether a machine learning algorithm can be used to detect fraud attempts and how a fraud management system based on machine learning might work. For fraud detection, most institutions rely on rule-based systems with manual evaluation. Until recently, these systems performed admirably. However, as fraudsters become more sophisticated, the results of traditional systems are becoming inconsistent.
Fraud usually involves a number of methods that are used repeatedly, which is why fraud detection commonly focuses on finding patterns. Data analysts can, for example, help prevent insurance fraud by developing algorithms that recognize trends and anomalies. AI techniques used to detect fraud include data mining, which classifies, groups, and segments data in order to search through millions of transactions, find patterns, and detect fraud.
The paper discusses machine learning methods for fraud detection, with a case study and an analysis of Kaggle datasets.
Table of Contents
- Introduction
- Objective
- Literature Review
- Related Work
- Machine learning approaches
- Logistic Regression
- Decision Tree
- Working of Decision Tree
- Random Forest
- Support Vector Machines (SVM)
- K-Nearest Neighbours (KNN)
- Gradient Boosted Trees
- Research Method Data Challenges
- Recent Fraud Cases
- CRISP-DM Model
- Business Understanding
- Data Understanding
- Data Preparation
- Data Modelling
- Model evaluation
- Model Deployment
- Methodology and Case Study
- Banking Theory
- Data Description
- Data Preparation
- Scaling the data
- Missing values handling
- Dropping NA
- Data encoding
- Data Visualisation
- Univariate Analysis
- Histograms
- Boxplot
- Bivariate analysis
- Correlation
- Summary from EDA
- Feature Selection
- ANOVA
- Model Comparison and Results
- Logistic Regression
- Decision Tree
- Random Forest
- XGBoost
- GradientBoosting
- LGBMClassifier
- Classification Evaluation Metrics
- Confusion Matrix
- Precision
- Recall
- F1 score
- AUC-ROC
- Receiver operating characteristic (ROC) Curve
- Accuracy
- Imbalanced Data
- Possible next steps
- Summary and Conclusion
Objectives and Key Themes
This master's thesis investigates the application of machine learning techniques to detecting fraudulent banking transactions. The primary objective is to develop and evaluate a robust fraud detection model capable of accurately identifying suspicious transactions in real time.
- Machine learning algorithms for fraud detection
- Data preprocessing and feature engineering for fraud detection
- Performance evaluation of different machine learning models
- Understanding and addressing challenges related to imbalanced data in fraud detection
- Exploring potential future directions for improving fraud detection systems
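One common way to address the imbalanced-data theme listed above is to weight each class inversely to its frequency, so that the rare fraud class counts as much as the legitimate class during training. The sketch below is illustrative only: this summary does not say which imbalance technique the thesis applies, the function name `balanced_class_weights` is hypothetical, and the label counts are made up. The formula is the same heuristic scikit-learn uses for `class_weight="balanced"`.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return weight[c] = n_samples / (n_classes * count[c]) for each class c.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


# Made-up example: 1,000 transactions, 10 of them fraudulent (label 1).
labels = [0] * 990 + [1] * 10
weights = balanced_class_weights(labels)

# With these weights, both classes contribute the same total weight.
total_weight_legit = weights[0] * 990
total_weight_fraud = weights[1] * 10
```

Here each fraudulent transaction receives a weight of 50, while each legitimate one receives roughly 0.5, so the two classes carry equal total weight.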
Chapter Summaries
The thesis begins with a comprehensive introduction, defining the problem of fraudulent transactions and highlighting the significance of machine learning in this domain. The subsequent chapter delves into the objectives of the study, outlining the research questions and methodologies employed.
Chapter 3 provides a thorough literature review, examining existing research on fraud detection, discussing various machine learning approaches, and highlighting recent fraud cases. It further explores the CRISP-DM model, a structured data mining process that guides the development and deployment of fraud detection systems.
Chapter 4 focuses on the methodology and case study. It discusses banking theory, data description, and detailed data preparation techniques, including data scaling, handling missing values, and encoding categorical features. The chapter further presents a thorough analysis of the data, including univariate and bivariate analysis, feature selection, and model comparison. Finally, it evaluates the performance of various machine learning models using different classification evaluation metrics.
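The classification metrics named in the table of contents (confusion matrix, precision, recall, F1 score, accuracy) can all be derived from four counts. The following is a minimal sketch in plain Python, assuming binary labels where 1 marks fraud; the function names and the toy labels are hypothetical and are not results from the thesis.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, F1 and accuracy from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Made-up labels: 100 transactions, 5 fraudulent (1) and 95 legitimate (0).
y_true = [1] * 5 + [0] * 95
# Hypothetical model: catches 3 of the 5 frauds and raises 2 false alarms.
y_pred = [1, 1, 1, 0, 0] + [1, 1] + [0] * 93
# Baseline that never flags anything as fraud.
y_never = [0] * 100

model_report = binary_metrics(y_true, y_pred)
baseline_report = binary_metrics(y_true, y_never)
```

On this toy data the never-flag baseline still reaches 95% accuracy while its recall is zero, which illustrates why accuracy alone is a weak measure on imbalanced fraud data and why precision, recall, and F1 are reported alongside it.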
Keywords
Fraud detection, machine learning, banking transactions, data preprocessing, feature engineering, model evaluation, imbalanced data, CRISP-DM, logistic regression, decision tree, random forest, XGBoost, gradient boosting, LGBM, confusion matrix, precision, recall, F1 score, AUC-ROC, accuracy.
- Quote paper
- Riwaj Kharel (Author), 2022, Machine Learning Approach to Detect Fraudulent Banking Transactions, Munich, GRIN Verlag, https://www.grin.com/document/1275894