Accident data analysis is an area of major current interest. Analyzing accidents is essential because it can expose the relationships between the different attributes that contribute to an accident. Road, traffic, and airplane accident data differ in nature from other real-world data because accidents are uncertain events. Analyzing diverse accident datasets can reveal how much each of these attributes contributes, and this information can be used to reduce the accident rate. Data mining is now a popular technique for examining accident datasets. In this study, association rule mining and several classification and clustering techniques were applied to datasets of road and traffic accidents and airplane crashes. The results showed improved accuracy and uncovered many hidden circumstances that could help reduce the accident rate in the near future.
Table of Contents
Chapter 1: INTRODUCTION
1.1 Background
1.2 Overview of Data Mining Techniques
1.2.1 Clustering
1.2.2 Classification
1.2.3 Association rule mining
1.3 Challenges in accident
1.4 Objective
1.5 Organization of Thesis
Chapter 2: LITERATURE SURVEY
2.1 Introduction
2.2 Factors responsible for accident
2.3 Traditional Statistical approach for accident analysis
2.4 Data Mining approaches for Accident Analysis
Chapter 3: METHODOLOGY AND DATA COLLECTION
3.1 Introduction
3.2 Proposed Methodology
3.2.1 K-modes clustering
3.2.2 Self-Organizing Map (SOM)
3.2.3 Hierarchical Clustering
3.2.4 Latent Class Clustering (LCC)
3.2.5 BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies)
3.2.6 Support Vector Machine (SVM)
3.2.7 Naïve Bayes (NB)
3.2.8 Decision Tree
3.2.9 Multilayer Perceptron
3.2.10 Association Rule Mining
3.2.11 Cluster Selection Criteria
3.2.12 Accuracy Measurement
3.3 Data Collection
3.3.1 Description of dataset for Result No. 1, 2, and 3
3.3.2 Description of dataset for Result No. 4
3.3.3 Description of dataset for Result No. 5
Chapter 4: ANALYSIS AND RESULTS
4.1 Introduction
4.2 Result No. 1 (Road-user Specific Analysis of Traffic Accident using Data Mining Techniques)
4.2.1 Classification Analysis
4.2.2 Classification followed by clustering of accident
4.2.3 Analysis
4.3 Result No. 2 (Performance Evaluation of Lazy, Decision Tree Classifier and Multilayer Perceptron on Traffic Accident Analysis)
4.3.1 Direct Classification Analysis
4.3.2 Classification followed by clustering techniques
4.3.3 Analysis
4.4 Result No. 3 (A Conjoint Analysis of Road Accident Data using K-modes clustering and Bayesian Networks)
4.4.1 Cluster Analysis
4.4.2 Performance Evaluation of Bayesian Network
4.4.3 Analysis
4.5 Result No. 4 (Augmenting Classifiers Performance through Clustering: A Comparative Study on Road Accident Data)
4.5.1 Cluster Analysis
4.5.2 Classification Analysis
4.5.3 Analysis
4.6 Result No. 5 (Analysis of Airplane crash by utilizing Text Mining Techniques)
4.6.1 Cluster Analysis
4.6.2 Association Rule Mining
4.6.3 Analysis
Chapter 5: CONCLUSION AND RECOMMENDATIONS
Research Objectives and Thematic Focus
The primary objective of this thesis is to improve the accuracy of accident analysis and identify critical factors contributing to traffic and airplane accidents. By applying various data mining techniques (specifically clustering, classification, and association rule mining), the research seeks to reduce accident rates, save lives, and mitigate the economic losses associated with these incidents. The work focuses on overcoming the inherent heterogeneity of accident datasets to extract actionable insights that can inform preventive safety measures.
- Application of clustering techniques (K-modes, SOM, LCC, BIRCH) to segment heterogeneous accident data into homogeneous groups.
- Evaluation of classification models (SVM, Naïve Bayes, Decision Trees, Multilayer Perceptron) to predict accident severity.
- Comparative analysis of classification performance before and after data clustering.
- Utilization of association rule mining to identify patterns and relationships between accident attributes.
- Investigation of airplane crash data using text mining and cluster analysis to determine common causes.
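To make the association rule mining objective above concrete, the following is a minimal sketch of how support, confidence, and lift could be computed for a single rule. The records and attribute values (`night`, `wet_road`, `severe`, and so on) are hypothetical toy data, not taken from the thesis datasets, and the helper functions are illustrative rather than the author's implementation.

```python
# Hypothetical toy records: each set holds the attribute values of one accident.
records = [
    {"night", "wet_road", "severe"},
    {"night", "wet_road", "severe"},
    {"day", "dry_road", "minor"},
    {"night", "dry_road", "minor"},
    {"day", "wet_road", "severe"},
]


def support(itemset, data):
    """Fraction of records that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= rec for rec in data) / len(data)


def confidence(antecedent, consequent, data):
    """Estimate of P(consequent | antecedent) from the records."""
    both = set(antecedent) | set(consequent)
    return support(both, data) / support(antecedent, data)


def lift(antecedent, consequent, data):
    """Confidence of the rule relative to the baseline rate of the consequent."""
    return confidence(antecedent, consequent, data) / support(consequent, data)


# Evaluate the illustrative rule {wet_road} -> {severe}.
rule_support = support({"wet_road", "severe"}, records)
rule_confidence = confidence({"wet_road"}, {"severe"}, records)
rule_lift = lift({"wet_road"}, {"severe"}, records)
print(rule_support, rule_confidence, rule_lift)
```

A lift above 1 suggests that the antecedent and the consequent occur together more often than they would if they were independent, which is the kind of relationship between accident attributes the objective refers to.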
Excerpt from the Book
1.1 Background
Accidents are a major cause of untimely death, property damage, and economic loss around the world. Many people die every year in different types of accidents. Traffic authorities therefore make considerable efforts to reduce accidents, yet the accident rate has shown no real decline over the years analyzed. Accidents are unpredictable and uncertain, so analyzing them requires an understanding of the circumstances that affect them. Data mining [1, 2, 28, 29, 30] has attracted a great deal of attention in the IT industry as well as in the public arena because of the wide availability of vast quantities of data. It is therefore necessary to transform these data into usable knowledge and information, which can then be applied in areas such as marketing, road accident analysis, fraud detection, and so on [8].
Road and traffic accidents are a critical issue all over the world. Reducing the accident rate is the best approach to improving traffic safety. Researchers in many countries have analyzed traffic and road accidents using different data mining approaches. Many of them aimed to reduce the accident rate by identifying the risk factors that particularly influence accidents [3-7].
The transportation system is not by itself responsible for these crashes; several other circumstances also play a part [12, 13]. These circumstances can be grouped into environmental factors, such as weather and temperature; road-specific factors, such as road type, road width, and road shoulder width; human factors, such as wrong-side driving and excessive speed; and other variables. Whenever an accident occurs on any road in the world, some of these circumstances are involved. Moreover, these factors and their impact on accidents are not the same in all countries; they affect accidents in different countries in different ways.
Summary of Chapters
Chapter 1: INTRODUCTION: This chapter introduces the global impact of accidents and outlines the thesis objective of applying data mining to identify critical accident factors.
Chapter 2: LITERATURE SURVEY: This chapter reviews existing statistical and data mining research concerning accident analysis and factors affecting crash severity.
Chapter 3: METHODOLOGY AND DATA COLLECTION: This chapter details the machine learning approaches and datasets used in the study, including clustering and classification techniques.
Chapter 4: ANALYSIS AND RESULTS: This chapter presents the experimental findings across five distinct research segments, demonstrating the improvements in classification accuracy after data clustering.
Chapter 5: CONCLUSION AND RECOMMENDATIONS: This chapter summarizes the findings, confirming that clustering enhances classification performance and offers insights into accident prevention.
Keywords
road and traffic accident, airplane crash, data mining, clustering techniques, classification techniques, association rule mining, accident rate, predictive modeling, machine learning, heterogeneous data, safety analysis
Frequently Asked Questions
What is the core focus of this thesis?
The research focuses on using data mining techniques to analyze heterogeneous accident datasets (road, traffic, and airplane) to identify risk factors and improve prediction accuracy.
What are the primary themes of the work?
The work revolves around the integration of clustering and classification algorithms to reduce data heterogeneity and validate the performance of models in predicting accident severity.
What is the main objective of this research?
The primary objective is to enhance the accuracy of accident analysis models to help authorities identify factors that lead to crashes, thereby facilitating effective preventive measures.
Which scientific methods are employed?
The research employs supervised learning (SVM, Naïve Bayes, Decision Trees) and unsupervised learning (K-modes, LCC, BIRCH, SOM) along with association rule mining.
What is the focus of the main body of the work?
The main body (Chapter 4) provides a detailed comparative analysis of different machine learning classifiers on various datasets, demonstrating that preprocessing with clustering yields better results.
Which keywords characterize this work?
Key terms include data mining, clustering, classification, association rule mining, road accident analysis, and airplane crash investigation.
How does clustering improve the results?
Clustering effectively segments the highly heterogeneous accident data into smaller, more homogeneous groups, which allows classifiers to better identify underlying patterns and improve predictive accuracy.
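The segmentation step described in this answer can be sketched with a small K-modes routine, which groups categorical records by Hamming distance to a per-cluster mode. This is a minimal illustration under stated assumptions: the rows, the attribute order (road user, light condition, road surface), and the choice of initial modes are hypothetical, and the routine is a simplified version of K-modes rather than the implementation used in the thesis.

```python
from collections import Counter


def hamming(a, b):
    """Number of attribute positions where two records differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(rows, initial_modes, max_iter=10):
    """Assign each row to its nearest mode, then recompute modes attribute-wise."""
    modes = [tuple(m) for m in initial_modes]
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(len(modes)), key=lambda j: hamming(row, modes[j]))
            for row in rows
        ]
        new_modes = []
        for j, old_mode in enumerate(modes):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if not members:
                new_modes.append(old_mode)  # keep the mode of an empty cluster
                continue
            # Most frequent value in each attribute column of the cluster.
            new_modes.append(
                tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))
            )
        if new_modes == modes:
            break  # assignments are stable
        modes = new_modes
    return labels, modes


# Hypothetical records: (road_user, light_condition, road_surface).
rows = [
    ("two_wheeler", "night", "wet"),
    ("two_wheeler", "night", "dry"),
    ("two_wheeler", "night", "wet"),
    ("car", "day", "dry"),
    ("car", "day", "dry"),
    ("car", "night", "dry"),
]
labels, modes = k_modes(rows, initial_modes=[rows[0], rows[3]])
print(labels)
print(modes)
```

Each resulting group could then be passed to its own classifier, so that the model for one cluster sees only records that already share most attribute values.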
What unique insight does the study provide regarding airplane crashes?
The study utilizes text mining techniques on airplane accident summary data to identify common terms associated with crashes, linking factors like engine failure, pilot error, and poor weather conditions to crash events.
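As a rough illustration of how common terms might be extracted from crash summaries, the sketch below counts in how many summaries each term appears. The summaries, the stop-word list, and the tokenizer are hypothetical and deliberately simple; the thesis may use a different preprocessing pipeline.

```python
import re
from collections import Counter

# Hypothetical crash summaries, invented for illustration only.
summaries = [
    "Engine failure shortly after takeoff in poor weather.",
    "Pilot error during approach in poor weather.",
    "Engine failure during climb.",
]

# A tiny, assumed stop-word list; real pipelines use much larger ones.
STOPWORDS = {"in", "after", "during", "the", "a", "shortly"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


# Document frequency: the number of summaries in which each term occurs.
doc_freq = Counter()
for summary in summaries:
    doc_freq.update(set(tokenize(summary)))

for term, count in doc_freq.most_common(4):
    print(term, count)
```

Terms with a high document frequency are candidates for the recurring causes mentioned above, and the per-summary token sets could also serve as transactions for association rule mining.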
- Cite this work
- Prayag Tiwari (Author), 2017, Accident Analysis by Using Data Mining Techniques, Munich, GRIN Verlag, https://www.grin.com/document/386836