While the urgency for early detection of crises is increasing, truly reliable conflict prediction systems are still not in place, despite the emergence of better data sources and the use of state-of-the-art machine learning algorithms in recent years. Researchers still face the rarity of conflict onset events, which makes it difficult for machine learning-based systems to detect crucial escalation or de-escalation signals. As a result, prediction models can be outperformed by naive heuristics, such as the no-change model, which leads to a lack of confidence and thus limited practical usability.
In this thesis, I address this heuristic crisis by developing a fully automated machine learning framework capable of optimizing arbitrary econometric state space ARIMA methods in a completely data-driven manner. With this framework, I compare the predictions of a model portfolio consisting of all 8 possible combinations of a standard ARIMA, a seasonal SARIMA, an ARIMAX model with socio-economic variables, and an ARIMAX model with conflict indicators of neighboring countries as exogenous predictors.
In addition, each model is examined at monthly and quarterly periodicity. By comparing the out-of-sample prediction errors, I find that this approach can beat the no-change heuristic in the country-level, one-year-ahead prediction of the log change of conflict fatalities on all metrics used, including the TADDA score.
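The comparison against the no-change heuristic can be illustrated with a minimal sketch. It assumes the log change is computed with a +1 offset so that zero fatality counts are defined, and it uses made-up fatality counts; the function names are hypothetical and not taken from the thesis. Only MAE and RMSE are shown, because the TADDA definition is not given in this summary.

```python
import math

def log_change(current, future):
    """Log change in fatalities; the +1 offset is an assumption to handle zero counts."""
    return math.log(future + 1) - math.log(current + 1)

def no_change_forecast(n):
    """No-change heuristic: predict a log change of zero for every case."""
    return [0.0] * n

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical fatality counts (now, twelve months later) for three countries.
pairs = [(0, 0), (10, 10), (5, 50)]
actual = [log_change(current, future) for current, future in pairs]

# The baseline is exact for the two stable cases and misses the escalation entirely.
baseline = no_change_forecast(len(actual))
baseline_mae = mae(actual, baseline)
baseline_rmse = rmse(actual, baseline)
```

Because most country-months show little change, a forecast of zero is already close to the truth for most cases, which is why a candidate model has to beat these baseline scores before it can be considered useful.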
Table of Contents
- Abstract
- Lists of Abbreviations and Symbols
- 1 Introduction
- 2 Method
- 2.1 State Space Modelling Approach
- 2.1.1 Linear Gaussian State Space Model
- 2.1.2 ARIMA State Space Model
- 2.1.3 SARIMA State Space Model
- 2.1.4 (S)ARIMAX Regression with (S)ARIMA Errors
- 2.1.5 Kalman Filter
- 2.2 No-Change Baseline Model
- 2.3 Evaluation Metrics
- 2.3.1 TADDA
- 2.3.2 Mean Absolute Error (MAE)
- 2.3.3 Root Mean Square Error (RMSE)
- 3 Data
- 3.1 Armed Conflict Location and Event Data Project (ACLED)
- 3.1.1 Analysis of Conflict Incidence and Country Categorization
- 3.2 International Monetary Fund - World Economic Outlook Database (IMF-WEO)
- 3.3 World Bank World Development Indicators (WB-WDI)
- 3.4 Variable Overview and Missing Data
- 4 Implementation in Python
- 4.1 No-Change Forecaster Class
- 4.2 State Space ARIMA Forecaster Class
- 4.3 Grid Search Class
- 4.4 Automated Model Building Process
- 5 Results
- 5.1 Global Model Performances
- 5.2 Country-Level Model Performances
Objectives and Key Themes
This thesis aims to develop a fully automated machine learning framework for optimizing econometric state space ARIMA models in a data-driven manner. The framework compares the predictions of a model portfolio comprising standard ARIMA, seasonal SARIMA, ARIMAX models with socio-economic variables, and ARIMAX models with conflict indicators of neighboring countries as exogenous predictors. The work addresses the rarity of conflict onset events, which makes escalation and de-escalation signals difficult for machine learning-based systems to detect.
- Development of an automated machine learning framework for optimizing state space ARIMA models
- Comparison of different ARIMA model variations for conflict prediction
- Evaluation of model performance using out-of-sample prediction errors
- Assessment of the ability to outperform naïve heuristics in conflict prediction
- Analysis of country-level model performances
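The abstract counts eight model combinations, each examined at two periodicities. One way to arrive at eight is to treat seasonality and the two sets of exogenous predictors as independent on/off switches (2^3 = 8). This reading is an assumption, as are all names in the sketch below; none of them come from the thesis code.

```python
from itertools import product

# Assumed on/off switches; the names are hypothetical.
SWITCHES = ("seasonal", "socio_economic_exog", "neighbor_conflict_exog")
PERIODICITIES = ("monthly", "quarterly")

def build_portfolio():
    """Enumerate every switch combination at every periodicity."""
    portfolio = []
    for periodicity in PERIODICITIES:
        for flags in product((False, True), repeat=len(SWITCHES)):
            config = dict(zip(SWITCHES, flags))
            config["periodicity"] = periodicity
            portfolio.append(config)
    return portfolio

portfolio = build_portfolio()
monthly = [config for config in portfolio if config["periodicity"] == "monthly"]
```

Under this reading, the configuration with every switch off corresponds to the standard ARIMA, and each configuration would be tuned and scored separately before the out-of-sample errors are compared.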
Chapter Summaries
The Introduction sets the context for the research by discussing the urgency of early conflict detection and the shortcomings of existing prediction systems. Chapter 2 covers the methodology: the state space modeling approach, the specific ARIMA models used, the no-change baseline model, and the evaluation metrics. Chapter 3 describes the data, including the sources, the selection process, and the variable overview. Chapter 4 details the implementation of the machine learning framework in Python, including the forecaster classes, the grid search class, and the automated model building process. Finally, Chapter 5 presents the results, comparing the performance of the different models at both the global and country levels.
Keywords
Conflict prediction, state space ARIMA models, automated machine learning, model portfolio, out-of-sample prediction errors, naïve heuristics, conflict incidence, socio-economic indicators, conflict indicators, TADDA score, ACLED data, IMF World Economic Outlook Database, World Bank World Development Indicators.
- Cite this paper
- Adrian Leon Scholl (Author), 2022, Development of an Automated Conflict Prediction System. State Space ARIMA Methods, Munich, GRIN Verlag, https://www.grin.com/document/1325354