Analysis and modelling of daily observations is of interest for both academic and practical purposes during the worst public health crisis in decades. In this paper we propose a Boosting-based Quantile Autoregressive Tree (BQART) model to estimate the evolution of reported cases and fatalities of the COVID-19 pandemic. The proposed approach benefits from the boosting methodology and additive quantile regression to overcome the challenges of an unknown probability distribution in the autoregressive variable and location shifts in the observed data. The simple additive structure and binary autoregressive tree representation further improve the interpretability of the model and help to illustrate the results clearly.
The estimated results for the USA and Singapore are discussed in detail, with further results for other countries in the appendix. While the shape and structure of the estimated trees represent the autoregressive properties observed in the data, the model output demonstrates improved accuracy in time series forecasting and analysis. These results should encourage the use of machine-learning-based tree ensembles in time series modelling where both model performance and interpretability are sought.
Table of Contents
1. Overview
2. Methodology Overview
3. Estimated results
4. Conclusion and Discussion
Appendix
A. Rate of change in Total Cases (rC) results for Brazil, Russia, India and UK
B. Rate of change in Fatality Rate (rf) results for Brazil, Russia, India and UK
Research Objectives and Topics
This paper aims to address the challenges of modelling the dynamics of the COVID-19 pandemic by proposing a Boosting-based Quantile Autoregressive Tree (BQART) model. The primary research goal is to accurately estimate the evolution of reported cases and fatality rates in time series data while overcoming issues related to unknown probability distributions, non-linearity, and location shifts.
- Application of boosting methodology to time series forecasting.
- Utilization of Quantile Autoregression to capture non-linear relationships.
- Implementation of Autoregressive Tree models for improved interpretability.
- Comparative analysis of COVID-19 epidemic trends across the world's most affected countries.
Excerpt from the Book
Quantile Autoregressive Tree
The Quantile Autoregressive Tree (QART) is a special case of the general Autoregressive Tree (ART) in which quantile estimates, rather than probabilistic coefficients, are used to split the branches, and the leaves of the tree are QAR estimators. Meek et al. [11] proposed and applied the ART approach to model the evolution of the values of a univariate time series from a data mining perspective.
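For orientation, the QAR estimator that sits in each leaf can be written in the standard QAR(p) form from the quantile regression literature. The excerpt does not show the paper's own equations, so the notation here (the quantile level \tau, the lag order p, and the coefficients \theta_i) is an assumption and not necessarily the author's.

```latex
% Conditional tau-quantile of y_t given its p most recent lags
Q_{y_t}\left(\tau \mid y_{t-1}, \dots, y_{t-p}\right)
  = \theta_0(\tau) + \sum_{i=1}^{p} \theta_i(\tau)\, y_{t-i},
  \qquad \tau \in (0, 1)

% Coefficients are estimated by minimising the check (pinball) loss
\hat{\theta}(\tau)
  = \arg\min_{\theta} \sum_{t=p+1}^{T}
    \rho_\tau\!\left( y_t - \theta_0 - \sum_{i=1}^{p} \theta_i\, y_{t-i} \right),
  \qquad \rho_\tau(u) = u \left( \tau - \mathbf{1}\{u < 0\} \right)
```

Because the coefficients depend on \tau, different quantiles of the series can respond differently to the same lagged values, which is what allows the model to avoid a fixed distributional assumption.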
Traditional decision-tree-based approaches struggle to capture serial dependence because of their i.i.d. assumptions. Autoregressive trees instead link observations through lagged variables to track the trend. This is particularly useful for the ongoing pandemic, when a single mass gathering event could lead to a short period of increase in new cases.
In this paper, the QART model is an additive linear representation of the base QAR models; the branches are binary piecewise functions defined on each additional autoregressive variable. The shape and structure of the estimated trees help to interpret the data and improve the quality of the forecast through straightforward tracking of the quantile location.
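A binary piecewise function on a lagged variable can be sketched as a single split fitted under the quantile (pinball) loss. This is a simplified illustration and not the paper's estimator: the leaves here are constant empirical quantiles instead of full QAR fits, and the names `pinball_loss`, `empirical_quantile`, `fit_quantile_stump` and `predict_stump` are hypothetical.

```python
import math


def pinball_loss(residual, tau):
    """Quantile (check) loss for one residual y - prediction."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def empirical_quantile(values, tau):
    """Order statistic that minimises the summed pinball loss at level tau."""
    ordered = sorted(values)
    k = max(math.ceil(tau * len(ordered)) - 1, 0)
    return ordered[k]


def fit_quantile_stump(series, tau, lag=1):
    """Fit one binary split on y_{t-lag}; each leaf is a constant tau-quantile.

    Assumption: constant leaves stand in for the QAR leaves of the paper.
    Returns None when the lagged values offer no valid split point.
    """
    pairs = [(series[t - lag], series[t]) for t in range(lag, len(series))]
    # Drop the largest lagged value so the right-hand leaf is never empty.
    candidates = sorted({x for x, _ in pairs})[:-1]
    best = None
    for threshold in candidates:
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        q_left = empirical_quantile(left, tau)
        q_right = empirical_quantile(right, tau)
        loss = sum(pinball_loss(y - q_left, tau) for y in left)
        loss += sum(pinball_loss(y - q_right, tau) for y in right)
        if best is None or loss < best["loss"]:
            best = {"lag": lag, "threshold": threshold,
                    "left": q_left, "right": q_right, "loss": loss}
    return best


def predict_stump(stump, history):
    """Predict the next tau-quantile from the most recent observations."""
    x = history[-stump["lag"]]
    return stump["left"] if x <= stump["threshold"] else stump["right"]


# Toy series that alternates between a low and a high regime.
toy_series = [1, 1, 1, 5, 5, 5, 1, 1, 5, 5]
toy_stump = fit_quantile_stump(toy_series, tau=0.5, lag=1)
```

Summing several such piecewise functions, each fitted to what the previous ones leave unexplained, is the general idea behind a boosted additive model, although the paper's exact update rule is not shown in this excerpt.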
Let J denote the height of the tree. Figure 1 shows an example decision tree with J = 6. The split criterion is a comparison of the target variable and the factor under consideration at that particular height level of the tree.
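The tree structure described here can be sketched as a small data type in which each internal node compares one lagged value with a threshold and each leaf holds linear QAR-style coefficients. The class names `Leaf` and `Split`, the helper functions, and all numeric values are hypothetical and chosen only for illustration; they are not taken from Figure 1.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass
class Leaf:
    """Linear autoregressive predictor for one fixed quantile level."""
    intercept: float
    coefs: Sequence[float]  # coefs[i] multiplies y_{t-(i+1)}

    def predict(self, history):
        lags = [history[-(i + 1)] for i in range(len(self.coefs))]
        return self.intercept + sum(c * y for c, y in zip(self.coefs, lags))


@dataclass
class Split:
    """Binary split comparing the value at a given lag with a threshold."""
    lag: int
    threshold: float
    low: "Tree"   # branch taken when y_{t-lag} <= threshold
    high: "Tree"  # branch taken when y_{t-lag} > threshold


Tree = Union[Leaf, Split]


def predict(tree, history):
    """Walk from the root to a leaf, then apply that leaf's linear model."""
    while isinstance(tree, Split):
        value = history[-tree.lag]
        tree = tree.low if value <= tree.threshold else tree.high
    return tree.predict(history)


def height(tree):
    """Number of split levels on the longest root-to-leaf path (J)."""
    if isinstance(tree, Leaf):
        return 0
    return 1 + max(height(tree.low), height(tree.high))


# Hand-built example with J = 2: level 1 splits on lag 1, level 2 on lag 2.
example_tree = Split(
    lag=1, threshold=0.0,
    low=Leaf(intercept=0.0, coefs=[0.5]),
    high=Split(
        lag=2, threshold=0.0,
        low=Leaf(intercept=0.1, coefs=[0.2]),
        high=Leaf(intercept=0.2, coefs=[0.8, 0.1]),
    ),
)
```

Reading a forecast off such a tree amounts to following one comparison per level, which is why the shape of the fitted tree is directly interpretable.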
Summary of Chapters
1. Overview: Introduces the methodology for estimating epidemic trends using boosting-based decision trees and outlines the data sources and analytical targets.
2. Methodology Overview: Details the theoretical framework, including Quantile Autoregression and the structure of the Autoregressive Tree, used to model pandemic dynamics.
3. Estimated results: Presents and discusses the model performance, focusing on the USA and Singapore as case studies to demonstrate tree structures and forecast accuracy.
4. Conclusion and Discussion: Summarizes the effectiveness of the BQART method in handling non-linear epidemiological data and suggests future directions for model improvement.
Appendix: Provides extensive supplementary visual data, including pruned trees and distribution results for the top affected countries.
Keywords
Ensemble Learning, Boosting, Quantile Autoregression, Quantile Autoregressive Tree, Time Series Analysis, COVID-19, Machine Learning, Forecasting, Epidemiological Modeling, Autocorrelation, Stationarity, Model Interpretability.
Frequently Asked Questions
What is the primary focus of this research paper?
The paper focuses on developing a new machine learning approach, the Boosting-based Quantile Autoregressive Tree (BQART) model, to analyze and forecast daily COVID-19 pandemic data.
What are the central thematic areas?
The central themes include time series forecasting, non-linear dynamics, ensemble learning, and the application of statistical methods to public health data during a global crisis.
What is the core objective of the model?
The objective is to accurately estimate the rate of change in total confirmed cases and fatality rates by bypassing restrictive probabilistic assumptions found in traditional models.
Which scientific methodology is employed?
The study employs a combination of boosting algorithms, Quantile Autoregression (QAR), and Autoregressive Tree (ART) structures to perform multi-step non-linear time series forecasting.
What topics are covered in the main body?
The main body covers the data processing, the mathematical formulation of the BQART estimator, model specification, and an in-depth empirical evaluation using data from the most heavily affected countries.
Which keywords best characterize this work?
The work is characterized by terms like Ensemble Learning, Boosting, Quantile Autoregressive Tree, Time Series Analysis, and COVID-19.
How does the BQART model differ from traditional autoregressive models?
Unlike traditional models, BQART does not require prior assumptions about the probability distribution of the variables, making it more robust for evolving situations like an ongoing pandemic.
What role do the pruned trees play in the results?
The pruned trees serve as visual diagnostic tools that help interpret the model's structure and show how specific lagged observations influence the current epidemic trajectory.
- Yang Liu (Author), 2020, A Boosting-based Quantile Autoregressive Tree Model for the COVID-19 Time Series, Munich, GRIN Verlag, https://www.grin.com/document/923166