In mathematical statistics, detecting changes in parameters of real-life data series, known as change-point problems, is crucial. Originating in quality control during the 1950s, these problems have widespread applications today, spanning fields like economics, finance, medicine, and geology. In finance, fluctuations in asset returns can violate assumptions of constant variance, leading to inaccurate forecasts.
Chapter 2 briefly discusses the univariate case, focusing on detecting changes in the mean and variance parameters over time. Cumulative sum (CUSUM) test statistics, derived from likelihood ratios, serve as change-point estimators. However, the complexity of their asymptotic distributions and their slow convergence limit their applicability for small sample sizes. Nevertheless, the asymptotic quantiles help determine whether a change has occurred.
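As a rough numerical sketch of the CUSUM idea for a change in mean (not the thesis's exact statistic; the normalization and the naive overall variance estimator here are simplifications), one can compare the maximal absolute partial sum of centered observations against a threshold:

```python
import numpy as np

def cusum_mean_statistic(x):
    """Max-type CUSUM statistic for a single change in mean:
    max_k |S_k| / (sigma_hat * sqrt(n)), with S_k the partial sums
    of the centered observations."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    s = np.cumsum(x - x.mean())        # partial sums of centered data
    sigma = x.std(ddof=1)              # naive overall variance estimate
    return np.max(np.abs(s[:-1])) / (sigma * np.sqrt(n))

rng = np.random.default_rng(0)
no_change = rng.normal(0.0, 1.0, 200)
with_change = np.concatenate([rng.normal(0.0, 1.0, 100),
                              rng.normal(2.0, 1.0, 100)])
print(cusum_mean_statistic(no_change), cusum_mean_statistic(with_change))
```

A mean shift inflates the partial sums around the change-point, so the statistic computed on the shifted series is markedly larger than on the homogeneous one.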
Chapter 3 extends this analysis to the multivariate case, specifically addressing changes in covariance matrices. Estimating the covariance matrix is challenging, particularly in scenarios with many variables and few observations. Shrinkage estimators, such as the Ledoit-Wolf (LW) estimator, improve on the sample covariance matrix, especially for small sample sizes. Applying the Rao-Blackwell theorem leads to the Rao-Blackwellized Ledoit-Wolf (RBLW) estimator, which enhances performance under Gaussian assumptions.
A simulation study in Chapter 5 demonstrates the effectiveness of these shrinkage estimators for detecting change-points, resulting in improved test power and accuracy. However, because the resulting test statistics lack a known asymptotic distribution, quantiles must be obtained by simulation.
Table of Contents
Introduction
1 Basics
1.1 Some probability theory
1.1.1 Elementary concepts
1.1.2 Stochastic processes
1.2 Some mathematical statistics
1.2.1 Properties of estimators
1.2.2 Test theory
1.3 Some linear algebra
2 Changes in univariate data
2.1 Change in mean
2.1.1 Testing problem
2.1.2 Log likelihood approach
2.1.3 Asymptotic distribution
2.1.4 Appropriate variance estimators
2.2 Change in variance
2.2.1 Testing problem
2.2.2 Log likelihood approach
2.2.3 Critical values
2.2.4 General approach
3 Changes in multivariate data
3.1 Introduction
3.2 The sample covariance matrix
3.3 Log likelihood approach
3.3.1 Preliminary work
3.3.2 LRT for full rank matrices
3.3.3 LRT for singular matrices
3.4 Conclusion
4 The Shrinkage Estimator
4.1 Introduction
4.2 Asymptotic framework
4.3 Some estimators of the covariance matrix
4.3.1 Haff estimator
4.3.2 Stein-Haff estimator
4.3.3 Minimax estimator
4.4 Bias-variance trade-off
4.5 Eigenvalue dispersion
4.6 Optimal linear shrinkage
4.7 Analysis under general asymptotics
4.7.1 The behavior of the sample covariance matrix
4.7.2 Consistency of U*
4.8 Invertibility and condition number
4.9 The shrinkage estimator under Gaussian assumption
4.9.1 The RBLW estimator
4.9.2 The OAS estimator
4.10 Conclusion
5 Simulation Study
5.1 The shrinkage estimators in comparison
5.2 Basics of the simulation study
5.3 Change-point detection using the sample covariance matrix
5.3.1 Variance shifts
5.3.2 Covariance shifts
5.3.3 Variance-covariance shifts
5.4 Change-point detection using the shrinkage estimators
5.4.1 The test statistic
5.4.2 Variance shifts
5.4.3 Covariance shifts
5.4.4 Variance-covariance shifts
5.5 Location of the change-points
Objectives & Topics
This thesis aims to develop and analyze mathematical tools for detecting change-points in the covariance matrices of multivariate time series, particularly in high-dimensional settings where the sample covariance matrix performs poorly. It investigates whether utilizing shrinkage estimators instead of the standard sample covariance matrix can enhance the power of statistical change-point tests.
- Statistical change-point detection in univariate and multivariate datasets.
- Analysis of the sample covariance matrix and its limitations in high-dimensional, small-sample scenarios.
- Implementation and comparison of shrinkage estimators (LW, RBLW, OAS) to improve covariance estimation.
- Evaluation of test statistic performance through extensive simulation studies regarding power and change-point localization.
Excerpt from the Book
4.1 Introduction
In the previous chapter, we have already introduced a tool to detect changes in the covariance matrix of a process. The difficulty was to estimate the unknown true covariance matrix Σ. When the matrix dimension p is large compared to the sample size n, the usual estimator, the sample covariance matrix S, is extremely unstable because of the large number of unknowns involved. Moreover, when p > k − 1, S1 is no longer invertible. In other words, one gets zero eigenvalues, which make the considered statistic unbounded. The way out was to consider only the r1 = min(p, k − 1) positive eigenvalues and to take their product. Our aim now is to obtain p positive eigenvalues, i.e. a positive definite matrix, which is always invertible. Furthermore, when the ratio p/n is less than one but not negligible, S is invertible but numerically ill-conditioned; more specifically, this means that inverting it amplifies estimation error. The larger p becomes, the harder it is to find enough observations to make p/n negligible. Therefore, it seems favorable to also develop a well-conditioned estimator. In this context, we will take a closer look at the condition number of a matrix.
The estimator described in this chapter has these desired properties asymptotically. It is called the shrinkage estimator. The idea is to take a weighted average of a structured estimator, called the shrinkage target, and the sample covariance matrix; in other words, one shrinks towards a structured matrix. The weight, or shrinkage intensity, labeled α, controls how much the sample covariance matrix is shrunk towards the target. The rationale is that the sample covariance matrix has too little structure, and the cure is to impose some structure on the estimator. In general, the shrinkage target should fulfill two requirements: it involves only a small number of free parameters, which means that it has a lot of structure, and it reflects important characteristics of the unknown quantity being estimated (nevertheless, we will not pay attention to the latter requirement, as stated next).
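The weighted average just described can be illustrated with a simplified, numpy-only Ledoit-Wolf-type estimator that shrinks towards a scaled identity target. The intensity formula below follows the well-known Ledoit-Wolf recipe and is a sketch under that assumption, not necessarily the exact estimator analyzed in the thesis:

```python
import numpy as np

def lw_shrinkage(X):
    """Ledoit-Wolf-style shrinkage towards a scaled identity target.

    Returns (Sigma_hat, alpha) with
    Sigma_hat = alpha * mu * I + (1 - alpha) * S,  alpha in [0, 1].
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n                      # sample covariance (MLE scaling)
    mu = np.trace(S) / p                   # target scale: average variance
    d2 = np.linalg.norm(S - mu * np.eye(p), 'fro') ** 2
    b2 = np.mean([np.linalg.norm(np.outer(x, x) - S, 'fro') ** 2
                  for x in Xc]) / n        # estimated error of S
    alpha = min(b2 / d2, 1.0)              # shrinkage intensity
    return alpha * mu * np.eye(p) + (1 - alpha) * S, alpha

rng = np.random.default_rng(1)
X = rng.normal(size=(15, 30))              # n = 15 < p = 30: S is singular
Sigma_hat, alpha = lw_shrinkage(X)
eigs = np.linalg.eigvalsh(Sigma_hat)
print(alpha, eigs.min())
```

Because the target contributes α·μ to every eigenvalue, the combined estimator is positive definite, and hence invertible, even though S itself is singular here.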
Summary of Chapters
1 Basics: Provides foundational definitions in probability theory, stochastic processes, mathematical statistics, and linear algebra required for the subsequent analysis.
2 Changes in univariate data: Discusses hypothesis testing for detecting shifts in the mean or variance of univariate time series, primarily employing likelihood ratio test statistics.
3 Changes in multivariate data: Extends the change-point analysis to multivariate settings, introducing the likelihood ratio test for covariance matrix shifts and addressing rank-deficiency issues.
4 The Shrinkage Estimator: Introduces and analyzes various shrinkage estimators, including LW, RBLW, and OAS, to overcome the instability of the sample covariance matrix in high dimensions.
5 Simulation Study: Presents empirical results comparing the performance of standard and shrinkage-based test statistics in terms of power and accuracy of change-point location detection.
Keywords
Change-point detection, Covariance matrix, Shrinkage estimator, High-dimensional statistics, Likelihood ratio test, LW estimator, RBLW estimator, OAS estimator, Asymptotic behavior, Covariance structure, Mean squared error, Eigenvalue dispersion, Invertibility, Multivariate normal distribution, Simulation study
Frequently Asked Questions
What is the core focus of this thesis?
The thesis focuses on detecting change-points in the covariance matrices of multivariate data, particularly when the data dimensions are large relative to the sample size.
What are the primary challenges addressed?
It addresses the instability of the sample covariance matrix, such as non-invertibility and high estimation error, which occur in high-dimensional or small-sample datasets.
What is the main objective of using shrinkage estimators?
The goal is to replace the unstable sample covariance matrix with a well-conditioned, always invertible shrinkage estimator to improve the statistical power of testing for change-points.
Which mathematical methods are employed?
The thesis uses hypothesis testing, specifically likelihood ratio tests, alongside linear shrinkage techniques and Monte Carlo simulation for validating performance.
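Since critical values are obtained by simulation, the general Monte Carlo recipe can be sketched as follows; the statistic, sample size, and replication count here are placeholders, not the thesis's actual settings:

```python
import numpy as np

def simulated_quantile(stat_fn, sample_fn, level=0.95, reps=2000, seed=0):
    """Approximate the `level`-quantile of a test statistic under H0:
    draw H0 samples, evaluate the statistic on each, and take the
    empirical quantile of the simulated values."""
    rng = np.random.default_rng(seed)
    stats = [stat_fn(sample_fn(rng)) for _ in range(reps)]
    return np.quantile(stats, level)

# toy example: scaled maximal absolute partial sum of n = 50 standard normals
crit = simulated_quantile(
    stat_fn=lambda x: np.max(np.abs(np.cumsum(x))) / np.sqrt(len(x)),
    sample_fn=lambda rng: rng.normal(size=50),
)
print(crit)
```

The test then rejects the no-change hypothesis whenever the observed statistic exceeds the simulated critical value `crit`.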
What is covered in the simulation study?
The simulation study examines the power and empirical levels of tests when applied to various covariance structures and compares the accuracy of detecting change-point locations.
Which specific shrinkage estimators are analyzed?
The thesis evaluates the Ledoit-Wolf (LW) estimator, the Rao-Blackwell Ledoit-Wolf (RBLW) estimator, and the Oracle-Approximating Shrinkage (OAS) estimator.
Why are standard sample covariance matrices problematic for change-point detection?
They become singular or rank-deficient when the dimension p is large relative to the sample size, leading to zero eigenvalues and unbounded test statistics that cannot reliably identify change-points.
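This rank deficiency is easy to verify numerically: with n observations in p > n dimensions, the centered sample covariance matrix has at most n − 1 positive eigenvalues (a small illustrative check, not taken from the thesis):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 10, 25                       # fewer observations than dimensions
X = rng.normal(size=(n, p))
Xc = X - X.mean(axis=0)
S = Xc.T @ Xc / (n - 1)             # sample covariance, rank <= n - 1
eigs = np.linalg.eigvalsh(S)
print(int(np.sum(eigs > 1e-10)))    # number of positive eigenvalues
```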
How does the OAS estimator compare to others in the study?
The simulation results consistently show that the OAS estimator yields higher power and better localization accuracy, especially in small-sample data scenarios.
- Quote paper
- Mounir Zahnouni (Author), 2012, Shrinkage for Stabilizing the Detection of Changepoints in Covariances for High-Dimensional Data, Munich, GRIN Verlag, https://www.grin.com/document/1463620