Regression analysis is an important statistical tool for many applications. The most frequently used approach to regression analysis is the method of Ordinary Least Squares. But this method is vulnerable to outliers; even a single outlier can spoil the estimation completely. How can this vulnerability be described by theoretical concepts, and are there alternatives? This thesis gives an overview of these concepts and of alternative approaches.
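The claim that a single outlier can spoil an Ordinary Least Squares fit can be illustrated with a minimal sketch. The data below are hypothetical and not taken from the thesis, and `ols_fit` is a helper name introduced only for this illustration.

```python
import numpy as np

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1]

# Hypothetical clean data lying exactly on y = 1 + 2x (an assumption for illustration).
x = np.arange(10, dtype=float)
y = 1.0 + 2.0 * x
clean_intercept, clean_slope = ols_fit(x, y)

# Corrupt a single observation, for example through a misplaced decimal point.
y_bad = y.copy()
y_bad[-1] = 1000.0
bad_intercept, bad_slope = ols_fit(x, y_bad)

print(f"clean data:  intercept={clean_intercept:.2f}, slope={clean_slope:.2f}")
print(f"one outlier: intercept={bad_intercept:.2f}, slope={bad_slope:.2f}")
```

Because the corrupted point sits at the edge of the x-range, it has high leverage and pulls the fitted slope far away from the value that the other nine points agree on.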
The three fundamental approaches to robustness (qualitative, infinitesimal, and quantitative robustness) are introduced in this thesis and applied to different estimators. The estimators under study are measures of location, scale, and regression. The robustness approaches are important not only for the theoretical assessment of estimators but also for the development of alternatives to classical estimators. This thesis focuses on the robustness performance of estimators when outliers occur in the data set. Measures of location and scale provide necessary stepping stones into the topic of regression analysis. In particular, the median and trimming approaches are found to produce very robust results.
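The difference between classical and robust measures of location and scale can be sketched as follows. The sample values, the 10% trimming proportion, and the helper names `trimmed_mean` and `mad` are assumptions made for this illustration; the factor 1.4826 is the usual constant that makes the MAD consistent for the standard deviation under normality.

```python
import numpy as np

def trimmed_mean(values, alpha):
    """Mean after dropping the floor(alpha * n) smallest and largest values."""
    v = np.sort(np.asarray(values, dtype=float))
    k = int(np.floor(alpha * v.size))
    return v[k:v.size - k].mean()

def mad(values):
    """Median absolute deviation about the median, scaled by 1.4826."""
    v = np.asarray(values, dtype=float)
    return 1.4826 * np.median(np.abs(v - np.median(v)))

# Hypothetical measurements scattered around 10.
clean = np.array([9.8, 10.1, 9.9, 10.2, 10.0, 9.7, 10.3, 10.0, 9.9, 10.1])
contaminated = clean.copy()
contaminated[0] = 1000.0  # one gross error

for name, data in [("clean", clean), ("contaminated", contaminated)]:
    print(
        f"{name:>12}: mean={data.mean():.3f}  median={np.median(data):.3f}  "
        f"trimmed={trimmed_mean(data, 0.1):.3f}  "
        f"sd={data.std(ddof=1):.3f}  MAD={mad(data):.3f}"
    )
```

The arithmetic mean and the standard deviation follow the gross error, while the median, the trimmed mean, and the MAD stay close to their values on the clean sample.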
These results are used in regression analysis to find alternatives to the method of Ordinary Least Squares. Its vulnerability can be overcome by applying the methods of Least Median of Squares or Least Trimmed Squares. Different outlier diagnostic tools are introduced to improve the poor efficiency of these regression techniques. Furthermore, this thesis presents a simulation of several regression techniques in different regression situations. This simulation focuses in particular on how the regression estimates change when outliers occur in the data.
Theoretically derived results as well as the results of the simulation lead to the recommendation of the method of Reweighted Least Squares. Applying this method routinely to problems of regression analysis provides outlier-resistant and efficient estimates.
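The idea of combining a high-breakdown initial fit with a subsequent least-squares step can be sketched for simple regression as follows. This is an illustration under stated assumptions, not the procedure of the thesis: the Least Median of Squares fit is only approximated by trying the line through every pair of points, the scale constants 1.4826 and 5/(n - p) and the cutoff 2.5 follow a convention commonly associated with Rousseeuw and Leroy, and the data, the seed, and the function names `lms_line` and `reweighted_ls` are hypothetical.

```python
from itertools import combinations
import numpy as np

def lms_line(x, y):
    """Approximate LMS fit: try the line through every pair of points and
    keep the one with the smallest median squared residual."""
    best = (np.inf, 0.0, 0.0)  # (criterion, intercept, slope)
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line cannot be written as y = a + b*x
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        crit = np.median((y - intercept - slope * x) ** 2)
        if crit < best[0]:
            best = (crit, intercept, slope)
    return best

def reweighted_ls(x, y, cutoff=2.5):
    """OLS on the points whose standardized LMS residual is at most `cutoff`."""
    crit, a0, b0 = lms_line(x, y)
    n, p = len(x), 2
    # Preliminary robust scale estimate derived from the LMS criterion.
    scale = 1.4826 * (1.0 + 5.0 / (n - p)) * np.sqrt(crit)
    keep = np.abs(y - a0 - b0 * x) <= cutoff * scale  # weight 1 or 0
    X = np.column_stack([np.ones(keep.sum()), x[keep]])
    coef, *_ = np.linalg.lstsq(X, y[keep], rcond=None)
    return coef[0], coef[1], keep

# Hypothetical data: y = 1 + 2x plus small noise, with four gross outliers
# placed at the largest x-values.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 20)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.2, size=x.size)
y[-4:] = 60.0

intercept, slope, keep = reweighted_ls(x, y)
print(f"RLS: intercept={intercept:.2f}, slope={slope:.2f}, kept {keep.sum()} of {x.size}")
```

The zero-one weights discard observations that the robust initial fit flags as outlying, and the final least-squares step on the remaining points is what recovers efficiency compared with the initial fit alone.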
Table of Contents
- Introduction
- The Classical Linear Ordinary Least Squares Regression
- Introduction to OLS
- Properties of the Least Squares Estimates
- Problems of OLS
- Outliers and OLS
- Outlier definition and common error sources
- Outliers in Regression Analysis and their influence on OLS results
- The concept of Robustness
- Introduction to Robustness
- Qualitative Robustness
- Infinitesimal Robustness
- Quantitative Robustness
- Robust Estimates
- On asymptotic Results
- Some measures of location and scale - with regard to their robustness properties
- Introduction
- Measures of location
- A Definition
- The Arithmetic Mean
- The Median
- Trimmed mean(s)
- Other measures of location
- Measures of scale
- A Definition
- The Standard deviation
- The Median Absolute Deviation (MAD)
- The t-Quantile Range
- Other scale estimates
- Higher Dimensions
- Robust Regression Techniques
- An Introduction and Definition
- M-Estimates
- The Repeated Median
- The Least Median of Squares Regression
- The Least Trimmed Squares Regression
- The Coakley – Hettmansperger Estimator
- Reweighted Least Squares
- The Multivariate Reweighted Least Squares Approach
- The Hat Matrix
- The Minimum Volume Ellipsoid Estimator
- Other Regression Methods and Limitations
- Conclusions on Robust Regression
- Application to SAS and Simulation
- Introduction to Robustness Application and Simulation purposes
- The initial data set – The zero contamination case
- Seemingly negligible contamination in X-direction
- Seemingly negligible contamination in Y-direction
- High Leverage contamination
- Large overall contamination
- Conclusions
Objectives and Key Themes
This thesis explores robust methods in regression analysis, addressing the vulnerability of the traditional Ordinary Least Squares (OLS) method to outliers. It aims to provide a comprehensive overview of theoretical concepts and alternative approaches to robust regression, examining their performance in the presence of outliers.
- The limitations of OLS in the face of outliers
- Different approaches to robustness: qualitative, infinitesimal, and quantitative
- Robust measures of location and scale: median, trimmed means, MAD, and others
- Robust regression techniques: Least Median of Squares, Least Trimmed Squares, and Reweighted Least Squares
- Evaluation of the performance of robust methods through simulations
Chapter Summaries
- Introduction: Introduces the concept of outliers and their potential impact on regression analysis, highlighting the need for robust methods.
- The Classical Linear Ordinary Least Squares Regression: Provides an overview of OLS, its properties, and limitations, particularly its sensitivity to outliers.
- Outliers and OLS: Defines outliers and discusses their sources, illustrating how outliers can significantly influence OLS results.
- The concept of Robustness: Introduces the concept of robustness, outlining its three main approaches: qualitative, infinitesimal, and quantitative, and their implications for estimator selection.
- Some measures of location and scale - with regard to their robustness properties: Examines the robustness properties of different measures of location (mean, median, trimmed means) and scale (standard deviation, MAD, t-quantile range), laying the groundwork for robust regression.
- Robust Regression Techniques: Explores various robust regression techniques such as M-estimates, Least Median of Squares, Least Trimmed Squares, and Reweighted Least Squares, highlighting their strengths and weaknesses.
- Application to SAS and Simulation: Demonstrates the practical implementation of robust regression methods using SAS software and showcases their performance through simulations in different outlier scenarios.
Keywords
This work focuses on robust methods in regression analysis, specifically addressing the impact of outliers on statistical estimations. Key concepts include robust estimation, outlier detection, qualitative and quantitative robustness, measures of location and scale, M-estimation, Least Median of Squares, Least Trimmed Squares, Reweighted Least Squares, and simulation analysis.
- Cite this paper
- Robert Finger (Author), 2006, Robust Methods in Regression Analysis – Theory and Application, Munich, GRIN Verlag, https://www.grin.com/document/73282