New methods for generating significance levels from multiply-imputed data

Doctoral Thesis / Dissertation, 2010, 103 pages, Grade: summa cum laude

Author: Christine Aust

Mathematics - Statistics

Missing data are a ubiquitous problem in statistical analyses and have become an important research field in applied statistics. A highly useful technique for handling missing values in many settings is multiple imputation, which was first proposed by Rubin (1977, 1978) and extended in Rubin (1987). Due to the ongoing improvement in computing power in the last 10 years, multiple imputation has become a well-known and widely used tool in statistical analyses.

However, obtaining significance levels from multiply-imputed data remains a problem in general, because the application of multiple imputation requires normally distributed or t-distributed complete-data estimators. Today there are basically three methods that extend the suggestions given in Rubin (1987). First, Li, Raghunathan, and Rubin (1991) proposed a procedure in which significance levels are created by computing a modified Wald-test statistic that is then referred to an F-distribution. This procedure is essentially calibrated, and the loss of power due to a finite number of imputations is quite modest in cases likely to occur in practice. However, it requires access to the completed-data estimates and their variance-covariance matrices, which may not be available in practice with standard software. Second, Meng and Rubin (1992) proposed a complete-data two-stage likelihood-ratio-test-based procedure that in large samples is equivalent to the previous one. This procedure requires access to the code for the calculation of the log-likelihood-ratio statistics, which common statistical software does not provide in its standard analysis routines. Third, Li, Meng, Raghunathan, and Rubin (1991) developed an improved version of a method in Rubin (1987) that only requires the chi-square statistics from a usual complete-data Wald-test. This method is only approximately calibrated and has a substantial loss of power compared to the previous two.
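
To make the third approach more concrete, the following is a minimal Python sketch of the chi-square combining rule as it is commonly stated in the multiple-imputation literature for Li, Meng, Raghunathan, and Rubin (1991). The formula is not given in this abstract, so treat it as an assumption of the sketch; the function name `combine_chisq` and its arguments are purely illustrative. The sketch returns the combined statistic and the degrees of freedom of its F reference distribution. Turning these into a p-value would additionally need an F-distribution routine from a statistics library.

```python
import math
from statistics import mean, variance


def combine_chisq(d, k):
    """Combine m complete-data Wald chi-square statistics, each on k df.

    Illustrative sketch (assumed formula, not taken from the abstract):
      r  = (1 + 1/m) * sample variance of sqrt(d_i)
      D2 = (mean(d)/k - (m + 1)/(m - 1) * r) / (1 + r)
      reference distribution: F with k and k**(-3/m) * (m - 1) * (1 + 1/r)**2 df
    """
    m = len(d)
    if m < 2:
        raise ValueError("at least two imputations are needed")
    if k < 1 or any(x < 0 for x in d):
        raise ValueError("k must be >= 1 and all statistics non-negative")

    roots = [math.sqrt(x) for x in d]
    r = (1 + 1 / m) * variance(roots)  # relative increase in variance
    d2 = (mean(d) / k - (m + 1) / (m - 1) * r) / (1 + r)
    # With no between-imputation spread the denominator df is unbounded.
    df2 = math.inf if r == 0 else k ** (-3 / m) * (m - 1) * (1 + 1 / r) ** 2
    return d2, k, df2


if __name__ == "__main__":
    stat, df1, df2 = combine_chisq([1.0, 4.0, 9.0], k=1)
    print(f"D2 = {stat:.4f} on ({df1}, {df2:.3f}) degrees of freedom")
```

Only the m chi-square values and the dimension k are needed as input, which is why this route works with standard software output. Note that the combined statistic can become negative when the statistics vary strongly across imputations.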

In sum, several procedures exist for generating significance levels in general from multiply-imputed data, but none of them is satisfactorily applicable, for the reasons mentioned above. Since many statistical analyses are based on hypothesis tests, especially on the Wald-test in regression analyses, it is very important to find a method that retains the advantages and overcomes the disadvantages of the existing procedures. Developing such a method was the aim of the present thesis.

Excerpt


Table of Contents

1 Introduction

2 Multiple imputation

3 Significance levels from multiply-imputed data

3.1 Significance levels from multiply-imputed data using moment-based statistics and an improved F-reference-distribution

3.2 Significance levels from multiply-imputed data using parameter estimates and likelihood-ratio statistics

3.3 Significance levels from repeated p-values with multiply-imputed data

4 z-transformation procedure for combining repeated p-values

4.1 The new z-transformation procedure

4.2 z-test

4.3 t-test

4.4 Wald-test

5 How to handle the multi-dimensional test problem

5.1 Idea

5.2 Simulation study

5.3 Further problems

6 Small-sample significance levels from repeated p-values using a componentwise-moment-based method

6.1 Small-sample degrees of freedom with multiple imputation

6.2 Significance levels from multiply imputed data with small sample size based on Sd

7 Comparing the four methods for generating significance levels from multiply-imputed data

7.1 Simulation study

7.2 Results

7.2.1 ANOVA

7.2.2 Combination of method and appropriate degrees of freedom

7.2.3 Rejection rates

7.2.4 Conclusions

8 Summary and practical advices

9 Future tasks and outlook

A Derivation of (3.1)-(3.5) from Section 3.1

B Derivation of the degrees of freedom δ and w in the moment-based procedure described in Section 3.1

Objectives and Research Themes

The primary objective of this thesis is to develop a robust statistical method for calculating significance levels from multiply-imputed data that overcomes the limitations of existing procedures. The author seeks to provide a practical and accessible approach that is compatible with standard statistical software while retaining high power and calibration, even in cases with small sample sizes or high-dimensional data.

  • Theoretical evaluation and refinement of multiple imputation combining rules.
  • Development of a new z-transformation procedure for combining p-values.
  • Analysis and resolution of multi-dimensional test problems, particularly for Wald-tests.
  • Introduction of a componentwise-moment-based method to handle small-sample significance levels.
  • Extensive comparative simulation studies across diverse statistical scenarios to establish practical guidelines.

Excerpt from the Book

1 Introduction

Missing data are a ubiquitous problem in statistical analyses and have become an important research field in applied statistics because missing values are frequently encountered in practice, especially in survey data. Many statistical methods have been developed to deal with this issue. Substantial advances in computing power, as well as in theory, in the last 30 years enable applied researchers to use these methods. A highly useful technique for handling missing values in many settings is multiple imputation, which was first proposed by Rubin (1977, 1978) and extended in Rubin (1987). The key idea of multiple imputation is to replace the missing values with more than one, say m, sets of plausible values, thereby generating m completed data sets. Each of these completed data sets is then analyzed using standard complete-data methods. These repeated analyses are combined to create one imputation inference, which correctly takes into account the uncertainty due to missing data. Multiple imputation retains the major advantages and simultaneously overcomes the major disadvantages inherent in single imputation techniques.
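
The combining step described above can be illustrated with a short Python sketch of the standard combining rules for a scalar estimand, as usually stated following Rubin (1987). The excerpt does not spell out the formulas, so the details below are an assumption of this sketch, and the function name `pool_scalar` is hypothetical. The inputs are the m completed-data point estimates and their estimated variances.

```python
import math
from statistics import mean, variance


def pool_scalar(estimates, variances):
    """Pool m completed-data estimates of a scalar quantity.

    Illustrative sketch of the usual combining rules (assumed here):
      q_bar = average of the m point estimates
      u_bar = average within-imputation variance
      b     = between-imputation variance of the point estimates
      t     = u_bar + (1 + 1/m) * b          (total variance)
      df    = (m - 1) * (1 + u_bar / ((1 + 1/m) * b))**2
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need m >= 2 estimates with matching variances")

    q_bar = mean(estimates)
    u_bar = mean(variances)
    b = variance(estimates)  # uses the m - 1 denominator
    t = u_bar + (1 + 1 / m) * b
    # If the estimates do not vary across imputations, the t reference
    # distribution degenerates to a normal one (infinite df).
    df = math.inf if b == 0 else (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2
    return q_bar, t, df


if __name__ == "__main__":
    q_bar, t, df = pool_scalar([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(f"estimate = {q_bar:.3f}, total variance = {t:.4f}, df = {df:.3f}")
```

The pooled estimate divided by the square root of the total variance would then be referred to a t-distribution with `df` degrees of freedom, which is where the requirement of normally or t-distributed complete-data estimators comes from.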

Due to the ongoing improvement in computing power in the last 10 years, multiple imputation has become a well-known and widely used tool in statistical analyses. Multiple imputation routines are now implemented in many statistical software packages. However, there still exists a problem in generally obtaining significance levels from multiply-imputed data, because Rubin’s combining rules (1978)

Summary of Chapters

1 Introduction: Provides an overview of the problem of missing data and the role of multiple imputation in modern statistical analysis.

2 Multiple imputation: Introduces the theoretical foundations, necessary notations, and the established combining rules for multiple imputation.

3 Significance levels from multiply-imputed data: Details existing procedures for generating significance levels, including moment-based and likelihood-ratio-based approaches.

4 z-transformation procedure for combining repeated p-values: Presents a novel z-transformation approach for combining p-values and evaluates its performance on z-tests, t-tests, and Wald-tests.

5 How to handle the multi-dimensional test problem: Discusses the limitations of existing methods in multi-dimensional contexts and explores the challenges related to small sample sizes.

6 Small-sample significance levels from repeated p-values using a componentwise-moment-based method: Proposes an adjusted procedure utilizing componentwise-moment-based calculations for improved inference in small samples.

7 Comparing the four methods for generating significance levels from multiply-imputed data: Conducts an extensive simulation study to evaluate and compare the performance of the various discussed methods.

8 Summary and practical advices: Summarizes the findings and provides actionable recommendations for researchers applying these methods in practice.

9 Future tasks and outlook: Identifies open research problems and suggests potential directions for future statistical development.
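
The chapter summaries above name a z-transformation procedure for combining repeated p-values but do not describe its steps. The Python sketch below shows one plausible way such a procedure can be set up; it is not confirmed to be the thesis's exact method. It assumes upper-tail p-values, maps each one to a standard normal quantile, treats every z-value as having within-imputation variance one, and adds the between-imputation spread in the manner of Rubin's rules. The function name `combine_p_values` is hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def combine_p_values(p_values):
    """Combine m repeated upper-tail p-values via a z-transformation.

    Illustrative sketch under stated assumptions (not the source's formula):
      z_i = Phi^{-1}(1 - p_i)
      t   = 1 + (1 + 1/m) * sample variance of the z_i
      p   = 1 - Phi(mean(z) / sqrt(t))
    """
    m = len(p_values)
    if m < 2:
        raise ValueError("at least two p-values are needed")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")

    std_normal = NormalDist()  # mean 0, standard deviation 1
    z = [std_normal.inv_cdf(1.0 - p) for p in p_values]
    z_bar = mean(z)
    b = variance(z)            # between-imputation variance
    t = 1.0 + (1.0 + 1.0 / m) * b
    return 1.0 - std_normal.cdf(z_bar / math.sqrt(t))


if __name__ == "__main__":
    print(combine_p_values([0.01, 0.09]))
```

When all m p-values coincide, the between-imputation variance is zero and the combined p-value equals the common value. The more the p-values disagree across imputations, the more the combined result is pulled toward 0.5, reflecting the added uncertainty due to missing data.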

Keywords

Multiple Imputation, Missing Data, Significance Levels, Wald-test, z-transformation, p-values, Simulation Study, Small Sample Size, Multi-dimensional Test, Statistical Inference, Combining Rules, Imputation Model, Degrees of Freedom, Applied Statistics, Hypothesis Testing

Frequently Asked Questions

What is the core focus of this research?

This work focuses on solving the challenge of obtaining accurate significance levels from multiply-imputed data, especially when standard combining rules are insufficient or when specific statistical software access is limited.

Which specific statistical test does the author primarily analyze?

The research frequently analyzes the Wald-test due to its ubiquity in regression models, alongside evaluations of z-tests, t-tests, and F-tests.

What is the primary objective or research question?

The main goal is to develop a method that retains the advantages of existing procedures while overcoming their limitations, such as the requirement for variance-covariance matrices, by relying only on standard output from statistical software.

What scientific methods were employed?

The author uses theoretical derivations based on Rubin’s rules and conducts extensive factorial simulation studies in the statistical programming language R to validate the performance of different methods.

What are the key contributions of the main chapters?

The main chapters introduce a z-transformation procedure, address multi-dimensional test issues, and propose a componentwise-moment-based method designed specifically for small sample sizes.

Which keywords best describe this study?

Key terms include Multiple Imputation, Wald-test, p-value combination, simulation analysis, and small-sample degrees of freedom.

How does the componentwise-moment-based method differ from the standard moment-based method?

The componentwise approach is designed to provide better calibration in small-sample settings where the standard moment-based method may break down, by utilizing adjusted degrees of freedom calculated componentwise.

Why are standard Wald-tests problematic with multiply-imputed data?

Standard Wald-tests can produce invalid significance levels if they do not correctly account for the uncertainty introduced by the imputation process and the specific distribution of the test statistics across multiple data sets.

End of excerpt from 103 pages

Details

Title: New methods for generating significance levels from multiply-imputed data
University: Otto-Friedrich-Universität Bamberg (Social Sciences and Economics)
Grade: summa cum laude
Author: Christine Aust
Year of publication: 2010
Pages: 103
Catalog number: V418195
ISBN (eBook): 9783668674196
ISBN (Book): 9783668674202
Language: English
Keywords: multiple imputation, p-value, significance levels, statistics, statistical methods, missing data, multiply-imputed data
Product safety: GRIN Publishing GmbH
Cite this work: Christine Aust (Author), 2010, New methods for generating significance levels from multiply-imputed data, Munich, GRIN Verlag, https://www.grin.com/document/418195