This thesis deals with the development of an alpha-quantile estimate based on a surrogate model built with artificial neural networks. Using artificial neural networks as estimators is a nonparametric approach.
Estimating a specific quantile of a data population is a widely used statistical task and an informative way to characterise the relationship among variables. It belongs to the standard tasks of nonparametric regression. The most commonly selected levels are the first, second and third quartile (25, 50 and 75 percent); the quantile level in general is denoted by alpha. A 25 percent quantile, for example, has 25 percent of the data distribution below it and 75 percent above it. Sometimes the tail regions of a population characteristic are of greater interest than the core of the distribution.
Quantile estimation is applied in many different contexts: financial economics, survival analysis and environmental modelling are only a few of them.
Contents
List of Figures
List of Tables
List of Abbreviations
Nomenclature
1 Introduction and Overview
2 Nonparametric regression
3 Nonparametric quantile estimation based on surrogate models
3.1 Introduction to Surrogate Models
3.2 Order Statistics
3.2.1 Asymptotic distribution of a central order statistic
3.3 A general error bound
3.3.1 Theorem
4 Neural Networks
4.1 Introduction
4.2 Biological neural networks
4.3 Historical Background
4.3.1 McCulloch-Pitts Model
4.3.2 Perceptron
4.4 Elements of an artificial neural network
4.5 Definitions
4.5.1 Sigmoid function
4.5.2 Squashing function
4.5.3 Artificial neuron
4.5.4 Feedforward neural network with hidden layers
4.5.5 A recursive definition of multilayered feedforward neural networks
4.6 Approximation characteristics of neural networks
4.6.1 Idea
4.6.2 Lemma 3 (An approximation result)
4.6.3 Lemma 4
5 Implementation
5.1 The Quantile estimates
5.2 Test Settings
5.3 Application on Simulated Data
5.4 Backpropagation algorithm
5.4.1 Gradient descent method
5.4.2 Phases of the Backpropagation algorithm
5.4.3 Training and Test Phase
5.4.4 Implementation
5.4.5 Initialisation of the weights
5.4.6 Structure of the Parameters
5.4.7 Partial Derivatives
5.5 Comparison
5.6 Discussion
6 Conclusion and Outlook
Bibliography
Appendix
Implementation
Order Statistic Estimate
Backpropagation Algorithm
Monte Carlo Estimate
Application on Simulated Data
Definitions
1 Introduction and Overview
Estimating a specific quantile of a data population is a widely used statistical task and an informative way to characterise the relationship among variables. It belongs to the standard tasks of nonparametric regression. The most commonly selected levels are the first, second and third quartile (25, 50 and 75 percent); the quantile level in general is denoted by alpha. A 25 percent quantile, for example, has 25 percent of the data distribution below it and 75 percent above it. Sometimes the tail regions of a population characteristic are of greater interest than the core of the distribution. Quantile estimation is applied in many different contexts: financial economics, survival analysis and environmental modelling are only a few of them.
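To fix the notion informally, the alpha-quantile of a real-valued random variable Y with distribution function F can be written as follows (the symbols F and q_alpha are used here only for illustration and need not match the notation of later chapters):

q_\alpha = \inf\{\, y \in \mathbb{R} : F(y) \ge \alpha \,\}, \qquad \alpha \in (0,1).

For alpha = 0.25 this recovers the first quartile mentioned above.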
This thesis deals with the development of an alpha-quantile estimate based on a surrogate model with the use of artificial neural networks. Using artificial neural networks as an estimate is considered a nonparametric approach.
In reference [14] the following examples of nonparametric quantile estimation can be found:
• A device manufacturer may wish to know what the 10% and 90% quantiles are for some feature of the production process, so as to tailor the process to cover 80% of the devices produced.
• For risk management and regulatory reporting purposes, a bank may need to estimate a lower bound on the changes in the value of its portfolio, which will hold with high probability.
• A pediatrician requires a growth chart for children given their age and perhaps even medical background, to help determine whether medical interventions are required, e.g. while monitoring the progress of a premature infant.
These examples show that there is a wide range of fields in which quantile estimation could be needed.
In this thesis a function m is considered which is to be estimated, under the assumption that it is costly to evaluate. In this context m could be the deterministic outcome of expensive computations for a scientific problem. For a random variable X the quantity of interest is m(X), the output of what is also called a complex system. Ideally one would like to obtain sufficiently many evaluations of m, but this conflicts with their cost. The way out of this costly, time-consuming and uneconomic situation is the construction of a surrogate model, which can be built from a small sample. The aim is to obtain a good approximation m_n of m that is faster and cheaper to evaluate. Then m_n can be used to generate sufficiently large samples. In our context the error introduced by the surrogate model is acceptable, since the benefit of the much larger sample size outweighs it.
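As a rough illustration of this idea, the following minimal MATLAB sketch replaces the costly model m by a cheap toy function and uses a simple polynomial fit as a stand-in surrogate; the thesis itself uses a neural network surrogate, and all names, sample sizes and parameters here are chosen only for illustration.

% Minimal sketch of a surrogate-based quantile estimate (illustration only).
m     = @(x) exp(x) .* sin(5*x);   % toy stand-in for the costly model m
n     = 50;                        % small number of expensive evaluations
Xn    = rand(n, 1);                % design points, X ~ U(0,1)
Yn    = m(Xn);                     % the only costly step
p     = polyfit(Xn, Yn, 5);        % cheap surrogate m_n (polynomial fit)
Nmc   = 1e5;                       % large sample, affordable for the surrogate
Xmc   = rand(Nmc, 1);
Ymc   = polyval(p, Xmc);           % surrogate evaluations m_n(X)
alpha = 0.95;
Ysort = sort(Ymc);                 % order statistics of the surrogate sample
qhat  = Ysort(ceil(alpha * Nmc))   % empirical alpha-quantile of m_n(X)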
To give a brief overview of the chapters: First of all, chapter (2) clarifies what nonparametric regression means. After that, simulation models and surrogate models are described and defined in chapter (3). Furthermore, the order statistic estimate and the Monte Carlo estimate are introduced and distinguished from each other. After that, a general error bound is stated and proved.
Chapter (4) provides the necessary background on artificial neural networks. They are first motivated by their biological model, then the historical background is presented, followed by the elements of an artificial neural network. Artificial neural networks allow the estimation of possibly nonlinear models. All of this provides the basis for stating and proving an approximation result.
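To make the central building block concrete, the following short MATLAB sketch evaluates a feedforward network with one hidden layer of sigmoid neurons; the weights W, w, the biases b, c and the network size are illustrative assumptions, not the notation of the thesis.

% One-hidden-layer feedforward network with sigmoid activation (sketch).
sigm = @(t) 1 ./ (1 + exp(-t));      % sigmoid (squashing) function
k = 5;  d = 1;                       % k hidden neurons, input dimension d
W = randn(k, d);  b = randn(k, 1);   % hidden-layer weights and biases
w = randn(1, k);  c = randn;         % output weights and bias
net = @(x) w * sigm(W * x + b) + c;  % network output for a column input x
y = net(0.3)                         % evaluate at a sample point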
The implementation of an order statistic estimate and a Monte Carlo surrogate estimate based on neural networks is described in chapter (5). The finite sample size behaviour of these estimates is examined by applying them to simulated data. The errors of the estimates are compared with each other by means of boxplots as well as their medians and interquartile ranges, and the results are discussed.
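A minimal sketch of such a comparison in MATLAB could look as follows, assuming the Statistics Toolbox is available; errOS and errNN are hypothetical vectors of errors collected over repeated simulation runs and are filled with placeholder values here.

% Comparing error samples of two estimates (placeholder data, illustration only).
errOS = abs(randn(100, 1));          % placeholder for order statistic errors
errNN = abs(randn(100, 1)) * 1.2;    % placeholder for neural network errors
boxplot([errOS errNN], 'Labels', {'order statistic', 'neural network'})
med = [median(errOS) median(errNN)]  % medians of the errors
iq  = [iqr(errOS) iqr(errNN)]        % interquartile ranges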
Finally, a conclusion and a possible outlook are given.
The MATLAB code for the implementation, as well as some definitions that may be useful at various points, is listed in the appendix. Since chapter (5) mainly deals with the results and the MATLAB code is contained in the appendix, the procedure and the way the MATLAB functions work are also explained there.
[...]