Excerpt

## Contents

List of Abbreviations

List of Figures

List of Tables

1 Abstract

2 Introduction

3 Main Part

3.1 Volatility Basics

3.1.1 Types of Volatility

3.1.1.1 Historical Volatility (=Simple Volatility)

3.1.1.1.1 Introduction and Data Quality

3.1.1.1.2 Standard Deviation as Measurement of Volatility

3.1.1.1.3 Daily Returns as Basis

3.1.1.1.3.1 Discrete Returns

3.1.1.1.3.2 Constant (Log)-Returns

3.1.1.1.3.2.1 The Math behind It

3.1.1.1.3.2.2 From Log Returns and the Central Limit Theorem

3.1.1.1.3.2.3 Brownian Motion and Random Walk

3.1.1.1.4 Study of Time Series and Empirical Distributions

3.1.1.1.4.1 Augmented Dickey-Fuller Test for Stationarity

3.1.1.1.4.2 Kurtosis Analysis

3.1.1.1.4.3 Skewness Analysis

3.1.1.1.4.4 Jarque-Bera Test and Analysis Summary

3.1.1.1.5 Explanatory Power of Historical Volatilities

3.1.1.1.6 Alternative Approaches

3.1.1.1.6.1 Exponentially Weighted Moving Average

3.1.1.1.6.2 Generalized Autoregressive Conditional Heteroscedasticity

3.1.1.2 Implied Volatility

3.1.1.2.1 From the Binomial Tree to the Black-Scholes Model

3.1.1.2.2 The Basics of Options

3.1.1.2.3 The Black-Scholes Model

3.1.1.2.3.1 Introduction to the Black-Scholes Model

3.1.1.2.3.1.1 Model Assumptions

3.1.1.2.3.1.2 The Black-Scholes Formula

3.1.1.2.3.1.3 The Greeks

3.1.1.2.3.1.3.1 Delta (Δ)

3.1.1.2.3.1.3.1.1 Delta-Volatility Dynamics

3.1.1.2.3.1.3.1.2 Gamma (Γ)

3.1.1.2.3.1.3.2 Theta (Θ)

3.1.1.2.3.1.3.3 Rho (ρ)

3.1.1.2.3.1.3.4 Vega (ν)

3.1.1.2.3.1.4 Discussion of the Model

3.1.1.2.3.1.4.1 Volatility Smiles

3.1.1.2.3.1.4.2 Explanations for the Existence of Volatility Smiles

3.1.1.2.3.1.5 Modifications of the Black-Scholes Model

3.1.1.2.4 Calculation of Implied Volatility using the Black-Scholes Formula

3.1.1.2.5 Explanatory Power of Implied Volatilities vs. Historical Volatility

3.1.2 Chapter Summary

3.2 The Chicago Board Options Exchange (CBOE) Volatility Index (VIX)

3.2.1 History of the VIX

3.2.2 Importance of the VIX

3.2.3 The VIX in Detail

3.2.3.1 Methodology of Calculating the VIX

3.2.3.2 VIX Futures & Options

3.2.3.3 Average Daily Trading Volumes since 2006

3.2.3.4 VIX Dynamics

3.2.3.4.1 Correlation VIX vs. S&P500

3.2.3.4.2 Historical Volatility S&P500 vs. VIX and Variance Risk Premiums

3.2.3.4.3 Volatility of Volatility

3.2.3.4.4 Volume and Volatility

3.2.3.4.5 Distribution of Volatility

3.2.3.4.6 Asymmetric Volatility Phenomenon

3.2.4 Chapter Summary

3.3 Volatility Related Financial Instruments

3.3.1 Options based Strategies on Volatility

3.3.1.1 Butterfly Spread

3.3.1.2 Straddle

3.3.2 Variance Swaps

3.3.3 Volatility Related Exchange Traded Products

3.3.4 Volatility Related Exchange Traded Notes & Exchange Traded Funds

3.3.4.1 Introduction to VIX ETPs

3.3.4.1.1 Structure of ETPs in General

3.3.4.1.2 The S&P500 VIX Short-Term Futures TR Index

3.3.5 Chapter Summary

3.4 Case Study - Implementation of Volatility ETPs for Volatility Hedging

3.4.1 Construction and Purpose of the Study

3.4.2 Spread, Liquidity and Cost Analysis

3.4.2.1 Spread Analysis

3.4.2.2 Volume Analysis

3.4.2.3 Cost Analysis

3.4.2.4 Differences between ETNs and ETFs

3.4.3 Behavior of VXX and the VIX Futures Term Structure

3.4.3.1 Contango/Backwardation Induced VXX Performance (VIX as Benchmark)

3.4.3.2 Front-Term VIX Futures ETPs vs. Mid-Term VIX Futures ETPs

3.4.4 Hedge-Efficiency Analysis

3.4.4.1 Defining the Hedging Objective

3.4.4.2 Defining the Strategy and Necessary Assumptions

3.4.4.3 Considerations regarding Correlation Dynamics

3.4.4.4 Event Driven Hedge-Efficiency Analysis

3.4.4.4.1 Regression Analysis October 2014

3.4.4.4.2 Regression Analysis December 2014

3.4.4.5 Mean Reversion of Volatility

3.4.4.6 Overall Portfolio Impact of VXX/VIX Portions

3.4.4.6.1 S&P500 + VXX/VIX Portfolio Analysis

3.4.4.6.2 Interpretation of the Findings and Optimization of VXX/VIX Portions

3.4.5 Study Summary

3.5 VBA Solution for Volatility ETP Analyses

3.5.1 Aim of the Tool

3.5.2 Implementation

3.5.3 User Guide

3.5.4 Potential Drawbacks and Further Development

4 Conclusion

5 Appendix I

6 Appendix II (VBA Code)

7 List of References

## List of Abbreviations

illustration not visible in this excerpt

List of Figures

Figure 3.1.I: Autocorrelation S&P500 daily log returns 1-15 lags

Figure 3.1.II: Density function S&P500 based on daily log returns

Figure 3.1.III: Volatility Surface DAX30 Call Option as of June 2014 (exemplary illustration)

Figure 3.2.I: Average Daily Trading Volume - VIX Options

Figure 3.2.II: Average Daily Trading Volume - VIX Futures

Figure 3.2.III: Daily Closing Prices S&P500 vs. VIX

Figure 3.2.IV: S&P500 Realized Volatility vs. VIX

Figure 3.2.V: Daily Log Returns - S&P500 vs. VIX

Figure 3.2.VI: S&P500 Volume vs. VIX

Figure 3.2.VII: Density function S&P500 30-day Volatility

Figure 3.4.I: Average Spreads

Figure 3.4.II: Average Daily Volumes

Figure 3.4.III: Total Expense Ratios

Figure 3.4.IV: 30-day Volatility VIX vs. VXX

Figure 3.4.V: Trailing 30-day Returns VIX vs. VXX

Figure 3.4.VI: Rolling Correlation S&P500 vs. VXX

Figure 3.4.VII: Regression Analysis VXX vs. S&P500 October 1-15, 2014

Figure 3.4.VIII: Regression Analysis VXX vs. S&P500 December 2-16, 2014

Figure 3.4.IX: Trailing 30-day Average Returns (95/5)

Figure 3.4.X: Trailing 30-day Average Returns (90/10)

Figure 3.4.XI: Long-Term Development of VXX, VXZ and VIXY vs. the VIX

## List of Tables

Table 3.1.I: Dickey-Fuller test significance levels and related critical values

Table 3.1.II: Dickey-Fuller test statistics and related p-values

Table 3.1.III: Chi-square significance levels and critical values of the Jarque-Bera test

Table 3.1.IV: Kurtosis and Skewness analysis summary and Non-Normality Test analysis summary

Table 3.1.V: Implied Volatility structure S&P500 Call (expiry October, 2015)

Table 3.4.I: Analyzed ETP Universe

Table 3.4.II: Performance Comparison VXX vs. VIXY

Table 3.4.III: Short-Term vs. Mid-Term VIX Strategies

Table 3.4.IV: Regression statistics summary S&P500 October 1-15, 2014

Table 3.4.V: Regression statistics overview S&P500 December 2-16, 2014

Table 3.4.VI: Mean Reversion Analysis

Table 3.4.VII: Sharpe Ratios: 06/2014 - 05/2015 (95/5)

Table 3.4.VIII: Sharpe Ratios: 06/2014 - 05/2015 (90/10)

## Acknowledgements

I would like to thank Commerzbank AG and its Flow Trading and ETF Market Making department for kindly granting access to Bloomberg and Reuters terminals and for always being available for guidance and assistance while I was conducting this thesis.

Furthermore, I would like to thank my family, whose support and funding were key to success at every stage of my academic career.

Nuertingen, January 7, 2016

*“This analysis suggests that a more catholic approach should be taken to explaining the behavior of speculative prices.”*

Lawrence H. Summers

## 1 Abstract

Title: “The Application of Volatility related Exchange Traded Products as Instruments for Volatility Hedging”

Theoretical Background: Besides stocks, bonds, cash, commodities or real estate, volatility has emerged as an asset class of its own in recent years. The last major financial crisis and increasing market volatility raised this topic to a new level. Studies have shown that volatility tends to be negatively correlated with stock market returns. In other words, when stock markets plunge, volatility tends to increase, and vice versa. Investors recognized that this relationship can be used in the portfolio management process.

However, it is mostly professional investors who engage in this asset class, in order to hedge their portfolios against expected market drawdowns or to use volatility exposure as an instrument for speculation.

The development of the global financial markets and the introduction of Exchange Traded Products (ETPs) opened the markets to a wide range of investors, even to non-professional investors.

The increasing interest in volatility-linked products led ETP issuers to the introduction of volatility-related products.

Volatility ETPs seek to track a specific index - in many cases a sub-index - that tracks the performance of the *Chicago Board Options Exchange Volatility Index (VIX)*, while providing all the advantages of a globally traded ETP, such as highly liquid markets, market makers all over the world, tight spreads, low costs and assets held as collateral.

These advantages, compared to other volatility strategies via options or listed warrants, make volatility ETPs a very attractive tool for portfolio managers.

Aim of the Paper and Analysis Question: The aim of this paper is to introduce the reader to volatility as an asset class and to volatility ETPs, their construction, price behavior and characteristics. What is more, since the vast majority of research on volatility hedging focuses on option- or futures-based strategies, little research has yet been done on the hedging efficiency and effectiveness of volatility-linked ETPs as hedging instruments.

The analyzed question is therefore: “How efficient are volatility hedges through volatility-linked ETPs, compared to hedges through the VIX?”

Methodology: The paper starts with an introduction to the fundamentals of volatility: the different types of volatility, their calculation and their differences in explanatory power.

Subsequently, the most widely relied-upon volatility index, the VIX, will be introduced. Because the vast majority of volatility-linked financial derivatives use the VIX as their underlying, i.e. as the instrument to be tracked, I will refrain from discussing other volatility indices such as the VSTOXX^{1} index. I will constantly draw connections to the VIX as a benchmark in order to evaluate the efficiency and effectiveness of hedges via VIX ETPs.

After this introduction I familiarize the reader with volatility ETPs and their construction and characteristics.

I follow that with my research results on volatility behavior in general and on the effectiveness of hedging past high-volatility events via volatility-related ETPs.

Getting correct information about the current statistics is crucial when it comes to volatility hedging. The last part of this paper therefore presents my Visual Basic for Applications (VBA) solution that enables the user to run automated statistical analyses on volatility ETPs using the latest market data. The related VBA code can be found in Appendix II.

Research Outcomes: I was able to show that volatility is far from constant and that its correlation to the S&P500 is more volatile than one might expect. This mainly affects short-term hedge quality, because the correlation can increase significantly (which is undesirable from a hedging perspective). Nevertheless, volatility can be applied as a hedge against market drawdowns while generating excess returns.

In the long term, the VIX is the superior hedging tool, because VIX ETPs underperform it due to the futures rolling process; the VIX can bring down overall portfolio risk while increasing portfolio returns at the same time. Due to the complex statistical relationships and volatility dynamics, VIX futures and VIX-related ETNs should be traded only by investors who know and understand the inherent risks.

Literature: My research is based on the basic works on finance and financial theory such as *Théorie de la Spéculation* by Bachelier (1900), *Differential Space* by Wiener (1923) or *The Behavior of Stock-Market Prices* by Fama (1965). It is extended by well-recognized works in quantitative finance such as *The Variation of Certain Speculative Prices* (Mandelbrot, 1963), *Using daily stock returns - The case of event studies* (Brown & Warner, 1985), or *The Distribution of Stock Return Volatility* (Andersen, Bollerslev, Diebold, & Ebens, 2000). The current state of research already covers the dynamics of volatility from a hedging perspective. Related works are: *Volatility Trading* (Sinclair, 2013), *A Theory of Volatility Spreads* (Bakshi & Madan, 2006), *The VIX, the Variance Premium and Stock Market Volatility* (Bekaert & Hoerova, 2014) or *Price and Volatility Dynamics implied by the VIX Term Structure* (Duan & Yeh, 2011).

However, as already indicated, very few researchers have dealt with the effectiveness and efficiency of volatility hedging through VIX ETPs so far. My research is therefore extended by my own empirical studies of market data as well as by numerous publications in financial journals in the area of financial theory. During my research it was interesting to observe the development of financial theory over the last century. When Louis Bachelier introduced his theory of the movement of stock prices in 1900, his research was not recognized. Later on, Norbert Wiener proved Bachelier’s theory and defined the so-called “Wiener process” (1923). Most of the models we use and rely on heavily today are based on their findings.

The most difficult part of the literature research was to determine whether the theories based on these models and my own findings are transferable to the ETP segment and today’s market conditions.

Because most of the research was conducted before the recent financial crisis, I made a point of double-checking the findings against my own calculations based on real market data or against more recent literature.

## 2 Introduction

Uncertainty has always been something people want to get rid of. Uncertainty about the future leads us to buy insurance covering every imaginable risk. Increasing uncertainty about the future can even turn into fear. Fear is an unpleasant feeling, which is why we are willing to pay insurance premiums, or to forgo parts of future returns, in order to reduce the amount of fear we are exposed to.

At the stock market, there is a strong connection between uncertainty about the future and the intensity with which stock prices fluctuate over time. In the narrow sense, the degree of stock price fluctuation is described as a stock’s volatility. In a broader sense, it is nothing more than the fear of market participants facing uncertainty about future outcomes.

During financial or economic crises, volatility is a reliable indicator of the fear that is in the markets. It tends to rise as uncertainty increases and declines as financial crises settle down.

Financial markets find the fair price of an asset by discounting all available information. The price of a company share, for example, depends on the expected future cash flows the company will generate. The higher the uncertainty about these future cash flows, the more volatile the stock price will be. But financial markets can not only find fair prices for stocks, bonds or financial derivatives; they can also put a price tag on the “fear”, or, in other words, on volatility itself.

In this paper I want to analyze stock market volatility and volatility-linked ETPs with regard to their practicability as tools for hedging market drawdowns in high-volatility scenarios.

I will begin with an introduction to the different kinds of volatility, their calculation and their informative value. I will introduce the most widely relied-upon index of stock market volatility, the Volatility Index (VIX) introduced by the Chicago Board Options Exchange (CBOE).

Following this, I will introduce the reader to volatility ETPs, which have gained significant importance in recent years. Volatility ETPs are the first financial instruments to open this market to a wide range of professional as well as non-professional investors all over the world, who can include volatility as an asset in their portfolios or use volatility ETPs as new hedging tools.

Subsequently I will present my research on the effectiveness of volatility ETPs and their implementation as a hedging tool. I will discuss the results under aspects such as applicability, cost and effectiveness.

The most difficult part in designing this research is getting access to correct and reliable data sets. At this point I would like to thank Commerzbank AG and its Flow Trading and Market Making department who kindly granted access to Bloomberg and Reuters terminals and were always available for further support and guidance.

## 3 Main Part

### 3.1 Volatility Basics

Volatility in terms of financial markets is the intensity with which prices of financial assets fluctuate over time. In order to understand the characteristics of volatility and the products that rely on volatility, it is essential to be familiar with the basics of volatility calculation.

The purpose of this chapter is to present the current state of volatility research, the basic models and their shortcomings.

#### 3.1.1 Types of Volatility

Volatility can be calculated using either historical datasets (for example time series of stock prices) or it can be calculated recursively by applying complex mathematical models such as the Black-Scholes model. Both derived volatilities have their advantages and drawbacks which will be discussed in detail in the following.

##### 3.1.1.1 Historical Volatility (=Simple Volatility)

There are several ways of computing historical volatility in the context of stock markets. A well-known way is calculating the “simple” volatility, which basically applies standard statistics to time series of stock returns. However, the recent financial crisis drew attention to the topic of risk and the correct measurement of market risks. Volatility is a key figure when it comes to risk, which explains the emergence of more complex models in recent years. Current research nevertheless relies heavily on simple volatility calculated from log returns. This is why the following sections discuss the area of simple volatility in detail and briefly present more complex models such as the Exponentially Weighted Moving Average (EWMA) model as well as the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) model.

###### 3.1.1.1.1 Introduction and Data Quality

As the name already indicates, historical volatility is derived by analyzing the past. The key factor for a correct and unbiased volatility estimate is the quality of the dataset. When the data contains dividend payments (or corporate actions like stock splits, which lead to adjustments) and the stock reaches the “ex-dividend” day, the stock price is decreased by the amount of the dividend. In an unadjusted time series this looks like a price drop, and a volatility estimate based on such a series will be biased. Therefore, we need to make sure our dataset is adjusted for these events.^{2}
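The bias can be illustrated with a deliberately simplified sketch (real-world adjustment factors are proportional rather than subtractive, and the prices below are hypothetical, not from any dataset used in this thesis):

```python
import math

# Hypothetical closes: the stock goes ex-dividend (2.00 per share) between
# day 2 and day 3, while the economic value is otherwise unchanged.
unadjusted = [100.0, 100.0, 98.0, 98.0]
dividend = 2.0

# Back-adjusting the pre-dividend prices removes the artificial price drop
# (simplified subtractive adjustment for illustration only).
adjusted = [p - dividend for p in unadjusted[:2]] + unadjusted[2:]

ret_unadj = [math.log(b / a) for a, b in zip(unadjusted, unadjusted[1:])]
ret_adj = [math.log(b / a) for a, b in zip(adjusted, adjusted[1:])]
# the unadjusted series shows a spurious ~-2% return on the ex-dividend day,
# which would inflate any volatility estimate based on it
```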

A free and convenient source for those datasets is finance.yahoo.com or if available Bloomberg or Thomson Reuters. If not indicated differently, my calculations and charts, which are presented during this paper, are based on data provided by Thomson Reuters.

However, since some parts of this thesis, especially the developed VBA code, need an instant connection to a data source, I also use finance.yahoo.com as a tertiary source, which is indicated where applicable. My VBA solution that enables the user to connect to the finance.yahoo.com servers and pull data instantly is attached in Appendix II. The associated VBA tool is attached as a macro-enabled Excel file and presented in detail in chapter 3.5.

###### 3.1.1.1.2 Standard Deviation as Measurement of Volatility

The symbol σ (the Greek letter “sigma”) denotes the standard deviation of a time series. In other words, σ measures the amount of dispersion around the mean for a given dataset. This is basically the figure that needs to be calculated for volatility purposes.

In mathematical terms, volatility is the square root of the variance (Sinclair, 2013):

$$\sigma = \sqrt{\sigma^2}$$

Where σ denotes the standard deviation and σ² the variance. Variance is calculated as follows^{3}:

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2$$

Where N denotes the sample size, xᵢ the returns and x̄ the mean return. When looking at a dataset that contains daily close prices and calculating a rolling volatility based on the past 30 days, we need to annualize the result in order to obtain comparable figures. The following formula does so:

$$\sigma_{annual} = \sigma_{daily}\cdot\sqrt{252}$$

I chose to annualize by 252 trading days because this is the number of trading days per year^{4} that is assumed in many options and volatility-related books and is a common assumption in financial research (Chang & Wong, 2013).

The sample size depends on the time horizon that needs to be covered. If only the most recent events need to be incorporated in the volatility calculation, a time scale of one year is commonly used in research (Jorion, 2007). If major events of the past like the recent financial crisis need to be considered too, the sample size has to be adjusted accordingly.
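The rolling calculation described above can be sketched as follows. This is an illustrative Python implementation of the 1/N variance formula with √252 annualization, not the VBA tool presented in chapter 3.5; the function name and parameters are my own:

```python
import math

def annualized_volatility(prices, window=30, trading_days=252):
    """Annualized rolling volatility from daily close prices (a sketch).

    Takes daily log returns, computes the population standard deviation
    over the trailing `window` returns (the 1/N formula from the text)
    and scales by sqrt(trading_days) to annualize.
    """
    rets = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
    vols = []
    for i in range(window, len(rets) + 1):
        sample = rets[i - window:i]
        mean = sum(sample) / window
        # population variance (1/N), matching the formula in the text
        var = sum((r - mean) ** 2 for r in sample) / window
        vols.append(math.sqrt(var) * math.sqrt(trading_days))
    return vols
```

For a 30-day rolling volatility, a series of at least 31 closing prices is needed; each output value covers the trailing 30 returns.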

###### 3.1.1.1.3 Daily Returns as Basis

Throughout this paper we will look at volatility figures calculated from daily rates of return of financial assets. The usage of returns instead of raw prices is mainly due to normalization: using returns as the basis for our calculation instead of prices enables us to compare different datasets and time series. But there is a discussion amongst researchers about why to use daily returns instead of, for example, weekly or monthly returns. And indeed, using daily returns entails some perils. According to Brown and Warner (1985), parameters derived from daily data may be systematically biased by asynchronous trading, and, according to Fama (1965), daily returns depart more from normality than monthly returns do.

However, according to a study by Brown and Warner (1985) on the characteristics of daily stock returns, these perils do not reduce the empirical power in the context of event study methodologies. This is why it is common practice among researchers to use daily returns, and why I will stick to the academic consensus and use daily returns for further calculations too. The following chapters outline the different kinds of returns in terms of compounding periods and discuss constant as well as discrete returns.

3.1.1.1.3.1 Discrete Returns

There are different types of discrete returns. Current research distinguishes simple rates of return, geometric returns, arithmetic returns and money-/time-weighted returns (Ernst & Schurer, 2014). For calculating discrete returns from daily close prices, the simple rate of return is the most applicable, because a day-to-day calculation does not need to consider any payments during the period^{5}. We therefore calculate discrete simple returns as follows:

$$R_i = \frac{P_i - P_{i-1}}{P_{i-1}}$$

Where Pᵢ denotes the stock price at time i and Pᵢ₋₁ the stock price one period before.

3.1.1.1.3.2 Constant (Log)-Returns

Constant returns assume, in contrast to discrete returns, that time increments go to zero. This means that interest is charged or earned continuously. We calculate constant returns by taking the natural logarithm of the price relative, i.e. of one plus the discrete return:

$$r_i = \ln\left(\frac{P_i}{P_{i-1}}\right) = \ln\left(1 + R_i\right)$$

Because of this calculation, constant returns are commonly referred to as log returns; I will stick to this nomenclature throughout this paper. The following sections show why using log returns makes more sense in our case than using discrete returns. I will present in detail the mathematical and statistical background and the related assumptions that need to be considered. I will also discuss whether these assumptions hold up in reality and which quantitative models research has developed to overcome the potential drawbacks that come with them.
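The two return definitions, and the additivity that makes log returns attractive for multi-period calculations, can be illustrated with a short sketch (the prices are hypothetical):

```python
import math

# Hypothetical daily close prices, for illustration only
prices = [100.0, 102.0, 100.47, 101.0]

# Discrete (simple) returns: R_i = (P_i - P_{i-1}) / P_{i-1}
discrete = [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]

# Constant (log) returns: r_i = ln(P_i / P_{i-1}) = ln(1 + R_i)
log_returns = [math.log(1 + r) for r in discrete]

# Log returns are additive over time: their sum equals the log of the
# total growth over the whole period
total_log_return = sum(log_returns)
```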

3.1.1.1.3.2.1 The Math behind It

Many papers on stock prices and volatility analysis assume log returns in their calculations. Why this is convenient for further research on stock return distributions will be discussed in chapter 3.1.1.1.4. In this chapter I want to emphasize why research uses log returns instead of discrete returns. The reason is quite simple and can be explained with basic mathematics.

Assume a starting notional value P₀ that grows at a constant annual rate p with interest compounded every year. This leads to the principal Pₜ at time t using the following formula:

$$P_t = P_0\left(1 + p\right)^t$$

In stock markets there is no reason to assume yearly (as above), monthly or even daily compounding; markets process and incorporate available information continuously. This leads to a continuous formulation of Pₜ, where r is a continuous logarithmic rate and e is the base of the natural logarithm:

$$P_t = P_0\,e^{rt}$$

By taking logarithms we can solve for the logarithmic rate r. Applying the same logic, we can express log returns as the sequential difference of log prices (Hirsa & Neftci, 2014):

$$r_i = \ln P_i - \ln P_{i-1}$$

This finding already provides the answer to the question above: by using log returns instead of raw returns we can turn an exponential problem (due to continuous compounding) into a linear, arithmetically solvable one (Hudson & Gregoriou, 2010).

What is more, log returns are, in contrast to raw returns, defined on the whole set of real numbers, which is an essential property when it comes to distribution approximations.

In addition to the fact that log returns are convenient for further calculation, Hudson and Gregoriou (2010) indicate that for forecasting measures, compounding expected log returns will provide a better median of future cumulative returns than compounding expected simple returns.
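The linearization argument can be checked with a small numerical sketch (the notional, rate and horizon are illustrative values):

```python
import math

P0, r, t = 100.0, 0.05, 3.0

# Continuous compounding: P_t = P0 * e^(r t)
Pt = P0 * math.exp(r * t)

# Taking logarithms turns the exponential relation into a linear one,
# so the continuous rate falls out of simple arithmetic on log prices:
r_recovered = (math.log(Pt) - math.log(P0)) / t
```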

3.1.1.1.3.2.2 From Log Returns and the Central Limit Theorem

In terms of measuring market volatility, log returns are treated as random variables. Furthermore, it is often assumed that log returns derived from daily stock prices have a constant expected value (equal to zero) and constant variance, are serially independent and follow a normal distribution on all timescales. At this point it is not yet my intention to defend these assumptions. Plenty of papers covering this topic, including my own research, have shown that in reality the distributions of log returns are peaked, are skewed and have fatter tails than the normal distribution (Jorion, 2007).

However, many models in modern financial theory rely on these assumptions. One of them is the famous Black-Scholes model that will be covered in detail in chapter 3.1.1.2.3. But in order to understand the drawbacks of these assumptions, it is essential to get the whole picture.

When it comes to justifying the assumptions named above, the central limit theorem is often invoked. It states that even if log returns do not follow a normal distribution (i.e. are “non-normal”), the sum of n independent, identically distributed random variables with finite variance converges to a normal distribution anyway when n is large (Aas, 2004). The classical central limit theorem by Lindeberg-Lévy therefore states (Wewel, 2011):

$$\sqrt{n}\left(\bar{x}_n - \mu\right)\ \xrightarrow{d}\ N\!\left(0,\sigma^2\right)$$

Where n is the sample size, x̄ₙ the sample mean of the random variable x and μ its expected value. With increasing sample size n, the distribution converges to a normal distribution N with a mean of 0 and a variance of σ².

The fact that the normal distribution is defined by only two parameters already explains why its usage in quantitative finance is so popular.

However, according to Aas, log returns may be uncorrelated but not independent, which is required for the central limit theorem (refer to chapter 3.1.1.1.4.1 for an S&P500 autocorrelation analysis). This in turn affects the speed of convergence. In other words: The less independent the returns are, the slower the convergence towards the normal distribution will be. This means that a much higher sample size is required to achieve optimal convergence (Aas, 2004).

Furthermore, convergence is known to be very slow in the tails of the distribution (Bradley & Taqqu, 2003). This is important to keep in mind because even for a high sample size and not perfectly independent returns, the convergence criteria may only hold for the middle part of the distribution while convergence in the tails is far from perfect.
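The slow convergence can be made visible in a small simulation. Here sums of exponentially distributed variables (a deliberately skewed choice, not market data) are standardized; the central limit theorem predicts convergence to N(0, 1), but the sample skewness only shrinks at the rate 2/√n:

```python
import math
import random

def sample_skewness(xs):
    """Third standardized sample moment."""
    n = len(xs)
    m = sum(xs) / n
    var = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    return m3 / var ** 1.5

def standardized_sums(n_terms, n_samples, rng):
    # sums of n iid exponential(1) variables (mean 1, variance 1, heavily
    # skewed), standardized so the CLT predicts convergence to N(0, 1)
    out = []
    for _ in range(n_samples):
        s = sum(rng.expovariate(1.0) for _ in range(n_terms))
        out.append((s - n_terms) / math.sqrt(n_terms))
    return out

rng = random.Random(42)
skew_few = sample_skewness(standardized_sums(2, 20000, rng))
skew_many = sample_skewness(standardized_sums(50, 20000, rng))
# skewness shrinks with n (theoretically 2 / sqrt(n)), but only slowly,
# and the residual non-normality is concentrated in the tails
```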

Chapter 3.1.1.1.4 illustrates these relationships in a detailed kurtosis and skewness analysis of daily log returns across major indices.

3.1.1.1.3.2.3 Brownian Motion and Random Walk

The foundations of the financial models we use and rely on heavily today are actually quite old. In 1827 Robert Brown observed the random movement of pollen particles suspended in water. Even though Brown did not prove his observations mathematically himself, random movements are today often referred to as Brownian motions. Louis Bachelier defined Brownian motion mathematically in his doctoral thesis (1900), while Norbert Wiener proved its existence (1923). This is why a Brownian motion is also called a “Wiener process” in the literature.

When it comes to modeling and forecasting time series in financial theory, a series can be modeled using a random walk. A random walk is a mathematical term for a path that consists of random steps. It is often also referred to as a Markov chain^{6} (Moerters & Peres, 2008).

Brownian motion and the random walk are, as introduced in chapter 3.1.1.1.3.2.1, also connected via the central limit theorem, because Brownian motion can be seen as the limiting case of the random walk as time increments go to zero (Kac, 1947).
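This limiting relationship can be illustrated with a simulation sketch; the symmetric ±1 walk below is the textbook example, not a model of actual prices:

```python
import random

def random_walk_endpoints(n_steps, n_walks, rng):
    """Endpoints of symmetric +/-1 random walks (a minimal sketch)."""
    ends = []
    for _ in range(n_walks):
        pos = 0
        for _ in range(n_steps):
            pos += 1 if rng.random() < 0.5 else -1
        ends.append(pos)
    return ends

rng = random.Random(7)
ends = random_walk_endpoints(100, 5000, rng)
mean = sum(ends) / len(ends)
var = sum((e - mean) ** 2 for e in ends) / len(ends)
# a fingerprint of the Brownian limit: the variance of the walk's position
# grows linearly with the number of steps (here, close to 100 after
# 100 steps), just as the variance of Brownian motion grows linearly in time
```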

It should be kept in mind that even if we break down the most sophisticated models of modern financial theory, they all rely, at least in their basic assumptions, on the findings and theories of Brown, Bachelier and Wiener. It is because of the lack of alternative theories that research keeps coming back to those models and assumptions, even though empirical data differs and would not justify their application (Mandelbrot, 1963). The following section deals with empirical data of major stock indices and investigates whether there are deviations from the models’ assumptions and, if so, how significant these deviations are.

###### 3.1.1.1.4 Study of Time Series and Empirical Distributions

In order to empirically evaluate whether the theoretical relationships named above fit real market conditions and can be applied to the models used in this paper, I apply the Augmented Dickey-Fuller test to major stock indices (S&P500, DAX30, CAC40, FTSE100, Nikkei225, ATX, HSI and Russell2000) in order to test the returns for stationarity.

Stationarity means that the mean of the data is time-independent and that autocorrelation depends only on the lag between observations, not on the position in time within the series (Schmelzer, 2009).

This is crucial because the models and theories applied to real market data in this paper rely heavily on the normal distribution assumption, and a constant mean enhances the empirical quality of the normal distribution approach. In a second step I conduct an in-depth study of empirical return distributions across the indices: I look at the kurtosis and skewness of the return distributions and evaluate (non-)normality by applying the Jarque-Bera test. The related calculations can be found in the attached Excel sheet, and the density functions for all indices are plotted in Appendix I.

3.1.1.1.4.1 Augmented Dickey-Fuller Test for Stationarity

The Augmented Dickey-Fuller test (ADF) is a common tool in time series analysis, especially when it comes to testing for stationarity (Gregoriou & Pascalau, 2011). Because I assume that the analyzed series of log returns do not show a significant trend (refer to figure 3.2.V), I apply the ADF without adjustments for trends. The test is then conducted by regressing the differences of the log returns, Δxₜ, on the lagged series of log returns, xₜ₋₁ (Franke, Haerdle, & Hafner, 2001). My data consists of 3874 daily log returns from January 3, 2000 to May 29, 2015, and I assume a lag of one trading day.

The t-value of the regression is then compared to the critical values of the significance levels of the Dickey-Fuller t-distribution^{7}, which are shown in table 3.1.I below. The null-hypothesis, which is either accepted or rejected, is the existence of nonstationarity (Dickey & Fuller, 1981).

illustration not visible in this excerpt

Table 3.1.I: Dickey-Fuller test significance levels and related critical values

If the t-statistic of the regression delivers a value that is below the critical value of the corresponding significance level, the null-hypothesis is rejected (Dickey & Fuller, 1981). Or in other words: the analyzed data is stationary. The t-statistics of the analyzed indices are plotted in table 3.1.II below:

illustration not visible in this excerpt

Table 3.1.II: Dickey-Fuller test statistics and related p-values

As we can see, none of the tested indices shows a t-statistic that is above any of the critical values shown in table 3.1.I. I therefore assume that the analyzed indices, or to be more precise their log returns, are mean-stationary. To fulfill the definition of (weak) stationarity, the data must be covariance-stationary too (Schmelzer, 2009). A detailed analysis of covariance stationarity would go beyond the scope of this paper, but the correlogram for the autocorrelation of S&P500 returns, plotted in figure 3.1.I below, shows a slight autocorrelation of the log returns (*slight,* because it ranges from +0.05 to -0.09, which can be interpreted as very low autocorrelation in statistical terms). Autocorrelation is a term from statistics and signal processing: in a random process, autocorrelation describes the correlation of a time series with its own past and future values (Chatfield, 2004). It is also sometimes called “lagged correlation” or “serial correlation”.

illustration not visible in this excerpt

Figure 3.1.I: Autocorrelation S&P500 daily log returns 1-15 lags

As we can see, autocorrelation can be observed for most of the lags. This finding leads me to the assumption that mean-stationarity holds for the log return series, while the criterion that autocorrelation be independent of time shifts is not fulfilled. In order to fully assess the quality of the normal distribution assumption, a test for skewness and kurtosis of the empirical return distributions is indispensable. The following sections address this question in a detailed skewness and kurtosis analysis.
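The sample autocorrelation behind a correlogram such as figure 3.1.I is straightforward to compute directly. A minimal sketch, applied to simulated i.i.d. returns rather than the actual S&P500 data (so the values cluster near zero by construction):

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of a series with its own values `lag` steps back."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(x[lag:] @ x[:-lag] / (x @ x))

rng = np.random.default_rng(1)
returns = rng.normal(0.0001, 0.0127, 3874)
acf = [autocorr(returns, k) for k in range(1, 16)]  # lags 1-15, as in figure 3.1.I
# for i.i.d. noise all 15 values lie close to zero; the slight but nonzero
# values observed for real index returns are what the text describes
```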

3.1.1.1.4.2 Kurtosis Analysis

The first drawbacks of the normal distribution assumption can be observed in figure 3.1.II, which plots the density function of daily log returns of the S&P500 index. Appendix I plots the density functions of the other indices for comparison purposes. The data ranges from 03.01.2000 to 29.05.2015, is based on adjusted closing prices and includes 3872 observations.

illustration not visible in this excerpt

Figure 3.1.II: Density function of the S&P500 based on daily log returns

The mean of the distribution amounts to 0.0001, which is simply the arithmetic average since all observations receive the same weight. The standard deviation amounts to 0.0127 (the variance is 0.0002). From the shape of the distribution it is clearly observable that the normal distribution is already a close approximation, but the empirical distribution nevertheless shows some significant deviations. First of all, the distribution is more peaked. To test for kurtosis (either leptokurtic or platykurtic) I applied the standard kurtosis formula by Karl Pearson (Cramer, 1997):

$$w = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{\sigma}\right)^{4}$$

Where n is the number of observations (i.e. the number of log returns), $x_i$ the log returns, $\bar{x}$ the mean of the log returns and $\sigma$ their standard deviation.

Kurtosis (w) therefore amounts to 11.1020, which is comparatively high and supports the finding of a peaked, leptokurtic distribution that is slimmer in the center and has fatter tails than the normal distribution.

To be more precise, an *excess* kurtosis of +8.1020 is observable compared to the normal distribution, which has a kurtosis of +3 and shows a mesokurtic shape.

3.1.1.1.4.3 Skewness Analysis

An ideal normal distribution is symmetric, which means it shows no skewness at all. In order to find out how close the empirical distribution comes to the normal distribution, a test for skewness is essential. I tested for skewness by applying the following formula by Karl Pearson (Cramer, 1997):

$$v = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{\sigma}\right)^{3}$$

Where n is the number of observations (i.e. the number of log returns), $x_i$ the log returns, $\bar{x}$ the mean of the log returns and $\sigma$ their standard deviation.

Skewness (v) therefore amounts to -0.1849 for the S&P500. This means the distribution is slightly skewed to the left, which is another deviation from the normal distribution and supports the finding that the normal distribution is only an approximation, not an optimal fit, for S&P500 return distributions.
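Both Pearson formulas reduce to standardized central moments of the return series. A minimal sketch, applied to a simulated normal sample for comparison (the data here is illustrative, not the index returns):

```python
import numpy as np

def pearson_moments(x):
    """Skewness and kurtosis per Pearson's standard formulas:
    the third and fourth standardized central moments."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()   # standardize with the population std, as in the formulas
    return float((z ** 3).mean()), float((z ** 4).mean())

rng = np.random.default_rng(2)
normal_sample = rng.normal(0.0001, 0.0127, 3872)
s, k = pearson_moments(normal_sample)
# a normal sample has skewness near 0 and kurtosis near 3 (mesokurtic);
# the S&P500 figures in the text (-0.1849 and 11.1020) deviate markedly from this
```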

3.1.1.1.4.4 Jarque-Bera Test and Analysis Summary

The same analysis as above was also conducted for other major indices in other markets, namely the DAX30, CAC40, FTSE100, Nikkei225, ATX, HSI and Russell2000. The results can be found in table 3.1.IV.

The findings are analogous and show that the normal distribution can only be an approximation and is far from an optimal fit to the empirical data. In order to confirm these findings statistically, the Jarque-Bera test was applied to all indices. The Jarque-Bera test statistic is calculated as follows (Judge, Hill, Griffiths, Lütkepohl, & Lee, 1988):

$$JB = \frac{n}{6}\left(s^2 + \frac{(k-3)^2}{4}\right)$$

Where n is the number of observations, s the skewness of the distribution and k its kurtosis. The term (k-3) is the *excess* kurtosis, considering that the normal distribution shows a kurtosis of 3.

The Jarque-Bera (JB) test statistic follows a chi-square probability distribution with two degrees of freedom that in turn delivers the p-value to test for significance (Judge, Hill, Griffiths, Lütkepohl, & Lee, 1988).

The significance levels of a chi-square distribution are plotted in table 3.1.III below and can be looked up in any chi-square probability distribution table:

illustration not visible in this excerpt

Table 3.1.III: Chi-square significance levels and critical values of the Jarque-Bera test

The null-hypothesis (H0: “normal distribution is existent, skewness and excess kurtosis are zero”) is rejected if the JB test statistic exceeds a critical value of the chi-square distribution (Judge, Hill, Griffiths, Lütkepohl, & Lee, 1988).

Assuming that the null-hypothesis is true, the p-value describes the probability of a result equal to or more extreme than what is actually observed in the sample (Hubbard, 2004). This means: the smaller the p-value, the stronger the evidence against the null-hypothesis.
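The JB statistic can be computed directly from the two moments discussed above. The sketch below contrasts a simulated normal sample with a fat-tailed Student-t sample (illustrative data, not the index returns):

```python
import numpy as np

def jarque_bera(x):
    """JB statistic: n/6 * (s^2 + (k-3)^2 / 4), where s is the skewness
    and (k - 3) the excess kurtosis of the sample."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    z = (x - x.mean()) / x.std()
    s = (z ** 3).mean()            # skewness
    k = (z ** 4).mean()            # kurtosis
    return n / 6.0 * (s ** 2 + (k - 3.0) ** 2 / 4.0)

rng = np.random.default_rng(3)
normal_sample = rng.normal(size=3872)
fat_tailed = rng.standard_t(df=3, size=3872)  # leptokurtic, like the index returns
jb_normal = jarque_bera(normal_sample)  # small: normality is not rejected
jb_fat = jarque_bera(fat_tailed)        # very large: normality is clearly rejected
```

The statistic is then compared against the chi-square critical values with two degrees of freedom, as described in the text.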

As we can see in table 3.1.IV, which is plotted below, the JB test statistics of all indices exceed the critical values for any reasonable level of significance.

What is more, the p-values for all tested indices are equal to zero. The null-hypothesis therefore has to be rejected for all indices (Wewel, 2011).

illustration not visible in this excerpt

Table 3.1.IV: Kurtosis and Skewness analysis summary and Non-Normality Test analysis summary

A detailed calculation for all indices can be found in the attached Excel sheet. All analyzed indices show significant leptokurtosis, while it is worth noting that the DAX30 and CAC40 show the most moderate kurtosis, with 7.2242 and 7.7790 respectively. What is more, these two indices show the lowest skewness, 0.0100 and 0.0143 respectively, which is rather symmetric.

While the Jarque-Bera test showed that the analyzed data deviates significantly from the normal distribution, the Dickey-Fuller test indicated that the analyzed distributions are mean-stationary. Even though the data is mean-stationary, applying models based on a normal distribution assumption would lead to biased results, because the normal distribution is not able to describe the kurtosis and skewness of the empirical data.

However, especially when we talk about the Black-Scholes model or volatility and its calculation, we have to come back to the normal distribution assumption. The reader should therefore keep in mind the risks that come with this assumption: it eases calculation, but risks in the peak and the tails of the distribution might be systematically underestimated.

What is more, applying the same methods on data consisting of weekly or monthly prices may lead to different results.

###### 3.1.1.1.5 Explanatory Power of Historical Volatilities

As the name *historical volatility* already indicates, derived standard deviations are solely based on past events and tell us nothing about the future or important, potentially market-moving events such as Federal Reserve announcements or corporate actions (Ederington & Guan, 2004). This makes historical volatility a poor estimator of the future volatility of an asset.

In addition, calculating volatility as the historical standard deviation of log returns assigns all information in the dataset the same weight. This is a significant shortcoming of historical volatility, since research has shown that more recent observations contain more relevant information for the future than very old ones (Engle, 2004).

###### 3.1.1.1.6 Alternative Approaches

The shortcomings described in chapter 3.1.1.1.5 led to more sophisticated models for volatility measurement and forecasting. The two main models are introduced briefly in the following sections.

3.1.1.1.6.1 Exponentially Weighted Moving Average

One of the major advantages of the Exponentially Weighted Moving Average model (EWMA) is that it gives more weight to recent observations than the standard historical volatility model does (Bauwens, Hafner, & Laurent, 2012). It also assumes continuous compounding and is therefore based on log returns. EWMA assigns declining weights to the squared log returns instead of weighting them equally. This is achieved by the weight factor lambda (λ), which serves as a smoothing factor and must be less than 1. The question here is where to start with the first weight. RiskMetrics^{8} recommends λ = 0.94, which leads to the following calculation: the first weight is (1-0.94) = 6%, the second weight is 6% * 0.94 = 5.64%, and so on. Naturally all weights must sum to 1; the weights decline exponentially over time at the constant rate λ but never reach 0.
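The weighting scheme above is equivalent to the standard RiskMetrics recursion, in which today's variance estimate blends yesterday's estimate with yesterday's squared return. A minimal sketch, assuming the common convention of seeding the recursion with the first squared return (the data is simulated for illustration):

```python
import numpy as np

def ewma_variance(log_returns, lam=0.94):
    """RiskMetrics-style EWMA recursion:
    var_t = lam * var_{t-1} + (1 - lam) * r_{t-1}^2."""
    var = log_returns[0] ** 2          # seed with the first squared return (assumption)
    for r in log_returns[1:]:
        var = lam * var + (1.0 - lam) * r ** 2
    return var

rng = np.random.default_rng(4)
returns = rng.normal(0.0, 0.0127, 1000)
vol = np.sqrt(ewma_variance(returns))
# recent observations dominate the estimate; for this simulated series
# the result lands in the neighborhood of the true volatility of 0.0127
```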

3.1.1.1.6.2 Generalized Autoregressive Conditional Heteroscedasticity

Generalized Autoregressive Conditional Heteroscedasticity (GARCH) models are well-known tools in econometrics, especially in time series analysis. The first Autoregressive Conditional Heteroscedasticity (ARCH) model was developed by Robert Engle, who received the Nobel Prize in economics for his research in this area (Jorion, 1995).

GARCH models are applied whenever there is a probability that the variance of the time series (in our case stock market returns) is not constant over time (Bauwens, Hafner, & Laurent, 2012). They also assume volatility clustering, which means that volatility tends to be dependent upon past events and past realized volatility. This idea goes back to Benoit Mandelbrot who found that periods of large changes are followed by periods of large changes, and small changes tend to be followed by small changes (Mandelbrot, 1963).

Further research has also found that while returns of asset prices tend to be uncorrelated, the squares of these returns are slightly autocorrelated (Aas, 2004).

*Heteroscedasticity* is a term from regression analysis. In a regression, the distances between the data points and the regression line calculated by the least squares method are called “residuals”. From a probability theory point of view, residuals can themselves be interpreted as random variables. Heteroscedasticity (or homoscedasticity) describes the variance of these residuals (Wooldridge, 2012). If the variance of the residuals is constant, homoscedasticity is present; if it is not constant, heteroscedasticity is present. This means that return residuals need to be tested for heteroscedasticity before applying a GARCH model. Since a detailed analysis of the EWMA and GARCH models would go beyond the scope of this paper, the interested reader is referred to the relevant literature of Bauwens, Hafner & Laurent (2012) or the research of Aas (2004).
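To make the idea of volatility clustering concrete, the standard GARCH(1,1) conditional variance recursion is sigma²_t = ω + α·r²_{t-1} + β·σ²_{t-1}: today's variance depends on yesterday's squared return and yesterday's variance. A sketch of the recursion with purely illustrative parameter values (no estimation is performed):

```python
import numpy as np

def garch_variance_path(returns, omega, alpha, beta):
    """GARCH(1,1) conditional variance recursion:
    sigma2_t = omega + alpha * r_{t-1}^2 + beta * sigma2_{t-1}."""
    sigma2 = np.empty(len(returns))
    sigma2[0] = omega / (1.0 - alpha - beta)   # start at the unconditional variance
    for t in range(1, len(returns)):
        sigma2[t] = omega + alpha * returns[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

rng = np.random.default_rng(5)
returns = rng.normal(0.0, 0.0127, 500)
path = garch_variance_path(returns, omega=1e-6, alpha=0.08, beta=0.90)
# variance reacts to past squared returns (volatility clustering) and
# mean-reverts toward omega / (1 - alpha - beta)
```

In practice the parameters ω, α and β are estimated from data, for example by maximum likelihood; that step is omitted here.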

##### 3.1.1.2 Implied Volatility

Until now, we have discussed the theory of historical volatility, its characteristics, applications and drawbacks. In the following chapters we will look at implied volatility, another essential approach to measuring volatility in financial markets. In order to fully understand the theory of implied volatility and its implications, the next sections give an overview of the most recognized model in financial theory regarding implied volatility: the Black-Scholes model. Chapters 3.1.1.2.1 and 3.1.1.2.2 build up the required basics first.

###### 3.1.1.2.1 From the Binomial Tree to the Black-Scholes Model

In order to fully understand the continuous-time Black-Scholes model (BSM), one should also be aware of the fundamentals of the discrete Binomial model of option pricing. This is why I briefly introduce the connection between the Binomial model and the BSM in this section.

The Binomial model was developed by John C. Cox, Stephen Ross and Mark Rubinstein in 1979 and is therefore often referred to as the Cox-Ross-Rubinstein (CRR) model (Hull, 2006).

The CRR model values options using a binomial tree. The tree consists of a number of time steps between the valuation and expiration dates; at each step the tree branches into nodes, and each node represents a possible price of the underlying at a given point in time, reached with a given probability (Cox, Ross, & Rubinstein, 1979).

Valuation is then done iteratively. This delivers a possible option value for each node, and the price of the option is the sum of the probability-weighted payoffs at the nodes at the end of the tree, discounted to today at the risk-free interest rate.

The BSM is often incorrectly interpreted as being the limiting case of the discrete CRR model, because as the number of time steps increases, the value derived by the CRR converges to the value calculated by the BSM (Steinbrenner, 2001). However, in mathematical terms the BSM is not the limiting case of the CRR but a model that delivers a value that comes very close to the CRR value as the number of time steps increases. In contrast to the CRR, the BSM provides a closed-form solution, which is the main reason why it is the most widely used model when it comes to option pricing (Evangelos, 2011).
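The convergence described above can be illustrated with a minimal CRR implementation; the inputs (S = K = 100, one year to maturity, r = 5%, σ = 20%) are illustrative:

```python
import math

def crr_call(S, K, T, r, sigma, steps):
    """European call via the Cox-Ross-Rubinstein binomial tree:
    probability-weighted terminal payoffs, discounted at the risk-free rate."""
    dt = T / steps
    u = math.exp(sigma * math.sqrt(dt))      # up factor per step
    d = 1.0 / u                              # down factor per step
    p = (math.exp(r * dt) - d) / (u - d)     # risk-neutral up probability
    total = 0.0
    for j in range(steps + 1):               # j = number of up moves
        prob = math.comb(steps, j) * p ** j * (1 - p) ** (steps - j)
        total += prob * max(S * u ** j * d ** (steps - j) - K, 0.0)
    return math.exp(-r * T) * total

# as the number of time steps grows, the tree value approaches the
# Black-Scholes value of about 10.45 for these inputs
for n in (10, 100, 1000):
    print(n, crr_call(100, 100, 1.0, 0.05, 0.2, n))
```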

###### 3.1.1.2.2 The Basics of Options

An option is a financial derivative based on a specific underlying. In contrast to futures, it gives the buyer the right but not the obligation to buy (call option) or sell (put option) the underlying at a predetermined price.

The price of the option is also called “premium” and consists of a potential intrinsic value and time value (Hull, 2006).

Options can be differentiated by exercise style into American and European options. An American option can be exercised at any time during the time to maturity, whereas a European option can only be exercised on the expiration day (Jordan, 2011).

For call options we can generalize: if the predetermined exercise price (also called “strike”) is below the price of the underlying, the option is in the money. Applying the same logic, an option can also be out of the money or at the money. The opposite relationship holds for put options.

An option that is in the money has both an intrinsic value (= current price of the underlying minus strike, for a call) and time value (Hull, 2006). Options that are at or out of the money have only time value. A call option will therefore only be exercised if the current market price of the underlying exceeds the strike of the option.

Options can be traded as standardized contracts on all major derivatives exchanges and provide a heavily used tool for portfolio insurance or speculation.

###### 3.1.1.2.3 The Black-Scholes Model

In 1973 the economists Fischer Black, Myron S. Scholes and Robert C. Merton developed a mathematical model that was able to price options.

To this day, the Black-Scholes formula plays a key role in options valuation and is, despite its age, widely recognized (Perridon, Steiner, & Rathgeber, 2012).

3.1.1.2.3.1 Introduction to the Black-Scholes Model

The Black-Scholes model, like every model, relies on various assumptions. Whether these assumptions fit reality well is one of the most frequently raised concerns among researchers regarding the Black-Scholes model. Despite its shortcomings, which will be discussed in detail in the following chapters, an understanding of the Black-Scholes model and its implied volatility approach is essential when it comes to measuring market volatility using the VIX. For that purpose, the following sections present the model assumptions, the formula, the sensitivity measures, a critical discussion of the model assumptions and a way of applying the model to calculate implied volatility.

3.1.1.2.3.1.1 Model Assumptions

The Black-Scholes model assumes no taxes, transaction fees or dividends, the feasibility of unlimited short-selling, a constant and symmetric risk-free interest rate, a constant volatility of the underlying and log-normally distributed asset prices, i.e. normally distributed log returns (Hull, 2006). I will concentrate on the basic Black-Scholes model that values European options. The extended model for valuing American options is beyond the scope of this paper.

Some of the chapters above have already indicated why these assumptions do not hold in reality. A comprehensive summary of criticisms of the model can be found in chapter 3.1.1.2.3.1.4.

3.1.1.2.3.1.2 The Black-Scholes Formula

The success of the Black-Scholes model can mainly be explained by its robust and handy formula, which needs only one unknown variable as input: the volatility. The Black-Scholes formula for a European call option reads as follows:

$$C = S_0\,N(d_1) - K e^{-rT} N(d_2)$$

And for a European put option:

$$P = K e^{-rT} N(-d_2) - S_0\,N(-d_1)$$

Where:

$$d_1 = \frac{\ln(S_0/K) + (r + \sigma^2/2)\,T}{\sigma\sqrt{T}}, \qquad d_2 = d_1 - \sigma\sqrt{T}$$

with $S_0$ the current price of the underlying, $K$ the strike, $T$ the time to maturity, $r$ the risk-free interest rate and $\sigma$ the volatility of the underlying.

The terms $N(d_1)$ and $N(d_2)$ represent the values of the cumulative standard normal distribution function at $d_1$ and $d_2$ respectively (Dubil, 2011).
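The pricing formulas can be sketched in a few lines, using the error function for the standard normal CDF; the input values are illustrative:

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes(S, K, T, r, sigma, kind="call"):
    """Black-Scholes price of a European call or put option."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    if kind == "call":
        return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)
    return K * math.exp(-r * T) * norm_cdf(-d2) - S * norm_cdf(-d1)

call = black_scholes(100, 100, 1.0, 0.05, 0.2, "call")
put = black_scholes(100, 100, 1.0, 0.05, 0.2, "put")
print(round(call, 4), round(put, 4))  # → 10.4506 5.5735
```

As a sanity check, the two prices satisfy put-call parity: C - P = S - K·e^{-rT}.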

3.1.1.2.3.1.3 The Greeks

The *Greeks* are sensitivity measures derived from the Black-Scholes formula. In a more mathematical sense, the Greeks quantify the sensitivity of the option price to changes in the input parameters (Marroni & Perdomo, 2014).

An understanding of the Greeks not only in quantitative but also in intuitive terms is a condition sine qua non for understanding implied volatility.

This is why the following sections will provide a brief overview about the most relevant Greeks.

3.1.1.2.3.1.3.1 Delta (Δ)

Delta is represented by the symbol Δ and is the *delta factor* that quantifies the sensitivity of the option price towards changes in the price of the underlying asset (Hull, 2006).

The delta factor is therefore the first partial derivative of the Black-Scholes formula with respect to the price of the underlying (Marroni & Perdomo, 2014) and can be explained as follows: assume the delta of an option amounts to 0.5; this means a change of one unit in the price of the underlying asset leads to a change of 0.5 units in the option price. An option that is at the money^{9} will have a delta close to 0.5, with delta increasing as the option moves into the money and vice versa. This is quite intuitive, since the probability that the option ends in the money increases with increasing moneyness (the difference between strike and underlying price) (Hull, 2006).

Delta is the most important ratio when it comes to the trading and hedging of all kinds of financial products and has therefore become a widely used term.

Even though linear equity derivatives such as ETFs or ETNs are not priced using the Black-Scholes formula, they are also called “Delta One” products, since their delta always amounts to one, reflecting a linear relationship between the price of the ETP and the price of the underlying.
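For a European call under Black-Scholes, delta equals N(d1); the behaviour described above can be sketched as follows (strike and prices illustrative):

```python
import math

def call_delta(S, K, T, r, sigma):
    """Black-Scholes call delta: N(d1), the first partial derivative
    of the call price with respect to the underlying price."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    return 0.5 * (1.0 + math.erf(d1 / math.sqrt(2.0)))

deltas = {S: call_delta(S, 100, 1.0, 0.05, 0.2) for S in (80, 100, 120)}
# delta rises from near 0 toward 1 as the option moves into the money;
# at the money it is close to 0.5 (slightly above, due to drift and the
# sigma^2/2 term in d1)
```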

3.1.1.2.3.1.3.1.1 Delta-Volatility Dynamics

It is important to understand the dynamics of delta regarding changes of the volatility of the underlying asset.

The delta of an option that is out of the money will increase when the volatility of the asset increases. This is explained by the increasing probability that the option ends in the money (Marroni & Perdomo, 2014).

The delta of an option that already is in the money will behave inversely and decrease since an increasing volatility reduces the probability that the option ends in the money (Marroni & Perdomo, 2014).

3.1.1.2.3.1.3.1.2 Gamma (Γ)

Gamma and delta are closely connected. Gamma is defined as the change in delta per unit change in the price of the underlying asset. It is therefore the sensitivity measure for delta and represents the second partial derivative of the Black-Scholes formula with respect to the price of the underlying (Dubil, 2011).

In options trading and hedging, gamma is an important risk measure. Option positions with high gammas need very close supervision by the trader since high gammas and very sensitive deltas require frequent rebalancing in order to hold the delta-hedge.

3.1.1.2.3.1.3.2 Theta (Θ)

In order to understand theta, it is necessary to decompose the price of an option into its major parts. The price of an option consists in general of its intrinsic value and its time value. A call option has an intrinsic value when its strike is below the price of the underlying asset, i.e. when the option is in the money (Jordan, 2011); the reverse is true for put options. It follows that if the strike of a call option is above the price of the underlying asset, the option comprises only time value. Theta in turn measures the sensitivity of the option price to time decay. It is often quoted as the amount of money the option loses in value per day.

3.1.1.2.3.1.3.3 Rho (ρ)

Rho measures the sensitivity of the option price towards changes in the risk-free interest rate (for the remaining time to maturity).

It is typically expressed as the amount of money the option will gain or lose when the risk-free interest rate changes by 100 basis points. Compared to the other Greeks, rho does not have a significant influence on the price of the option and is therefore a subordinate ratio in options valuation (Hull, 2006).

3.1.1.2.3.1.3.4 Vega (v)

Vega is defined as the change of the option price per unit change in the assumed annual volatility. It is the sensitivity measure for volatility and the first partial derivative of the Black-Scholes formula with respect to the volatility (Dubil, 2011).

At this point it becomes clear why the Black-Scholes model faces a typical chicken-and-egg problem when it comes to volatility. On the one hand, volatility is an input to the Black-Scholes model. On the other hand, implied volatility, which is the volatility that the market expects, is calculated recursively by solving the Black-Scholes formula for the volatility, given market premiums for the option.

However, sometimes we need to price an option using historical volatility estimates due to the lack of reliable market premiums. In these scenarios the term “*vega risk*” becomes important. Vega risk describes the situation where the historical volatility underestimates the possible stock price outcomes. If the volatility used upfront to price the hedge is too low, the hedge will actually cost more than the Black-Scholes model predicts (Dubil, 2011).
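The recursive calculation of implied volatility can be sketched with a simple bisection search, exploiting the fact that the call price increases monotonically in volatility. The “market premium” below is generated from the model itself for illustration:

```python
import math

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return S * N(d1) - K * math.exp(-r * T) * N(d2)

def implied_vol(price, S, K, T, r, lo=1e-6, hi=5.0, tol=1e-8):
    """Solve the Black-Scholes formula for sigma by bisection: the call
    price is strictly increasing in volatility, so the root is unique."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# recover the volatility from an observed market premium
market_premium = bs_call(100, 100, 1.0, 0.05, 0.2)   # priced with sigma = 0.2
print(round(implied_vol(market_premium, 100, 100, 1.0, 0.05), 4))  # → 0.2
```

In practice faster root finders such as Newton's method (using vega as the derivative) are common, but bisection makes the recursive idea explicit.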

Analogous to the delta-gamma relationship, the same relationship exists between vega and vomma. Vomma, the second partial derivative of the Black-Scholes formula with respect to the assumed volatility, is the sensitivity measure of vega. It quantifies the change in vega due to a one-unit change in the volatility.

3.1.1.2.3.1.4 Discussion of the Model

Like every model, the Black-Scholes model needs certain assumptions (refer to chapter 3.1.1.2.3.1.1 to review them) in order to remain applicable.

**[...]**

^{1} EURO STOXX 50 Volatility Index

^{2} This problem emerges mainly while analyzing single stocks. Performance indices already incorporate these events in their calculation

^{3} The analysed data in this paper is assumed to be a sample that is used as an estimator for the whole population. In Excel, this way of calculating variance is represented by the formula VAR.S

^{4} At least for U.S. equity markets where 252 trading days per year is a common assumption

^{5} It is worth noting that this approach requires *adjusted* closing prices. Adjusted closing prices consider dividend payments and/or stock splits

^{6} A Markov-chain is a random process that is memoryless between each state. To be more precise: The probability distribution of the next state depends only on the current state and not on any events or states before

^{7} Which is actually not a *regular* t-distribution but a t-distribution adjusted by Dickey and Fuller for the purposes of the test

^{8} RiskMetrics is a database constructed by JP Morgan. It uses the EWMA model with Lambda = 0.94

^{9} An option is at the money when strike and underlying price are equal. An option is in the money when the underlying price is above the strike of the option; similarly, an option is out of the money when the strike is above the underlying price

- Quote paper
- Lennart Berning (Author), 2016, The Application of Volatility related Exchange Traded Products as Instruments for Volatility Hedging, Munich, GRIN Verlag, https://www.grin.com/document/321873
