Excerpt

Refresh your knowledge

Correlation

· In easy words: Correlation describes how two variables are related to each other and how a change in one of them goes along with a change in the other.

· A crude measure of the relationship between variables is the covariance.

· We can measure the relationship between two variables using correlation coefficients.

Why should correlation be interesting?

Let's use an easy example:

*Imagine you have created a TV advertisement for an existing sports drink called "BlueCow", and your boss asks you whether your spot increases the number of drinks sold. How can you find out whether it does or whether it's crap?*

Answer:

You measure the correlation between the number of adverts watched and the number of drinks sold.

Do you still know anything about correlation from the statistics course?

· The correlation coefficient has to lie between -1 and +1.

· A coefficient of +1 indicates a perfect positive relationship,

· So a coefficient of -1 indicates a perfect negative relationship,

· And a coefficient of 0 indicates no linear relationship at all.
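The three boundary cases can be checked on small made-up data sets. The following is a minimal sketch, not part of the original slides: the helper `pearson_r` and the toy values are assumptions chosen for illustration. The third case also shows why 0 only rules out a linear relationship: the U-shaped data are clearly related, yet r is 0.

```python
import math

def pearson_r(x, y):
    """Pearson's r: sample covariance divided by the product of the sample standard deviations."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    cov = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y)) / (n - 1)
    s_x = math.sqrt(sum((a - x_mean) ** 2 for a in x) / (n - 1))
    s_y = math.sqrt(sum((b - y_mean) ** 2 for b in y) / (n - 1))
    return cov / (s_x * s_y)

# Illustrative toy data (not from the slides)
x = [-2, -1, 0, 1, 2]
r_positive = pearson_r(x, [2 * v + 1 for v in x])  # points on a rising line
r_negative = pearson_r(x, [-3 * v for v in x])     # points on a falling line
r_none = pearson_r(x, [v ** 2 for v in x])         # U-shape: related, but not linearly

print(round(r_positive, 2), round(r_negative, 2), round(r_none, 2))
```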

How to interpret the values

The correlation coefficient is a commonly used measure of the size of an effect.

· Values of ± .1 represent a small effect,

· ± .3 is a medium effect and

· ± .5 is a large effect.

However, focus on the context of your research to interpret the values instead of simply following these benchmarks.
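As a rough illustration, the benchmarks can be written as a small lookup. This is only a sketch under assumptions that the slides do not state: the benchmarks are treated as lower bounds, and the label "negligible" for values below .1 is invented here. The function name `effect_size_label` is hypothetical.

```python
def effect_size_label(r):
    """Map |r| to the benchmark labels (.1 small, .3 medium, .5 large)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    size = abs(r)  # the sign gives the direction, not the size, of the effect
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"  # assumed label: no category below .1 is named in the benchmarks

for r in (0.05, -0.12, 0.3, -0.62):
    print(r, effect_size_label(r))
```

Such a lookup is a starting point only; the research context should still drive the interpretation.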

Starting slowly

Covariance

$$\operatorname{cov}(x,y) = \frac{\sum_{i=1}^{n} (x_i - x_{\mathrm{mean}})(y_i - y_{\mathrm{mean}})}{n - 1}$$

with n = number of value pairs,

$$x_{\mathrm{mean}} = \frac{x_1 + x_2 + \dots + x_n}{n} \quad \text{(mean of the x-values)}$$

$$y_{\mathrm{mean}} = \frac{y_1 + y_2 + \dots + y_n}{n} \quad \text{(mean of the y-values)}$$

Starting slowly

Covariance Example

| **Participant** | **1** | **2** | **3** | **4** | **5** | **Mean** | **s** |
|---|---|---|---|---|---|---|---|
| Adverts watched | 5 | 4 | 4 | 6 | 8 | 5.4 | 1.67 |
| BlueCow cans bought | 8 | 9 | 10 | 13 | 15 | 11.0 | 2.92 |

$$\operatorname{cov}(x,y) = \frac{\sum_{i=1}^{n} (x_i - x_{\mathrm{mean}})(y_i - y_{\mathrm{mean}})}{n - 1}$$

$$\operatorname{cov}(x,y) = \frac{(-0.4)(-3) + (-1.4)(-2) + (-1.4)(-1) + (0.6)(2) + (2.6)(4)}{5 - 1}$$

$$\operatorname{cov}(x,y) = \frac{1.2 + 2.8 + 1.4 + 1.2 + 10.4}{4} = \frac{17}{4} = 4.25$$

A positive value shows that if one variable increases, the other tends to increase as well.

A negative value shows that if one variable increases, the other tends to decrease.
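The hand calculation can be reproduced in a few lines. This is a minimal sketch using the five value pairs from the table; the function name `sample_covariance` and the variable names are chosen here for illustration.

```python
def sample_covariance(x, y):
    """Sum of the products of deviations from the means, divided by n - 1."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two value pairs of equal length")
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    products = [(xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)]
    return sum(products) / (n - 1)

adverts = [5, 4, 4, 6, 8]   # adverts watched, participants 1 to 5
cans = [8, 9, 10, 13, 15]   # BlueCow cans bought, participants 1 to 5

cov_xy = sample_covariance(adverts, cans)
print(round(cov_xy, 2))  # 4.25
```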

Covariance

Standardization

The value of the covariance alone depends on the units in which the variables are measured and is therefore not comparable across data sets, so we need to standardize it by dividing by the standard deviations (s_x, s_y) to obtain Pearson's correlation coefficient.

$$r = \frac{\operatorname{cov}_{xy}}{s_x s_y}$$

"The Pearson's correlation coefficient is a parametric statistic and requires interval data for both variables. To test its significance we can assume normality, too."

Source: Field, A., Miles, J., & Field, Z. (2012)
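To see why the covariance needs standardizing, it helps to change the unit of one variable. The sketch below is an illustration under a made-up assumption: a can size of 250 ml (`ML_PER_CAN`) is hypothetical and only serves to re-express the cans in millilitres. The covariance grows with the unit, while r stays the same.

```python
import math

def sample_covariance(x, y):
    """Sample covariance with n - 1 in the denominator."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    return sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y)) / (n - 1)

def sample_sd(x):
    """Sample standard deviation: square root of the covariance of x with itself."""
    return math.sqrt(sample_covariance(x, x))

def pearson_r(x, y):
    """Covariance standardized by both standard deviations."""
    return sample_covariance(x, y) / (sample_sd(x) * sample_sd(y))

adverts = [5, 4, 4, 6, 8]
cans = [8, 9, 10, 13, 15]

ML_PER_CAN = 250  # hypothetical can size, used only to change the unit
millilitres = [c * ML_PER_CAN for c in cans]

cov_cans = sample_covariance(adverts, cans)
cov_ml = sample_covariance(adverts, millilitres)
r_cans = pearson_r(adverts, cans)
r_ml = pearson_r(adverts, millilitres)

print(round(cov_cans, 2), round(cov_ml, 2))  # covariance changes with the unit
print(round(r_cans, 2), round(r_ml, 2))      # r does not
```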

Covariance Example

Pearson's correlation

| **Participant** | **1** | **2** | **3** | **4** | **5** | **Mean** | **s** |
|---|---|---|---|---|---|---|---|
| Adverts watched | 5 | 4 | 4 | 6 | 8 | 5.4 | 1.67 |
| BlueCow cans bought | 8 | 9 | 10 | 13 | 15 | 11.0 | 2.92 |

$$\operatorname{cov}(x,y) = 4.25, \quad s_x = 1.67, \quad s_y = 2.92, \quad s_x s_y = 4.88$$

$$r = \frac{\operatorname{cov}_{xy}}{s_x s_y} = \frac{4.25}{4.88} = 0.87$$
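The same steps can be followed in code, one intermediate value at a time. This is a minimal sketch using the BlueCow data; the variable names are chosen here for illustration, and the sample standard deviations use n - 1, matching the covariance formula.

```python
import math

adverts = [5, 4, 4, 6, 8]
cans = [8, 9, 10, 13, 15]
n = len(adverts)

x_mean = sum(adverts) / n
y_mean = sum(cans) / n

# Covariance: averaged products of the paired deviations (n - 1 in the denominator)
cov_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(adverts, cans)) / (n - 1)

# Sample standard deviations of each variable
s_x = math.sqrt(sum((x - x_mean) ** 2 for x in adverts) / (n - 1))
s_y = math.sqrt(sum((y - y_mean) ** 2 for y in cans) / (n - 1))

# Pearson's r: covariance divided by the product of the standard deviations
r = cov_xy / (s_x * s_y)

print(round(s_x, 2), round(s_y, 2), round(r, 2))  # 1.67 2.92 0.87
```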

Remember this slide?

· The correlation coefficient has to lie between -1 and +1.

· A coefficient of +1 indicates a perfect positive relationship,

· So a coefficient of -1 indicates a perfect negative relationship,

· And a coefficient of 0 indicates no linear relationship at all.

Answer to our example:

A Pearson's correlation coefficient of **r = 0.87** shows a strong positive relationship between the number of "BlueCow" adverts watched and the number of "BlueCow" cans bought.

Face new problems:

Causality

· Direction of causality:

The correlation coefficient gives no statistical reason why we shouldn't be able to interpret the variables in the opposite way.

For example: The number of cans somebody buys affects the number of adverts he or she sees.

· Third-variable problem:

Normally there is more than one reason why we start buying or doing something, and these unmeasured variables affect the results as well.
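The direction problem can be made concrete: Pearson's r is symmetric, so swapping the two variables gives exactly the same value. The following is a small sketch with the BlueCow data; the helper `pearson_r` is written here for illustration. The coefficient alone cannot tell whether adverts drive purchases or purchases drive advert exposure.

```python
import math

def pearson_r(x, y):
    """Pearson's r from sample covariance and sample standard deviations."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    cov = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y)) / (n - 1)
    s_x = math.sqrt(sum((a - x_mean) ** 2 for a in x) / (n - 1))
    s_y = math.sqrt(sum((b - y_mean) ** 2 for b in y) / (n - 1))
    return cov / (s_x * s_y)

adverts = [5, 4, 4, 6, 8]
cans = [8, 9, 10, 13, 15]

r_adverts_cans = pearson_r(adverts, cans)  # "adverts explain cans"
r_cans_adverts = pearson_r(cans, adverts)  # "cans explain adverts"

# Both readings yield the same coefficient
print(round(r_adverts_cans, 2), round(r_cans_adverts, 2))
```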

Excerpt out of 18 pages

- Quote paper
- Kersten Thiele (Author), 2018, Discovering Statistics Using R-Correlation, Munich, GRIN Verlag, https://www.grin.com/document/430073
