The Sample Linear Correlation Coefficient

Formula for the Sample Linear Correlation Coefficient

• rXY (or just r for short) is the sample linear correlation coefficient.

• rXY measures the strength and direction of linear association between two quantitative variables X and Y.

$$r_{XY} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) / (n - 1)}{s_X \, s_Y}$$

For example, consider 11 families randomly selected from the population of families with one brother and one sister, both full grown. Let Xi denote the height (in inches) of the brother in the i-th family. Let Yi denote the height (in inches) of the sister in the i-th family.

[Figure: Scatter Plot of the Heights of a Brother and Sister in 11 Families. Horizontal axis: Brother's Height (inches); vertical axis: Sister's Height (inches).]
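The formula for rXY uses the sample means and sample standard deviations without writing them out symbolically. Assuming the usual definitions with an n - 1 divisor (which matches the 11 - 1 that appears in the worked example), they can be stated as below. Under that assumption the factors of n - 1 cancel, giving an equivalent form of the same coefficient:

```latex
\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad
s_X = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2}
\quad \text{(and likewise } \bar{Y},\ s_Y \text{ for the } Y_i\text{)}

r_{XY}
  = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) / (n-1)}{s_X \, s_Y}
  = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}
         {\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}}
```

In the second form, r can be computed from three sums: the cross products of the deviations and the two sets of squared deviations.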

i     Xi    Yi    Xi − X̄    Yi − Ȳ    (Xi − X̄)(Yi − Ȳ)    (Xi − X̄)²    (Yi − Ȳ)²
1     71    69    2         5         10                  4            25
2     68    64    -1        0         0                   1            0
3     66    65    -3        1         -3                  9            1
4     67    63    -2        -1        2                   4            1
5     70    65    1         1         1                   1            1
6     71    62    2         -2        -4                  4            4
7     70    65    1         1         1                   1            1
8     73    64    4         0         0                   16           0
9     72    66    3         2         6                   9            4
10    65    59    -4        -5        20                  16           25
11    66    62    -3        -2        6                   9            4
Sum   759   704   0         0         39                  74           66

$$\bar{X} = 759 / 11 = 69, \qquad \bar{Y} = 704 / 11 = 64$$

$$s_X = \sqrt{\frac{74}{11 - 1}}, \qquad s_Y = \sqrt{\frac{66}{11 - 1}}$$

$$r_{XY} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) / (n - 1)}{s_X \, s_Y} = \frac{3.9}{\sqrt{7.4}\,\sqrt{6.6}} \approx 0.558$$

[Figure: Scatter Plot of the Heights of a Brother and Sister in 11 Families, annotated with r = 0.558. Horizontal axis: Brother's Height (inches); vertical axis: Sister's Height (inches).]
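The arithmetic for the 11 families can be checked with a short script. This is a minimal sketch in plain Python, not part of the original handout; the function name `sample_correlation` and the list names are illustrative assumptions, and the heights are the Xi and Yi values from the example.

```python
from math import sqrt

# Heights in inches for the 11 families in the example:
# brothers are the Xi values, sisters are the Yi values.
brothers = [71, 68, 66, 67, 70, 71, 70, 73, 72, 65, 66]
sisters = [69, 64, 65, 63, 65, 62, 65, 64, 66, 59, 62]


def sample_correlation(xs, ys):
    """Sample linear correlation coefficient, computed from the definition."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # The three column sums: cross products and squared deviations.
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_xx = sum((x - x_bar) ** 2 for x in xs)
    sum_yy = sum((y - y_bar) ** 2 for y in ys)
    s_x = sqrt(sum_xx / (n - 1))  # sample standard deviation of X
    s_y = sqrt(sum_yy / (n - 1))  # sample standard deviation of Y
    return (sum_xy / (n - 1)) / (s_x * s_y)


r = sample_correlation(brothers, sisters)
print(round(r, 3))  # 0.558
```

The sketch does not guard against a sample with zero variance (all Xi equal or all Yi equal), in which case the denominator is zero and r is undefined.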

Some Properties of rXY

• rXY estimates the population linear correlation coefficient ρXY.

• rXY is dimensionless and is always between -1 and 1 (inclusive).

• rXY = 1 if and only if all data points fall perfectly on a line with positive slope.

• rXY = -1 if and only if all data points fall perfectly on a line with negative slope.

• rXY = 0 suggests that there is no linear association between X and Y.
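Several of these properties can be checked numerically. The sketch below is illustrative only: the helper name `r_xy` and all of the data values are hypothetical, chosen so that the points lie exactly on a line or are rescaled by a unit conversion. It uses only `mean` and `stdev` from Python's standard `statistics` module.

```python
from statistics import mean, stdev


def r_xy(xs, ys):
    """Sample linear correlation coefficient from the definition."""
    x_bar, y_bar = mean(xs), mean(ys)
    n = len(xs)
    # Average cross product of deviations, using the n - 1 divisor.
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    return cov / (stdev(xs) * stdev(ys))


xs = [1.0, 2.0, 3.0, 4.0, 5.0]        # hypothetical illustration data

up = [2.0 * x + 1.0 for x in xs]      # exactly on a line with positive slope
down = [10.0 - 3.0 * x for x in xs]   # exactly on a line with negative slope
noisy = [2.0, 1.0, 4.0, 3.0, 5.0]     # increasing overall, but not on a line

print(r_xy(xs, up))     # 1, up to floating-point rounding
print(r_xy(xs, down))   # -1, up to floating-point rounding
print(r_xy(xs, noisy))  # strictly between -1 and 1

# Dimensionless: rescaling X (for example inches to centimetres) leaves r unchanged.
xs_cm = [2.54 * x for x in xs]
print(abs(r_xy(xs, noisy) - r_xy(xs_cm, noisy)) < 1e-12)
```

The last check illustrates why r is described as dimensionless: multiplying X by a positive constant scales both the numerator and sX by the same factor.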

[Figure: Is a Bear's Neck Circumference Correlated with a Bear's Weight? Scatter plot of Weight in Pounds (vertical axis) against Neck Circumference in Inches (horizontal axis), annotated with r = 0.934.]

[Figure: Is Outside Temperature Associated with Crawling Time? Scatter plot of Average Crawling Age in Weeks (vertical axis) against Average Monthly Temperature (F) in First 6 Months (horizontal axis), with points labeled by month.]

[Figure: Is Outside Temperature Associated with Crawling Time? Scatter plot of Average Crawling Age in Weeks (vertical axis) against Average Monthly Temperature (F) in First 6 Months (horizontal axis), with points labeled by month and annotated with r = -0.699.]

[Figure: Does Watching TV Increase Life Expectancy? Scatter plot of Life Expectancy in Years (vertical axis) against People Per Television (horizontal axis), with points labeled by country: Japan, France, Canada, Australia, UK, US, Mexico, Sri Lanka, China, Russia, Iraq, Morocco, Vietnam, S. Africa, Egypt, Pakistan, Haiti, Uganda, Madagascar, Yemen, Cambodia, Angola.]

[Figure: Does Watching TV Increase Life Expectancy? Scatter plot of Life Expectancy in Years (vertical axis) against People Per Television (horizontal axis), with points labeled by country and annotated with r = -0.804.]

Guess the correlation for each pair of variables.

A) Stalk Diameter of Corn Plant and Weight of Corn Plant

B) Person’s Age and Person’s Year of Birth

C) Daily Dow Jones Industrial Average and Daily Rainfall in Seattle

D) Number of Ultrasounds During Pregnancy and Birth Weight of Baby

E) U.S. Monthly Ice Cream Cone Sales and Drownings per Month in U.S.

F) Age of Wife and Age of Husband

[Figure: scale marked -1, 0, 1.]
