
REGRESSION ON DUMMY VARIABLES

Reference: Gujarati, Chapter 9; Neter, Chapter 10; Stewart, Session 3.6.

We use dummy variables to represent qualitative explanatory variables in regression analysis. A dummy variable usually takes the values 0 and 1. For example,

$$Y_i = \beta_0 + \beta_1 D_i + u_i \qquad (1)$$

where $Y$ = annual salary of a college professor and

$$D_i = \begin{cases} 1 & \text{if male college professor} \\ 0 & \text{otherwise} \end{cases}$$

Consider the following example with a SAS program:

ECON 7710, By WONG Wing Keung, Professor of Economics

    DATA a ;
      INPUT Y D @@ ;
      label Y = 'annual salary of a college professor'
            D = 'Sex : 1 for male and 0 for female' ;
    CARDS;
    22.0 1  19.0 0
    18.0 0  21.7 1
    18.5 0  21.0 1
    20.5 1  17.0 0
    17.5 0  21.2 1
    ;
    proc plot ; plot y * D = '*' ; run ;
    proc reg data = a ;
      model Y = D ;
      PLOT p.*D = 'p' y*D = '*' / overlay ;
    run ;


[Plot of Y*D, symbol used is '*'. Vertical axis: Y, from 17.0 to 22.0. Horizontal axis: D (Sex : 1 for male and 0 for female), at 0 and 1. The observations with D = 0 lie between 17.0 and 19.0; those with D = 1 lie between 20.5 and 22.0.]

    Model: MODEL1
    Dependent Variable: Y    annual salary of a college professor

    Analysis of Variance

                        Sum of        Mean
    Source      DF     Squares      Square    F Value    Prob>F
    Model        1    26.89600    26.89600     55.342    0.0001
    Error        8     3.88800     0.48600
    C Total      9    30.78400

    Root MSE     0.69714    R-square    0.8737
    Dep Mean    19.64000    Adj R-sq    0.8579

    Parameter Estimates

                     Parameter      Standard    T for H0:
    Variable   DF     Estimate         Error    Parameter=0    Prob > |T|
    INTERCEP    1    18.000000    0.31176915         57.735        0.0001
    D           1     3.280000    0.44090815          7.439        0.0001


[Overlay plot from PROC REG: predicted value of Y (symbol 'p') and observed Y (symbol '*') against D, with '?' where symbols coincide. Vertical axis: 16 to 22. Horizontal axis: D (Sex : 1 for male and 0 for female), 0.0 to 1.0.]

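The fitted regression, with standard errors in parentheses, is

$$\hat{Y}_i = \underset{(0.31)}{18.00} + \underset{(0.44)}{3.28}\, D_i$$

with $R^2 = 0.8737$.

The mean salary of a female college professor is $E(Y_i \mid D_i = 0) = \beta_0$, estimated as 18.00, and the mean salary of a male college professor is $E(Y_i \mid D_i = 1) = \beta_0 + \beta_1$, estimated as 18.00 + 3.28 = 21.28. The coefficient $\beta_1$ is therefore the difference in mean salary between male and female professors.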
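As a cross-check on the SAS output, the following Python sketch (standard library only) fits Y on the single dummy with the usual simple-regression formulas and compares the result with the two group means. The helper names `fit_single_dummy` and `group_mean` are illustrative and not part of the original notes; the data are the ten observations from the SAS program.

```python
# (salary Y, dummy D) pairs from the SAS example; D = 1 for male, 0 for female.
data = [(22.0, 1), (19.0, 0), (18.0, 0), (21.7, 1), (18.5, 0),
        (21.0, 1), (20.5, 1), (17.0, 0), (17.5, 0), (21.2, 1)]


def fit_single_dummy(pairs):
    """OLS of Y on an intercept and one 0/1 dummy (simple-regression formulas)."""
    n = len(pairs)
    y_bar = sum(y for y, _ in pairs) / n
    d_bar = sum(d for _, d in pairs) / n
    s_dy = sum((d - d_bar) * (y - y_bar) for y, d in pairs)
    s_dd = sum((d - d_bar) ** 2 for _, d in pairs)
    b1 = s_dy / s_dd          # slope on the dummy
    b0 = y_bar - b1 * d_bar   # intercept
    return b0, b1


def group_mean(pairs, level):
    """Sample mean of Y among observations with D equal to `level`."""
    values = [y for y, d in pairs if d == level]
    return sum(values) / len(values)


b0, b1 = fit_single_dummy(data)
mean_female = group_mean(data, 0)
mean_male = group_mean(data, 1)

print("intercept:", round(b0, 4), " female mean:", round(mean_female, 4))
print("intercept + slope:", round(b0 + b1, 4), " male mean:", round(mean_male, 4))
```

The intercept should coincide with the female sample mean and the intercept plus slope with the male sample mean, which is the interpretation of $\beta_0$ and $\beta_0 + \beta_1$ given above.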

Regression on One Quantitative Variable and One Qualitative Variable with Two Classes

For example,

$$Y_i = \beta_0 + \beta_1 D_i + \beta_2 X_i + u_i$$

where $Y$ = annual salary of a college professor, $X$ = years of teaching experience, and

$$D_i = \begin{cases} 1 & \text{if male college professor} \\ 0 & \text{otherwise} \end{cases}$$

Then the mean salary of a female college professor is

$$E(Y_i \mid D_i = 0) = \beta_0 + \beta_2 X_i$$

and the mean salary of a male college professor is

$$E(Y_i \mid D_i = 1) = \beta_0 + \beta_1 + \beta_2 X_i$$
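Subtracting the two conditional means makes the role of $\beta_1$ explicit. The short derivation below only rearranges the two expressions above, holding experience $X_i$ fixed:

```latex
\begin{aligned}
E(Y_i \mid D_i = 1, X_i) - E(Y_i \mid D_i = 0, X_i)
  &= (\beta_0 + \beta_1 + \beta_2 X_i) - (\beta_0 + \beta_2 X_i) \\
  &= \beta_1
\end{aligned}
```

The two regression lines are parallel with common slope $\beta_2$, and $\beta_1$ is the male-female salary differential at every level of experience.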

Regression on One Quantitative Variable and One Qualitative Variable with More Than Two Classes

A qualitative variable may consist of more than two classes, e.g. race in Singapore.

Rule: the number of dummies should be one less than the number of categories of the variable. With an intercept in the model, a dummy for every category would be perfectly collinear with the intercept.

For example, race in Singapore consists of Chinese, Malay, Indian and others. Then we can define

$$D_{1i} = \begin{cases} 1 & \text{if Chinese} \\ 0 & \text{otherwise} \end{cases} \qquad
D_{2i} = \begin{cases} 1 & \text{if Malay} \\ 0 & \text{otherwise} \end{cases} \qquad
D_{3i} = \begin{cases} 1 & \text{if Indian} \\ 0 & \text{otherwise} \end{cases}$$

The model will be

$$Y_i = \beta_0 + \beta_1 D_{1i} + \beta_2 D_{2i} + \beta_3 D_{3i} + \beta_4 X_i + u_i$$

Then the mean salary of a Chinese professor is

$$E(Y_i \mid D_{1i} = 1, D_{2i} = 0, D_{3i} = 0) = \beta_0 + \beta_1 + \beta_4 X_i$$

The mean salary of a Malay professor is

$$E(Y_i \mid D_{1i} = 0, D_{2i} = 1, D_{3i} = 0) = \beta_0 + \beta_2 + \beta_4 X_i$$

The mean salary of an Indian professor is

$$E(Y_i \mid D_{1i} = 0, D_{2i} = 0, D_{3i} = 1) = \beta_0 + \beta_3 + \beta_4 X_i$$

The mean salary of a professor of other races is

$$E(Y_i \mid D_{1i} = 0, D_{2i} = 0, D_{3i} = 0) = \beta_0 + \beta_4 X_i$$
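The coding rule can be sketched in Python as follows. This is a minimal illustration that assumes "Others" is the omitted base category, as in the definitions above. The function names and the coefficient values in `beta` are hypothetical and chosen only to make the arithmetic easy to follow; they are not estimates from any data.

```python
RACES = ["Chinese", "Malay", "Indian", "Others"]
BASE = "Others"  # omitted (base) category, so only k - 1 = 3 dummies are created


def dummy_row(race, categories=RACES, base=BASE):
    """Return [D1, D2, D3] for one observation; the base category maps to all zeros."""
    if race not in categories:
        raise ValueError(f"unknown category: {race}")
    return [1 if race == c else 0 for c in categories if c != base]


def mean_salary(beta, race, x):
    """E(Y | race, X = x) for beta = (b0, b1, b2, b3, b4)."""
    b0, b1, b2, b3, b4 = beta
    d1, d2, d3 = dummy_row(race)
    return b0 + b1 * d1 + b2 * d2 + b3 * d3 + b4 * x


# Hypothetical coefficients, for illustration only.
beta = (30.0, 2.0, 1.0, 1.5, 0.5)

for race in RACES:
    print(race, dummy_row(race), mean_salary(beta, race, 10))
```

Each non-base group shifts the intercept by its own coefficient, while the slope on X is the same for all four groups.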


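Example: Use of Dummy Variables in Seasonal Analysis

Let

$$D_{2i} = \begin{cases} 1 & \text{if second quarter} \\ 0 & \text{otherwise} \end{cases} \qquad
D_{3i} = \begin{cases} 1 & \text{if third quarter} \\ 0 & \text{otherwise} \end{cases} \qquad
D_{4i} = \begin{cases} 1 & \text{if fourth quarter} \\ 0 & \text{otherwise} \end{cases}$$

The model is

$$\text{Profit}_i = \beta_1 + \beta_2 D_{2i} + \beta_3 D_{3i} + \beta_4 D_{4i} + \beta_5 \text{Sale}_i + u_i$$

Then the mean profit in Spring (the first quarter, which is the base category) is

$$E(\text{Profit}_i \mid D_{2i} = 0, D_{3i} = 0, D_{4i} = 0) = \beta_1 + \beta_5 \text{Sale}_i$$

and the mean profit in Summer (the second quarter) is

$$E(\text{Profit}_i \mid D_{2i} = 1, D_{3i} = 0, D_{4i} = 0) = \beta_1 + \beta_2 + \beta_5 \text{Sale}_i$$

and so on.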
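For reference, the remaining cases follow the same pattern. The block below simply writes out all four quarters from the model; only the quarter labels are added for readability:

```latex
\begin{aligned}
\text{First quarter:}  \quad & E(\text{Profit}_i \mid D_{2i}=0, D_{3i}=0, D_{4i}=0) = \beta_1 + \beta_5 \text{Sale}_i \\
\text{Second quarter:} \quad & E(\text{Profit}_i \mid D_{2i}=1, D_{3i}=0, D_{4i}=0) = \beta_1 + \beta_2 + \beta_5 \text{Sale}_i \\
\text{Third quarter:}  \quad & E(\text{Profit}_i \mid D_{2i}=0, D_{3i}=1, D_{4i}=0) = \beta_1 + \beta_3 + \beta_5 \text{Sale}_i \\
\text{Fourth quarter:} \quad & E(\text{Profit}_i \mid D_{2i}=0, D_{3i}=0, D_{4i}=1) = \beta_1 + \beta_4 + \beta_5 \text{Sale}_i
\end{aligned}
```

Each of $\beta_2$, $\beta_3$ and $\beta_4$ is thus the difference in mean profit between that quarter and the first quarter at a given level of sales.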


Interaction Effect

Consider the following model:

$$Y_i = \beta_0 + \beta_1 D_{1i} + \beta_2 D_{2i} + \beta_3 X_i + u_i$$

where $Y$ = annual expenditure on clothing, $X$ = income, and

$$D_{1i} = \begin{cases} 1 & \text{if female} \\ 0 & \text{if male} \end{cases} \qquad
D_{2i} = \begin{cases} 1 & \text{if college graduate} \\ 0 & \text{otherwise} \end{cases}$$

There may be interaction between $D_{1i}$ and $D_{2i}$. If so, the model becomes

$$Y_i = \beta_0 + \beta_1 D_{1i} + \beta_2 D_{2i} + \beta_3 D_{1i} D_{2i} + \beta_4 X_i + u_i$$

Then

$$E(Y_i \mid D_{1i} = 1, D_{2i} = 1) = (\beta_0 + \beta_1 + \beta_2 + \beta_3) + \beta_4 X_i$$

and so on.

If there are also interactions between $D_{1i}$ and $X$ and between $D_{2i}$ and $X$, the model becomes

$$Y_i = \beta_0 + \beta_1 D_{1i} + \beta_2 D_{2i} + \beta_3 D_{1i} D_{2i} + \beta_4 X_i + \beta_5 D_{1i} X_i + \beta_6 D_{2i} X_i + u_i$$

Note that the dummy variables in this model affect the slope as well as the intercept. For example,

$$E(Y_i \mid D_{1i} = 1, D_{2i} = 0) = \beta_0 + \beta_1 + \beta_4 X_i + \beta_5 X_i = (\beta_0 + \beta_1) + (\beta_4 + \beta_5) X_i$$

$$E(Y_i \mid D_{1i} = 0, D_{2i} = 1) = \beta_0 + \beta_2 + \beta_4 X_i + \beta_6 X_i = (\beta_0 + \beta_2) + (\beta_4 + \beta_6) X_i$$

and so on.
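The intercept and slope implied by the full interaction model for each of the four groups can be tabulated with a few lines of Python. This is only a sketch: `group_line` is an illustrative helper, and the coefficients in `beta` are hypothetical values chosen for easy arithmetic, not estimates.

```python
from itertools import product


def group_line(beta, d1, d2):
    """Intercept and slope of E(Y | D1 = d1, D2 = d2) as a function of X.

    beta = (b0, b1, b2, b3, b4, b5, b6) in the order used in the model:
    b3 multiplies D1*D2, b5 multiplies D1*X and b6 multiplies D2*X.
    """
    b0, b1, b2, b3, b4, b5, b6 = beta
    intercept = b0 + b1 * d1 + b2 * d2 + b3 * d1 * d2
    slope = b4 + b5 * d1 + b6 * d2
    return intercept, slope


# Hypothetical coefficients, for illustration only.
beta = (100.0, 20.0, 30.0, 5.0, 0.5, 0.25, 0.125)

for d1, d2 in product((0, 1), repeat=2):
    intercept, slope = group_line(beta, d1, d2)
    print(f"D1={d1}, D2={d2}: intercept={intercept}, slope={slope}")
```

With $D_{1i} = D_{2i} = 1$ the intercept collects $\beta_0 + \beta_1 + \beta_2 + \beta_3$ and the slope collects $\beta_4 + \beta_5 + \beta_6$, which is one of the cases covered by "and so on" above.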


Piecewise Linear Regression

Interaction effects can be used to model piecewise linear regression. Consider the following model:

$$Y_i = \beta_0 + \beta_1 X_i + \beta_2 (X_i - X^*) D_i + u_i$$

where $X^*$ = threshold value of $X$, also known as a knot, and

$$D_i = \begin{cases} 1 & \text{if } X_i \ge X^* \\ 0 & \text{if } X_i < X^* \end{cases}$$

Refer to Figures 15.8 and 15.9 for the knot.

When $X_i < X^*$, $E(Y_i \mid D_i = 0) = \beta_0 + \beta_1 X_i$. When $X_i \ge X^*$, $E(Y_i \mid D_i = 1) = (\beta_0 - \beta_2 X^*) + (\beta_1 + \beta_2) X_i$. To test the hypothesis of no knot effect, we simply test $H_0 : \beta_2 = 0$.
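A quick check shows why this specification produces a kink but no jump. Evaluating both branches at the knot $X_i = X^*$ gives the same value:

```latex
\begin{aligned}
\text{Lower branch at } X^*: \quad & \beta_0 + \beta_1 X^* \\
\text{Upper branch at } X^*: \quad & (\beta_0 - \beta_2 X^*) + (\beta_1 + \beta_2) X^* = \beta_0 + \beta_1 X^*
\end{aligned}
```

The regression function is therefore continuous at $X^*$, and the only change is in the slope, from $\beta_1$ below the knot to $\beta_1 + \beta_2$ above it.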


If there is a "jump" at the knot, we can use the following model:

$$Y_i = \beta_0 + \beta_1 D_i + \beta_2 X_i + \beta_3 X_i D_i + u_i$$

To test that there is no jump and no knot effect, we simply test $H_0 : \beta_1 = \beta_3 = 0$.

For example, the following program first fits the piecewise linear model with a knot at $X^* = 5500$:

    DATA a ;
      INPUT Y X @@ ;
      label Y = 'Total Cost, dollars'
            X = 'Total Output' ;
      if X >= 5500 then D = 1 ; else D = 0 ;
      X1 = (X - 5500)*D ;
    CARDS;
    256 1000  414 2000  634 3000  778 4000  1003 5000
    1839 6000  2081 7000  2423 8000  2734 9000  2914 10000
    ;
    proc plot ; plot Y * X = '*' ; run ;
    proc reg data = a ;
      model Y = X X1 ;
      test X1 ;
      PLOT p.*x = 'p' y*x = '*' / overlay ;
    run ;

The output is

[Plot of Y*X, symbol used is '*'. Vertical axis: Total Cost, dollars, from 0 to 3000. Horizontal axis: Total Output, from 1000 to 10000.]


    Model: MODEL1
    Dependent Variable: Y    Total Cost, dollars

    Analysis of Variance

                            Sum of            Mean
    Source      DF         Squares          Square    F Value
    Model        2    8832644.8985    4416322.4492    129.608
    Error        7    238521.50152     34074.50022
    C Total      9       9071166.4

    Root MSE     184.59280    R-square    0.9737
    Dep Mean    1507.60000    Adj R-sq    0.9662

    Parameter Estimates

                      Parameter        Standard    T for H0:      Variable
    Variable   DF      Estimate           Error    Parameter=0    Label
    INTERCEP    1   -145.716667    176.73414648         -0.824    Intercept
    X           1      0.279126      0.04600814          6.067    Total Output
    X1          1      0.094500      0.08255241          1.145

    Dependent Variable: Y
    Test:    Numerator:     44651.2500    DF: 1    F value:  1.3104
             Denominator:      34074.5    DF: 7    Prob>F:   0.2899

[Overlay plot from PROC REG: predicted value of Y (symbol 'p') and observed Y (symbol '*') against X (Total Output), with '?' where symbols coincide. Vertical axis: 0 to 3000. Horizontal axis: 1000 to 10000.]
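The piecewise fit above can be reproduced without SAS. The Python sketch below (standard library only) solves the normal equations for the regression of Y on X and X1 and forms the F statistic for $H_0 : \beta_2 = 0$ by comparing error sums of squares with and without X1. The helpers `solve` and `ols` are illustrative names and a deliberately simple implementation, not the algorithm PROC REG uses.

```python
# (Y, X) pairs from the SAS example: total cost and total output.
DATA = [(256, 1000), (414, 2000), (634, 3000), (778, 4000), (1003, 5000),
        (1839, 6000), (2081, 7000), (2423, 8000), (2734, 9000), (2914, 10000)]
KNOT = 5500.0  # threshold X* used in the SAS DATA step


def solve(a, b):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Return (coefficients, SSE) from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    sse = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return beta, sse


y = [float(cost) for cost, _ in DATA]
x = [float(output) for _, output in DATA]
x1 = [(xi - KNOT) if xi >= KNOT else 0.0 for xi in x]  # (X - X*) * D

beta_full, sse_full = ols([[1.0, xi, x1i] for xi, x1i in zip(x, x1)], y)
_, sse_restricted = ols([[1.0, xi] for xi in x], y)  # model under H0: no knot

df_error = len(y) - 3
f_knot = (sse_restricted - sse_full) / (sse_full / df_error)  # one restriction

print("intercept, X, X1:", [round(b, 6) for b in beta_full])
print("F for H0: beta2 = 0:", round(f_knot, 4))
```

The coefficients and the F statistic should agree with the Parameter Estimates and Test sections of the output above, up to rounding.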

To see the jump effect as well as the knot effect, we use the following:

    DATA a ; set a ; XD = X * D ; run ;
    proc reg data = a ;
      model Y = D X XD ;
      test D , XD ;
    run ;

The output is

    Model: MODEL1
    Dependent Variable: Y    Total Cost, dollars

    Analysis of Variance

                         Sum of         Mean
    Source      DF      Squares       Square     F Value
    Model        3    9062580.9    3020860.3    2111.136
    Error        6   8585.50000   1430.91667
    C Total      9    9071166.4

    Root MSE      37.82746    R-square    0.9991
    Dep Mean    1507.60000    Adj R-sq    0.9986
    C.V.           2.50912

    Parameter Estimates

                     Parameter        Standard    T for H0:
    Variable   DF     Estimate           Error    Parameter=0
    INTERCEP    1    59.600000     39.67377387          1.502
    D           1    96.200000    104.96693924          0.916
    X           1     0.185800      0.01196209         15.532
    XD          1     0.094500      0.01691695          5.586

    Dependent Variable: Y
    Test:    Numerator:    137293.6258    DF: 2    F value:  95.9480
             Denominator:     1430.917    DF: 6    Prob>F:    0.0001
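Because the model with D, X and XD lets both the intercept and the slope differ across the two regimes, its least-squares fit is equivalent to fitting a separate straight line on each side of the threshold. The Python sketch below uses that equivalence (rather than the matrix computation PROC REG performs) to recover the four coefficients and the joint F statistic for $H_0 : \beta_1 = \beta_3 = 0$. `simple_ols` is an illustrative helper name.

```python
# (Y, X) pairs from the SAS example: total cost and total output.
DATA = [(256, 1000), (414, 2000), (634, 3000), (778, 4000), (1003, 5000),
        (1839, 6000), (2081, 7000), (2423, 8000), (2734, 9000), (2914, 10000)]
KNOT = 5500.0  # threshold used to define D in the SAS DATA step


def simple_ols(points):
    """Fit y = a + b*x by least squares; return (a, b, SSE)."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((y - a - b * x) ** 2 for x, y in points)
    return a, b, sse


points = [(float(x), float(y)) for y, x in DATA]
low = [p for p in points if p[0] < KNOT]    # observations with D = 0
high = [p for p in points if p[0] >= KNOT]  # observations with D = 1

a_low, b_low, sse_low = simple_ols(low)
a_high, b_high, sse_high = simple_ols(high)
_, _, sse_restricted = simple_ols(points)   # single line: H0 imposed

beta0, beta2 = a_low, b_low   # intercept and slope when D = 0
beta1 = a_high - a_low        # jump in the intercept (coefficient on D)
beta3 = b_high - b_low        # change in the slope (coefficient on XD)

sse_full = sse_low + sse_high
df_error = len(points) - 4
f_stat = ((sse_restricted - sse_full) / 2) / (sse_full / df_error)

print("INTERCEP, D, X, XD:", [round(v, 4) for v in (beta0, beta1, beta2, beta3)])
print("SSE:", round(sse_full, 4), " F:", round(f_stat, 3))
```

The values should match the Parameter Estimates, the Error sum of squares and the Test section of the output above, up to rounding.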
