‘”regression” analysis
Word file:Introduction (relevance of technique
and
purpose of the exercise)
Conclusion (interpretations of all results and recommendations)
Excel file:Infant Mortality
1/ Do questions of the exercise
2/ Check assumptionsof the model
3/ Try some interactionsto find the best mode
Word is based on excel, do Introduction (relevance of technique and purpose of the exercise)
and
Conclusion (interpretations of all results and recommendations)
Sheet1
Infant Mortality(deaths per thousand births) | %age adult literacy | %age finishing primary school | GNP per capita | ||||||||||
Cuba | 18 | 98 | 20 | ||||||||||
Sri Lanka | 85 | 92 | 3300 | ||||||||||
Costa Rica | 19 | 94 | 84 | 58 | |||||||||
Vietnam | 44 | 60 | |||||||||||
China | 54 | 80 | 86 | 2400 | |||||||||
South Africa | 56 | 76 | 68 | 4000 | |||||||||
Saudi Arabia | 38 | 59 | 11000 | ||||||||||
Brazil | 78 | 5600 | |||||||||||
Zimbawe | 82 | 1800 | |||||||||||
Morocco | 42 | 3400 | |||||||||||
Pakistan | 36 | 2100 | |||||||||||
Nigeria | 1600 |
Sheet2
Sheet3
> (typo)
.83 +0.0022 GNP +4. 9 Unemp +21.42 Educa
31 46 3
6636
455
262570359
07 s
21 .031863 04
.6772877101
2
275112247
20 Standard Error 35 02647
03
-280.0125369343 50.3422364049 078
0.0002893159 0.0043061078 195245732
919952639
310444103
-5.9919952639 310444103 >5% -5.4481651767 48.2935617842 >5% 361699264
46177
Finland 308801233
588490055
343469586
438233859
560006075
538752485
Sales/Capita GNP per head Unemployment rate %age spend on education 0.0022977119 4.2195245732 21.4226983038 4.2 5.9 3.5 4.4 8.4 1971 9.9 10 5.7 9.1 9.9 3.9 6.3 4.3 5 4.4 5 14.4 6.3 5.9 7 4.4 3.6 5.6 6.3 7.6 8.6 5.9 4.8 Regression Statistics 20 ANOVA 65
0.00005932 < 5%
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0% -121.0133353092 56.5958130387 >5% 0.0006258048 0.002731028 <5%
-5.9878520746 4.932117783 >5% 1.2443190793 29.2109701347 <5%
RESIDUAL OUTPUT Observation Residuals Country Austria Belgium Bulgaria Czech Rep. Denmark France Germany Greece Hungary Ireland Italy Netherlands Poland Portugal Romania Spain Switzerland Sweden Turkey UK 4.2 5.8 8.1 5.9 13.5 3.5 6.6 4.4 5.2 8.4 10 5.7 9.1 4.6 9.9 3.9 7.3 5.1 6.3 4.3 9.3 5 4.4 5 14.4 5.6 $68.09 6.3 5.9 7 3.3 14.2 4.4 3.6 5.6 6.3 7.6 8.6 3.7 5.9 4.8 Regression Statistics 74% and Skewness
uals
Change the Y ANOVA <5% Stat. Sig
Total 19 55534.6871274582 Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0% -97.1147612378 20.1542388226 >5% 0.0008091927 0.0026371438 <5%
1.8157269048 28.8037527661 <5%
check range in book p. 196 Nomality is not violated 23.7377575957 27.1897541428 SUMMARY OUTPUT 6.1274225713 36.3524587116 Regression Statistics 27.092039838 Multiple R 29.701569802 R Square 25.8205843078 Adjusted R Square 3.2111779389 Standard Error 21.2534351845 Observations 20 0.4207380729 38.8889406757 ANOVA 2.0968122206 df SS MS F Significance F 3.8008900498 Regression 1 950.4041552658 23.2487629095 Residual 18 4.1955655227 Total 19 16.360454741 7.5580143036 Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0% 62.3250374937 Intercept -14.0198979087 24.2758926574 2.6874614327 Predicted Sales/Capita 0.0779681637 -0.018856947 0.3229428277 52.0520088849
135.785376643295 132.991198238111 26.3905804660167 64.5004824648947 197.164767110683 125.483476102523 108.53937218657 75.8004636531507 66.153435184493 131.534374436555 104.393185157875 129.327430203661 71.1201625817398 91.3422208534079 28.0673452754208 89.5725809732703 158.278014303639 167.392353810684 35.2940689103188 110.033237016713 23.737757595676 27.1897541428409 6.12742257127982 36.3524587115759 27.0920398379561 29.7015698020384 25.820584307782 3.21117793886496 21.253435184493 0.420738072919022 38.8889406757017 2.096812220581 3.80089004983917 23.248762 26 4.19556552270128 16.3604547410154 7.55801430363917 62.3250374936639 2.68746143268912 52.0520088849264 Predicted Y
2
Regression
1
SUMMARY OUTPUT
p.1
8
3
Model: Predicted Sales/cap = –
11
4
9
21
Regression Statistics
Multiple R
0.
7
10
6
5
R Square
0.5345
16
53.40%
average
Adjusted R Square
0.45237
25
Standard Error
5
8.4
116.8525
14
18
Threashold for residuals
Observation
ANOVA
df
SS
MS
F
Significance F
Regression 3
666
38
13
222
12
6.5070594887
0.00394022
19
< 5%
Statiscally sig.
Residual
17
58031.6676908
20
341
3.6
Total
124669.699553951
Coefficients
t Stat
P-value
Lower 95%
Upper 95%
Lower 95.0%
Upper 95.0%
Intercept
-11
4.8
15
78.28996
449
-1.4667927238
0.1606877493
-280.0125369343
50.3422364049
>5%
GNP per head
0.0022977119
0.0009519296
2.413741341
0.0273548383
0.0002893159
0.00430
61
<5%
Unemployment rate
4.2
4.8400058964
0.8718015357
0.395462657
–
5.9
1
4.4
14.4
%age spend on education
21.4226983038
12.736119575
1.6820428057
0.1108365431
-5.4481651767
48.2935617842
these coefficients give you the model
RESIDUAL OUTPUT
Observation
Predicted
Sales/Capita
Residuals
Country
1
141.1050110113
-29.0573919637
Austria
2
15
3.9
6.2447824546
Belgium
3
32.1578881741
-11.8947302793
Bulgaria
4
54.7672884559
46.0856527206
Czech Rep.
5
229.7909035734
-59.7181763006
Denmark
6
179.8
1971
192.0670778351
Finland
Outlier
7
151.7406303742
-55.9587240737
France
8
124.2402274129
-41.5214395341
Germany
9
83.2552007781
-10.6659150638
Greece
10
60.6308801233
-1
5.7
Hungary
11
142.7851158825
-11.6714795188
Ireland
12
11
9.9
-54.4546045233
Italy
13
132.5310691088
-1.1068266846
Netherlands
14
97.7164232545
-22.7953706229
Poland
15
90.8053302087
-22.7118722648
Portugal
16
6.7651463829
25.4977644152
Romania
17
120.2673826729
-1
4.3
Spain
18
168.3638233859
-1
7.6
Switzerland
19
193.9264924164
35.790898888
Sweden
20
2
3.5
14.4255297355
Turkey
21
112.9313706531
4
9.1
UK
data
Source Economist Pocket World in Figures 2011
Y=Sales/capita
X= GNP and Unemp and Educa
Country
Pop (millions)
Computer Sales
-114.8351502647
1 Austria 8.4
941.2
$112.05
$49,600
5.8
$141.11
2 Belgium
10.5
1681.9
$160.18
$47,090
8.1
$153.94
3 Bulgaria 7.6
154
$20.26
$6,550
13.5
$32.16
4 Czech Rep.
10.2
1028.7
$100.85
$20,670
6.6
$54.77
5 Denmark
5.5
935.4
$170.07
$62,120
5.2
$229.79
6 Finland
5.3
$371.89
$51,320
6.3
$179.82
7 France
61.9
5928.9
$95.78
$44,510
$151.74
8 Germany
82.5
6824.3
$82.72
$44,450
4.6
$124.24
9 Greece
11.2
813
$72.59
$31,670
$83.26
10 Hungary 10 449
$44.90
$15,410
7.3
5.1
$60.63
11 Ireland 4.4
576.9
$131.11
$60,460
$142.79
12 Italy
58.9
3858.2
$65.50
$38,490
9.3
$119.96
13 Netherlands
16.5
2168.5
$131.42
$52,960
$132.53
14 Poland 38
2847
$74.92
$13,850
5.6
$97.72
15 Portugal
10.7
72
8.6
$68.09
$22,920
$90.81
16 Romania
21.3
687.2
$32.26
$9,300
3.3
$6.77
17 Spain
44.8
4745.8
$105.93
$35,220
14.2
$120.27
18 Switzerland
7.5
1130.4
$150.72
$64,430
$168.36
19 Sweden
9.2
2113.4
$229.72
$51,950
$193.93
20 Turkey
75.8
2879
$37.98
$9,940
3.7
$23.56
21 UK 61
9887.2
$162.09
$43,540
$112.93
Regression2
SUMMARY OUTPUT
Model: Predicted sales/cap=-32.2+0.0016GNP-0.527Unemp+15.22Educ
Multiple R
0.8608056365
R Square
0.7409863438
74%
closer to 1
Adjusted R Square
0.6924212833
69%
Standard Error
29.9835812953
59.9671625906
Observations
df SS MS F Significance F Regression 3
41150.4447707765
13716.8149235922
15.2576015709
0.00005932
Stat. Sig
Residual 16
14384.2423566816
899.0151472926
Total 19
55534.6871274582
Intercept
-32.2087611353
41.8908213525
-0.7688739465
0.4531689934
-121.0133353092
56.5958130387
GNP per head
0.0016784164
0.0004965371
3.3802436956
0.0038159552
0.0006258048
0.002731028
Unemployment rate
-0.5278671458
2.5755796407
-0.2049508147
0.8401951569
-5.9878520746
4.932117783
%age spend on education
15.227644607
6.5962029214
2.3085470215
0.0346578879
1.2443190793
29.2109701347
Predicted Sales/Capita
1
137.1439896071
-25.0963705595
2
132.3952470051
27.7857053758
3
24.9554160194
-4.6922581247
4
66.0018192089
34.8511219676
5
197.2217719175
-27.1490446448
6
124.0164561638
-28.2345498633
7
107.6404225426
-24.9216346638
8
75.108615856
-2.5193301417
9
67.4631931047
-22.5631931047
10
131.4216039245
-0.3079675608
11
103.6225451408
-38.1183006587
12
130.4957796366
0.9284627877
13
68.7108290703
6.2102235612
14
92.7780831901
-24.6846252462
15
29.9566686787
2.3062421195
16
86.410987695
19.5220480193
17
159.3060963628
-8.5860963628
18
167.3895074617
62.3278838426
(Typo in p. 184)
19
36.2773255918
1.7042047512
20
110.8477673956
51.237478506
Finland out
Eliminate Finland: Outlier
2007 Source Economist Pocket world in Figures
Country Pop (millions) Computer Sales Sales/Capita GNP per head Unemployment rate %age spend on education
1 Austria 8.4 941.2 $112.05
49600
2 Belgium 10.5 1681.9 $160.18
47090
3 Bulgaria 7.6 154 $20.26
6550
4 Czech Rep. 10.2 1028.7 $100.85
20670
5 Denmark 5.5 935.4 $170.07
62120
7 France 61.9 5928.9 $95.78
44510
8 Germany 82.5 6824.3 $82.72
44450
9 Greece 11.2 813 $72.59
31670
10 Hungary 10 449 $44.90
15410
11 Ireland 4.4 576.9 $131.11
60460
12 Italy 58.9 3858.2 $65.50
38490
13 Netherlands 16.5 2168.5 $131.42
52960
14 Poland 38 2847 $74.92
13850
15 Portugal 10.7
728.6
22920
16 Romania 21.3 687.2 $32.26
9300
17 Spain 44.8 4745.8 $105.93
35220
18 Switzerland 7.5 1130.4 $150.72
64430
19 Sweden 9.2 2113.4 $229.72
51950
20 Turkey 75.8 2879 $37.98
9940
21 UK 61 9887.2 $162.09
43540
Regression3
SUMMARY OUTPUT
Model: Predicted Sales/cap=-38.48 + 0.0017GNP +15.31Educ
Multiple R
0.8604105733
Check assumptions on the best model
Issue or not
Solutions if problem
R Square
0.7403063547
close to 1 Good
Normality
Check range for
kurtosis
No problem
Change the Y
Adjusted R Square
0.7097541611
71%
better adjusted Rsquare
Heteroscedasticity
Check graph of predicted Y with
abs resid
There is heteroscedasticity issue
Standard Error
29.1265043448
58.2530086896
Autocorrelation
(we don’t have time series data)
Observations 20
Multicollinearity
Check correlation between all Xs (close to 1 highly correlated – not good) OR check VIF (must be lower than 5 for each X)
No problem
Eliminate the X with highest p-value
df SS MS F Significance F Regression 2
41112.6817865343
20556.3408932671
24.2308740653
0.000010542
Residual 17
14422.0053409239
848.3532553485
Intercept
-38.4802612076
27.79129164
-1.3846157892
0.1840755133
-97.1147612378
20.1542388226
GNP per head
0.0017231683
0.0004332016
3.9777511063
0.0009730057
0.0008091927
0.0026371438
%age spend on education
15.3097398355
6.3958258117
2.3937080662
0.0284865891
1.8157269048
28.8037527661
skewness
0.8317848946
check range in book p. 196
between -1.02 and 1.03
Nomality is not violated
RESIDUAL OUTPUT kurtosis
0.1798015122
between -1.27 and 2.46
Observation Predicted Sales/Capita Residuals abs resid
1
135.7853766433
–
23.7377575957
2
132.9911982381
27.1897541428
3
26.390580466
–
6.1274225713
4
64.5004824649
36.3524587116
5
197.1647671107
–
27.092039838
0.4031625579
6
125.4834761025
–
29.701569802
0.1625400481
p-value>.05 so no significant heteroscedacity
7
108.5393721866
–
25.8205843078
0.1160144952
8
75.8004636532
–
3.2111779389
16.4937627235
9
66.1534351845
–
21.2534351845
10
131.5343744366
–
0.4207380729
11
104.3931851579
–
38.8889406757
12
129.3274302037
2.0968122206
13
71.1201625817
3.8008900498
950.4041552658
3.4935651067
0.0779681637
14
91.3422208534
–
23.2487629095
4896.7957580359
272.0442087798
15
28.0673452754
4.1955655227
5847.1999133016
16
89.5725809733
16.360454741
17
158.2780143036
–
7.5580143036
18
167.3923538107
62.3250374937
5.1279973743
9.114043699
0.562647881
0.5806154642
-14.0198979087
24.2758926574
19
35.2940689103
2.6874614327
0.1520429404
0.0813451828
1.8691081046
-0.018856947
0.3229428277
20
110.0332370167
52.0520088849
sample size = 20
90948
Abs Resid
Reg Interaction
0.8615115717 | |||||||||
0.7422021881 | |||||||||
0.6938650984 | |||||||||
29.9131250736 | |||||||||
41217.9663007715 | 13739.3221002572 | 15.3547139925 | 0.0000571751 | ||||||
14316.7208266866 | 894.7950516679 | ||||||||
-4.6235008111 | 102.7456889769 | -0.0449994628 | 0.964664582 | -222.4346313453 | 213.1876297232 | ||||
0.0017559593 | 0.0004550554 | 3.858781671 | 0.0013893627 | 0.000791285 | 0.0027206336 | ||||
2.5031306672 | 37.9082099165 | 0.0660313603 | 0.9481709009 | -77.8586844188 | 82.8649457531 | ||||
EducSquare | 1.0998013269 | 3.2062233847 | 0.3430208052 | 0.7360482386 | -5.6970886167 | 7.8966912705 | |||
133.9875545043 | -21.9399354566 | ||||||||
131.1171773043 | 29.0637750766 | ||||||||
29.1115561314 | -8.8483982367 | ||||||||
63.978106348 | 36.8748348285 | ||||||||
203.0849695454 | -33.0122422727 | ||||||||
123.534637123 | -27.7527308225 | ||||||||
108.2150867978 | -25.496298919 | ||||||||
77.4779177031 | -4.8886319888 | ||||||||
63.8076307709 | -18.9076307709 | ||||||||
132.6405862956 | -1.526949932 | ||||||||
102.9740587886 | -37.4698143065 | ||||||||
128.3827897221 | 3.0414527022 | ||||||||
68.2038367103 | 6.7172159213 | ||||||||
88.6756412531 | -20.5821833092 | ||||||||
31.9440882422 | 0.3188225559 | ||||||||
89.5273140246 | 16.4057216896 | ||||||||
157.0202576235 | -6.3002576235 | ||||||||
169.1469020429 | 60.5704892614 | ||||||||
37.1485981704 | 0.8329321727 | ||||||||
109.1854164715 | 52.8998294302 |
Reg Interaction 2
Then after trying Interactions we need to check assumptions | |||||||||
0.8606615376 | |||||||||
0.7407382823 | |||||||||
0.6921267102 | |||||||||
29.9979357384 | |||||||||
41136.6687503818 | 13712.2229167939 | 15.2379001695 | 0.000059774 | ||||||
14398.0183770764 | 899.8761485673 | ||||||||
-28.846752865 | 65.5808169087 | -0.4398657142 | 0.665918108 | -167.871874157 | 110.1783684269 | ||||
0.0015026329 | 0.001422549 | 1.0562960951 | 0.3065220677 | -0.0015130362 | 0.004518302 | ||||
13.1513869246 | 14.7700751885 | 0.8904075813 | 0.386440352 | -18.1597737374 | 44.4625475866 | ||||
GNP*Educ | 0.0000473736 | 0.0002901615 | 0.1632661944 | 0.8723532506 | -0.0005677412 | 0.0006624884 | |||
135.590307996 | -23.5426889483 | ||||||||
132.6672559418 | 27.5136964391 | ||||||||
28.111385676 | -7.8482277812 | ||||||||
64.3873019352 | 36.4656392413 | ||||||||
199.6883538608 | -29.6156265881 | ||||||||
125.0173459979 | -29.2354396974 | ||||||||
108.1281306401 | -25.4093427613 | ||||||||
75.8832902901 | -3.2940045758 | ||||||||
65.1040286646 | -20.2040286646 | ||||||||
130.8694787326 | 0.2441576311 | ||||||||
103.8635631458 | -38.3593186636 | ||||||||
129.0341380661 | 2.3901043582 | ||||||||
69.2867726566 | 5.634279975 | ||||||||
89.5930074123 | -21.4995494684 | ||||||||
29.981204445 | 2.2817063532 | ||||||||
89.2834657683 | 16.649569946 | ||||||||
158.7084100041 | -7.9884100041 | ||||||||
167.8695945581 | 61.8477967462 | ||||||||
36.4918544827 | 1.4896758603 | ||||||||
109.6052352992 | 52.4800106024 |
Reg Hetero
Model: predicted sales/cap = 2.96 + 2.299 GNP +0.132 Educ | |||||||
0.8801524423 | We improved Heteroscedasticity based on the graph | ||||||
0.7746683217 | |||||||
0.7481587125 | |||||||
0.3122437451 | 0.6244874901 | ||||||
5.6980986183 | 2.8490493092 | 29.2221705608 | 0.0000031549 | ||||
1.6574346575 | 0.0974961563 | ||||||
7.3555332758 | |||||||
2.9600956906 | 0.2979299156 | 9.9355436814 | 0.000000017 | 2.3315185135 | 3.5886728676 | ||
0.0000229976 | 0.000004644 | 4.9520705368 | 0.0001211517 | 0.0000131995 | 0.0000327956 | ||
0.1320067348 | 0.0685649256 | 1.9252807997 | 0.0710914359 | -0.0126526133 | 0.276666083 | ||
Predicted | Ln(sales/per cap) | ||||||
4.8664151216 | – | 0.1474911706 | |||||
4.821891849 | 0.2544122795 | ||||||
3.5727534645 | – | 0.5639491094 | |||||
4.0162854696 | 0.5973779583 | ||||||
5.4975624351 | – | 0.361336282 | |||||
4.7361567248 | – | 0.1740829272 | |||||
4.5895694612 | – | 0.1741227039 | |||||
4.2032555711 | 0.0815617602 | ||||||
3.9877228706 | – | 0.1832850759 | |||||
4.918158826 | – | 0.0420944256 | |||||
4.5053065302 | – | 0.3231915884 | |||||
4.8380816297 | 0.0403489526 | ||||||
4.0178500007 | 0.2985849269 | ||||||
4.266040145 | – | 0.0451590019 | |||||
3.6095954847 | – | 0.1356771868 | |||||
4.3509003762 | 0.3119067797 | ||||||
5.181068006 | – | 0.1656441951 | |||||
5.1580715764 | 0.2787782436 | ||||||
3.677116635 | – | 0.0400166371 | |||||
4.595043003 | 0.4930794031 |
4.86641512164656 4.82189184903841 3.57275346447401 4.01628546958415 5.49756243505022 4.7361567248122 4.58956946120262 4.20325557109386 3.98772287064779 4.91815882598398 4.50530653015494 4.83808162973825 4.01785000065614 4.26604014503506 3.60959548473841 4.35090037621422 5.18106800597221 5.15807157635807 3.6771166350487 0.147491170642622 0.254412279531499 0.563949109352671 0.59737795826261 0.36133628199686 0.174082927188254 0.174122703852715 0.0815617601528986 0.183285075899581 0.042094425645347 0.323191588358695 0.04034895263064 0.298584926895512 0.0451590019284591 0.135677186783743 0.311906779722502 0.165644195144155 0.278778243593867 0.0400166370848507 0.493079403088434
Unemployment out
0.574 is Pearson Correlation | |||
4.718923951 | 287680 | 33.64 | |
5.0763041286 | 277831 | 34.81 | |
3.0088043551 | 22925 | 12.25 | 0.5741560564 |
4.6136634278 | 19.36 | close to 1 means highly correlated | |
5.1362261531 | 521808 | 70.56 | |
4.5620737976 | 253707 | 32.49 | |
4.4154467573 | 204470 | 21.16 | |
4.2848173312 | 123513 | 15.21 | |
3.8044377947 | 78591 | 26.01 | |
4.8760644003 | 259978 | 18.49 | |
4.1821149418 | 192450 | ||
4.8784305824 | 264800 | ||
4.3164349276 | 77560 | 31.36 | |
4.2208811431 | 135228 | ||
3.473918298 | 30690 | 10.89 | |
4.6628071559 | 154968 | ||
5.0154238108 | 360808 | ||
5.43684982 | 394820 | 57.76 | |
3.637099998 | 36778 | 13.69 | |
5.0881224061 | 208992 | 23.04 |
Time series
Country1 | Year 2018 | ||||
Year 2019 | |||||
Year 2020 | |||||
Country2 | This kind of data is called time series data | ||||
Country3 | |||||
Country 4 | |||||
Country 5 | |||||
Country24 |