Statistic
1. (6) It has been suggested that the average Facebook user spends 6 hours a week on the site. If we
assume a normal distribution, with a population standard deviation of 3 hours…
i.(3) What percentage of users spend more than 9 hours on Facebook?
ii.(3) What percentage of users spend between 2 and 4 hours on Facebook?
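One way to check both parts numerically is sketched below. It assumes Python's standard-library statistics.NormalDist stands in for the z table; the variable names are illustrative and not part of the assignment.

```python
from statistics import NormalDist

# Mean and population standard deviation taken from the problem statement
hours = NormalDist(mu=6, sigma=3)

# Part i: upper tail beyond 9 hours, i.e. z = (9 - 6) / 3 = 1
p_more_than_9 = 1 - hours.cdf(9)

# Part ii: area between 2 and 4 hours, i.e. z from about -1.33 to about -0.67
p_between_2_and_4 = hours.cdf(4) - hours.cdf(2)

print(f"P(X > 9)     = {p_more_than_9:.4f}")
print(f"P(2 < X < 4) = {p_between_2_and_4:.4f}")
```

Reading the same z-values from the standard normal table should give matching figures up to rounding.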
2. (6) On final exams, Sarah scored 90 in Economics and Martha scored 85 in Geography. We can assume
that the students’ scores within each subject are approximately “normal” in distribution. The mean scores
in Economics and Geography were both 72. The standard deviation in Economics was 12 points, while in
Geography it was 8.
Which of the two students has a better score, compared with his/her fellow students? Please show how you
arrived at your conclusion.
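A quick sketch of the comparison: converting each score to a z-score puts the two subjects on a common scale. The helper name z_score is an illustrative choice.

```python
def z_score(score, mean, sd):
    """Number of standard deviations a score lies above its class mean."""
    return (score - mean) / sd

sarah_z = z_score(90, 72, 12)   # Economics
martha_z = z_score(85, 72, 8)   # Geography

print(f"Sarah:  z = {sarah_z:.3f}")
print(f"Martha: z = {martha_z:.3f}")
print("Higher relative standing:", "Martha" if martha_z > sarah_z else "Sarah")
```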
3. (4) What is the difference between random sampling, stratified random sampling, and cluster sampling? How
might random sampling of the entire population in the US help us understand the current spread of COVID-19,
rather than just testing people who show symptoms?
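The three designs can be contrasted on a made-up population. Everything in this sketch is hypothetical: the 1,000 records, the four region labels that serve as strata or clusters, and the function names.

```python
import random

random.seed(0)

# Hypothetical population: 1,000 people, each tagged with one of four regions
population = [{"id": i, "region": f"R{i % 4}"} for i in range(1000)]

def simple_random_sample(pop, n):
    """Every individual has the same chance of selection."""
    return random.sample(pop, n)

def stratified_sample(pop, n_per_stratum, key):
    """Split the population into strata, then sample randomly within each one."""
    strata = {}
    for person in pop:
        strata.setdefault(person[key], []).append(person)
    return [p for group in strata.values()
            for p in random.sample(group, n_per_stratum)]

def cluster_sample(pop, n_clusters, key):
    """Pick whole clusters at random and keep everyone inside them."""
    clusters = sorted({person[key] for person in pop})
    chosen = set(random.sample(clusters, n_clusters))
    return [p for p in pop if p[key] in chosen]

srs = simple_random_sample(population, 100)
strat = stratified_sample(population, 25, "region")
clus = cluster_sample(population, 2, "region")
print(len(srs), len(strat), len(clus))
```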
4. A group of researchers at the University of Texas-Houston conducted a comprehensive study of
pregnant cocaine-dependent women (Journal of Drug Issues, Summer 1997). All the women in the study
used cocaine on a regular basis for more than a year. One of the many variables measured was birth
weight (in grams) of the baby delivered. For a sample of 16 cocaine-dependent women, the mean birth
weight was 2,971 grams and the standard deviation was 410 grams. Test the hypothesis that the true
mean birth weight of babies delivered by cocaine-dependent women is less than 3,100 grams. Use alpha =
.05.
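A sketch of the test as a lower-tailed one-sample t test (H0: mu = 3,100 against Ha: mu < 3,100), assuming the 410 grams is the sample standard deviation. The critical value is read from the t table included in this document (df = 15, right-tail area 0.05) and negated for the lower tail.

```python
from math import sqrt

# Figures from the problem statement
n, sample_mean, sample_sd = 16, 2971, 410
mu_0 = 3100                      # hypothesized mean under H0

standard_error = sample_sd / sqrt(n)
t_stat = (sample_mean - mu_0) / standard_error

# Table value for df = n - 1 = 15 at right-tail area 0.05, negated for a lower-tail test
t_crit = -1.75305

reject_h0 = t_stat < t_crit
print(f"t = {t_stat:.4f}, critical value = {t_crit}, reject H0: {reject_h0}")
```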
5. (3) Confidence bands for population means are smaller if you (1) know the standard deviation of the
population, instead of estimating it from the sample, or (2) if you decrease the level of confidence
required. In addition to these two possibilities, how else could you obtain a smaller confidence band?
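As a numerical illustration only, the sketch below assumes a known population standard deviation and the usual half-width z * sigma / sqrt(n), and shows how that half-width responds to the sample size n. The sigma of 3 and the sample sizes are arbitrary choices for the example.

```python
from math import sqrt
from statistics import NormalDist

def ci_half_width(sigma, n, confidence=0.95):
    """Half-width of a z-based confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sigma / sqrt(n)

# Same sigma and confidence level, increasing sample size
widths = {n: ci_half_width(sigma=3, n=n) for n in (25, 100, 400)}
for n, w in widths.items():
    print(f"n = {n:3d}: half-width = {w:.3f}")
```

Each fourfold increase in n halves the half-width, because n enters through its square root.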
6. (3) We can calculate confidence bands for means by using z-values from a normal distribution table (or t-values
from a t-distribution table), even if the population under study is not normally distributed. Briefly explain the
property of random samples that makes this possible.
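The property can be seen in a small simulation. This sketch assumes a strongly skewed population (exponential with mean 1, an arbitrary choice) and looks at the center and spread of many sample means; the function name and the trial count are illustrative.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(1)

def sample_means(n, trials=5000):
    """Means of `trials` random samples of size n from an exponential(1) population."""
    return [sum(random.expovariate(1.0) for _ in range(n)) / n
            for _ in range(trials)]

means_n40 = sample_means(40)
center = mean(means_n40)          # should sit near the population mean, 1
spread = stdev(means_n40)         # should sit near sigma / sqrt(n) = 1 / sqrt(40)

print(f"mean of sample means = {center:.3f}")
print(f"sd of sample means   = {spread:.3f} (theory: {1 / sqrt(40):.3f})")
```

A histogram of means_n40 would look roughly bell-shaped even though the individual observations are far from normal.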
1. The “Q3” tab in the midterm excel data sheet contains data on the maximum August temperatures for Denver
between 1958 and 2017. Calculate the means and the standard deviations for the entire sample, and then
separately for 1958-1987 and 1988-2017. Give the equations used to calculate the values.
2. Transparency International compiles an index of corruption for over 150 world nations. The data for
these nations for 2018 is provided in the midterm excel file. The 24 nations for whom oil and gas make up
more than 50% of export revenues are separated from the other 127 nations. Note that higher values in
the index indicate lower levels of corruption.
i.(4) Find the means and variances of the corruption index for each of the two groups of countries.
ii.(8) Construct a 95% confidence interval for the true mean difference between the two groups.
iii.(8) Using t-stat, determine whether the difference in mean corruption between the two groups of countries is
statistically significant at the 95% confidence level. Give your answer in the form of a classical hypothesis test.
3. A “Happiness” statistic of different countries was compiled by the World Value Survey. The
“Happiness (net)” statistic was calculated as the percentage of people who rated themselves as either
“quite happy” or “very happy” minus the percentage of people who rated themselves as either “not very
happy” or “not at all happy”. Now we want to know if reading comprehension affects happiness. Conduct
a regression analysis for predicting “Happiness” from “Reading Comprehension” by answering the
following questions. You can use the regression tool in the data analysis toolpak, rather than
performing regression manually in the spreadsheet.
i. (2) State the dependent and independent variables and explain your selection.
ii. (2) Make a scatterplot of the variables and comment on the relationship between X and Y evident from
the scatterplot.
iii. (5) Perform a linear regression on the dataset. What is the regression equation? What is the r²? What
does the r² mean?
iv. (2) Add a regression line to the scatterplot.
v. (2) Provide a .05 level of significance test for slope being 0.
vi. (5) Show and comment on the residual plot. Are there any apparent violations of regression
assumptions or outliers?
vii. (2) Calculate the expected net happiness of Hungary, which has a reading comprehension index of
470.
13. (20) In the worksheet, Q12 lists a data set that you want to investigate for the relationship between
body weight and brain weight of some animals. Show and comment on the scatterplot, discussing the apparent
relationship between the two variables. Conduct a regression analysis (in data analysis toolpak) for the two
variables and report the results. In particular, comment on the strength of the relationship (i.e., coefficient
of determination), the regression coefficients, their significance, and analyze the residual plot for violations
of regression assumptions and outliers. Try to improve your initial regression by performing log
transformations and/or eliminating outlier(s). After each change, perform a new regression analysis and report
the results, making sure to discuss all of the points raised above. Pay particular attention to changes in the
coefficient of determination and in the residual plot. After performing 2 regression analyses with transformed
variables and/or removed outliers, compare your results and determine the best regression equation. Explain why
you chose that particular result.
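For the regression questions, the quantities the toolpak reports (slope, intercept, r²) can be sketched by hand. The example below is not the full analysis: it uses only six (reading comprehension, happiness) pairs copied from the Q12 data (Albania, Australia, Canada, Georgia, Norway, Tunisia) to keep it short, and the function name fit_line is an illustrative choice.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, plus r squared."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared

# Illustrative subset of the Q12 rows, not the whole data set
reading = [405, 503, 527, 401, 513, 361]
happiness = [4.644, 7.284, 7.316, 4.286, 7.537, 4.805]

slope, intercept, r_squared = fit_line(reading, happiness)
predicted_470 = intercept + slope * 470   # same form as the Hungary prediction

print(f"happiness = {intercept:.3f} + {slope:.4f} * reading, r^2 = {r_squared:.3f}")
print(f"prediction at reading = 470: {predicted_470:.3f}")
```

Running the same function on all rows of the Q12 table, or on log-transformed body and brain weights from Q13, follows the same pattern.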
Q3
Year  Maximum August Temperature
2017  95
2016  97
2015  98
2014  91
2013  99
2012  98
2011  99
2010  97
2009  98
2008  104
2007  99
2006  96
2005  97
2004  96
2003  98
2002  100
2001  95
2000  98
1999  91
1998  95
1997  91
1996  98
1995  101
1994  99
1993  94
1992  95
1991  94
1990  98
1989  93
1988  97
1987  98
1986  98
1985  97
1984  90
1983  93
1982  93
1981  98
1980  100
1979  99
1978  94
1977  93
1976  91
1975  95
1974  94
1973  94
1972  98
1971  94
1970  98
1969  100
1968  92
1967  93
1966  96
1965  97
1964  95
1963  96
1962  100
1961  96
1960  98
1959  97
1958  95
Q5
Corruption data: https://www.transparency.org/cpi2018

Oil & Gas >50% export value
No.  Country                             Corruption Index
1    Angola                              19
2    Azerbaijan                          25
3    Bahrain                             36
4    Chad                                19
5    Congo, Democratic Republic of the   20
6    Ecuador                             34
7    Gabon                               31
8    Iran                                28
9    Iraq                                18
10   Kazakhstan                          31
11   Kuwait                              41
12   Libya                               17
13   Nigeria                             27
14   Norway                              84
15   Oman                                52
16   Qatar                               62
17   Saudi Arabia                        49
18   Sudan                               16
19   Syria                               13
20   Trinidad and Tobago                 41
21   Turkmenistan                        20
22   United Arab Emirates                70
23   Venezuela                           18
24   Yemen                               14

Oil & Gas <50% export value
No.  Country                             Corruption Index
1    Afghanistan                         16
2    Albania                             36
3    Algeria                             35
4    Argentina                           40
5    Armenia                             35
6    Australia                           77
7    Austria                             76
8    Bangladesh                          26
9    Barbados                            68
10   Belarus                             44
11   Belgium                             75
12   Benin                               40
13   Bolivia                             29
14   Bosnia and Herzegovina              38
15   Botswana                            61
16   Brazil                              35
17   Bulgaria                            42
18   Burkina Faso                        41
19   Burundi                             17
20   Cambodia                            20
21   Cameroon                            25
22   Canada                              81
23   Chile                               67
24   China                               39
25   Colombia                            36
26   Costa Rica                          56
27   Côte d’Ivoire                       35
28   Croatia                             48
29   Cuba                                47
30   Cyprus                              59
31   Czech Republic                      59
32   Denmark                             88
33   Dominican Republic                  30
34   Egypt                               35
35   El Salvador                         35
36   Eritrea                             24
37   Estonia                             73
38   Ethiopia                            34
39   Finland                             85
40   France                              72
41   Georgia                             58
42   Germany                             80
43   Ghana                               41
44   Greece                              45
45   Guatemala                           27
46   Guyana                              37
47   Haiti                               20
48   Honduras                            29
49   Hong Kong                           76
50   Hungary                             46
51   Iceland                             76
52   India                               41
53   Indonesia                           38
54   Ireland                             73
55   Israel                              61
56   Italy                               52
57   Jamaica                             44
58   Japan                               73
59   Jordan                              49
60   Kenya                               27
61   Korea, South                        57
62   Kyrgyzstan                          29
63   Laos                                29
64   Latvia                              58
65   Lebanon                             28
66   Lesotho                             41
67   Liberia                             32
68   Lithuania                           59
69   Luxembourg                          81
70   Macedonia, Republic of              37
71   Madagascar                          25
72   Malawi                              32
73   Malaysia                            47
74   Mali                                32
75   Malta                               54
76   Mauritius                           51
77   Mexico                              28
78   Moldova                             33
79   Mongolia                            37
80   Morocco                             43
81   Mozambique                          23
82   Myanmar                             29
83   Namibia                             53
84   Nepal                               31
85   Netherlands                         82
86   New Zealand                         87
87   Nicaragua                           25
88   Niger                               34
89   Pakistan                            33
90   Panama                              37
91   Papua New Guinea                    28
92   Paraguay                            29
93   Peru                                35
94   Philippines                         36
95   Poland                              60
96   Portugal                            64
97   Romania                             47
98   Russia                              28
99   Rwanda                              56
100  Senegal                             45
101  Serbia                              39
102  Sierra Leone                        30
103  Singapore                           85
104  Slovakia                            50
105  Slovenia                            60
106  Somalia                             10
107  South Africa                        43
108  Spain                               58
109  Sri Lanka                           38
110  Suriname                            43
111  Sweden                              85
112  Switzerland                         85
113  Taiwan                              63
114  Tajikistan                          25
115  Tanzania                            36
116  Thailand                            36
117  Tunisia                             43
118  Turkey                              41
119  Uganda                              26
120  Ukraine                             32
121  United Kingdom                      80
122  United States                       71
123  Uruguay                             70
124  Uzbekistan                          23
125  Vietnam                             33
126  Zambia                              35
127  Zimbabwe                            22
Q12
Reading Comprehension Data: http://www.businessinsider.com/pisa-worldwide-ranking-of-math-science-reading-skills-2016-12
Happiness Index Data: http://worldhappiness.report/ed/2017/
Reading Comprehension (cell comment, Qingling Wu): Percentage of 13-year-old students who come from households with at least one computer, 2002
Happiness (cell comment, Qingling Wu): This statistic is compiled from responses to the survey question: “Taking all things together, would you say you are: very happy, quite happy, not very happy, or not at all happy?”. The “Happiness (net)” statistic was obtained via the following formula: the percentage of people who rated themselves as either “quite happy” or “very happy” minus the percentage of people who rated themselves as either “not very happy” or “not at all happy”.

Country               Reading Comprehension  Happiness
Albania               405                    4.644
Algeria               350                    5.872
Argentina             425                    6.599
Australia             503                    7.284
Austria               485                    7.006
Belgium               499                    6.891
Brazil                407                    6.635
Bulgaria              432                    4.714
Canada                527                    7.316
Chile                 459                    6.652
China                 494                    5.273
Colombia              425                    6.357
Costa Rica            427                    7.079
Croatia               487                    5.293
Cyprus                443                    5.621
Czech Republic        487                    6.609
Denmark               500                    7.522
Dominican Republic    358                    5.230
Estonia               519                    5.611
Finland               526                    7.469
France                499                    6.442
Georgia               401                    4.286
Germany               509                    6.951
Greece                467                    5.227
Hong Kong             527                    5.472
Hungary               470                    5.324
Iceland               482                    7.504
Indonesia             397                    5.262
Ireland               521                    6.977
Israel                479                    7.213
Italy                 485                    5.964
Japan                 516                    5.920
Jordan                408                    5.336
Kazakhstan            427                    5.819
Korea                 517                    5.838
Kosovo                347                    5.279
Latvia                488                    5.850
Lebanon               347                    5.225
Lithuania             472                    5.902
Luxembourg            481                    6.863
Macedonia             352                    5.175
Malaysia              431                    6.084
Malta                 447                    6.527
Mexico                423                    6.578
Moldova               416                    5.838
Montenegro            427                    5.237
Netherlands           503                    7.377
New Zealand           509                    7.314
Norway                513                    7.537
Peru                  398                    5.715
Poland                506                    5.973
Portugal              498                    5.195
Qatar                 402                    6.375
Romania               434                    5.825
Russia                495                    5.963
Singapore             535                    6.572
Slovak Republic       453                    6.098
Slovenia              505                    5.758
Spain                 496                    6.403
Sweden                500                    7.284
Switzerland           492                    7.494
Taiwan                497                    6.422
Thailand              409                    6.424
Trinidad and Tobago   427                    6.168
Tunisia               361                    4.805
Turkey                428                    5.500
UAE                   434                    6.648
United Kingdom        498                    6.714
United States         497                    6.993
Uruguay               437                    6.454
Vietnam               487                    5.074
Q13
Animal            Body Weight (kg)  Brain Weight (g)
Brachiosaurus     87000             154.5
Rat               0.28              1.9
Beaver            1.35              8.1
Cow               465               423
Grey Wolf         36.33             119.5
Goat              27.66             115
Guinea Pig        1.04              5.5
Diplodocus        11700             50
Asian elephant    2547              4603
Donkey            187.1             419
Horse             521               655
Potar monkey      10                115
Cat               3.3               25.6
Giraffe           529               680
Gorilla           207               406
Human             62                1320
African elephant  6654              5712
Triceratops       9400              70
Rhesus monkey     6.8               179
Kangaroo          35                56
Hamster           0.12              1
Mouse             0.02              0.4
Rabbit            2.5               12.1
Sheep             55.5              175
Jaguar            100               157
Chimpanzee        52.16             440
Mole              0.12              3
Pig               192               180
Source: Jerison, H. J. Evolution of the Brain and Intelligence. New York: Academic Press, 1973.
ztable
Standard Normal Distribution Table
Probability (x <= Z)
Mean = 0
Standard Deviation = 1

Z     0       0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
-3    0.0013  0.0013  0.0013  0.0012  0.0012  0.0011  0.0011  0.0011  0.0010  0.0010
-2.9  0.0019  0.0018  0.0017  0.0017  0.0016  0.0016  0.0015  0.0015  0.0014  0.0014
-2.8  0.0026  0.0025  0.0024  0.0023  0.0023  0.0022  0.0021  0.0021  0.0020  0.0019
-2.7  0.0035  0.0034  0.0033  0.0032  0.0031  0.0030  0.0029  0.0028  0.0027  0.0026
-2.6  0.0047  0.0045  0.0044  0.0043  0.0041  0.0040  0.0039  0.0038  0.0037  0.0036
-2.5  0.0062  0.0060  0.0059  0.0057  0.0055  0.0054  0.0052  0.0051  0.0049  0.0048
-2.4  0.0082  0.0080  0.0078  0.0075  0.0073  0.0071  0.0069  0.0068  0.0066  0.0064
-2.3  0.0107  0.0104  0.0102  0.0099  0.0096  0.0094  0.0091  0.0089  0.0087  0.0084
-2.2  0.0139  0.0136  0.0132  0.0129  0.0125  0.0122  0.0119  0.0116  0.0113  0.0110
-2.1  0.0179  0.0174  0.0170  0.0166  0.0162  0.0158  0.0154  0.0150  0.0146  0.0143
-2    0.0227  0.0222  0.0217  0.0212  0.0207  0.0202  0.0197  0.0192  0.0188  0.0183
-1.9  0.0287  0.0281  0.0274  0.0268  0.0262  0.0256  0.0250  0.0244  0.0239  0.0233
-1.8  0.0359  0.0351  0.0344  0.0336  0.0329  0.0322  0.0314  0.0307  0.0301  0.0294
-1.7  0.0446  0.0436  0.0427  0.0418  0.0409  0.0401  0.0392  0.0384  0.0375  0.0367
-1.6  0.0548  0.0537  0.0526  0.0516  0.0505  0.0495  0.0485  0.0475  0.0465  0.0455
-1.5  0.0668  0.0655  0.0643  0.0630  0.0618  0.0606  0.0594  0.0582  0.0571  0.0559
-1.4  0.0808  0.0793  0.0778  0.0764  0.0749  0.0735  0.0721  0.0708  0.0694  0.0681
-1.3  0.0968  0.0951  0.0934  0.0918  0.0901  0.0885  0.0869  0.0853  0.0838  0.0823
-1.2  0.1151  0.1131  0.1112  0.1093  0.1075  0.1056  0.1038  0.1020  0.1003  0.0985
-1.1  0.1357  0.1335  0.1314  0.1292  0.1271  0.1251  0.1230  0.1210  0.1190  0.1170
-1    0.1587  0.1562  0.1539  0.1515  0.1492  0.1469  0.1446  0.1423  0.1401  0.1379
-0.9  0.1841  0.1814  0.1788  0.1762  0.1736  0.1711  0.1685  0.1660  0.1635  0.1611
-0.8  0.2119  0.2090  0.2061  0.2033  0.2005  0.1977  0.1949  0.1922  0.1894  0.1867
-0.7  0.2420  0.2389  0.2358  0.2327  0.2296  0.2266  0.2236  0.2206  0.2177  0.2148
-0.6  0.2743  0.2709  0.2676  0.2643  0.2611  0.2578  0.2546  0.2514  0.2483  0.2451
-0.5  0.3085  0.3050  0.3015  0.2981  0.2946  0.2912  0.2877  0.2843  0.2810  0.2776
-0.4  0.3446  0.3409  0.3372  0.3336  0.3300  0.3264  0.3228  0.3192  0.3156  0.3121
-0.3  0.3821  0.3783  0.3745  0.3707  0.3669  0.3632  0.3594  0.3557  0.3520  0.3483
-0.2  0.4207  0.4168  0.4129  0.4090  0.4052  0.4013  0.3974  0.3936  0.3897  0.3859
-0.1  0.4602  0.4562  0.4522  0.4483  0.4443  0.4404  0.4364  0.4325  0.4286  0.4247
0     0.5000  0.4960  0.4920  0.4880  0.4840  0.4801  0.4761  0.4721  0.4681  0.4641
0     0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
0.1   0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
0.2   0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
0.3   0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
0.4   0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
0.5   0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
0.6   0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
0.7   0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
0.8   0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
0.9   0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
1     0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
1.1   0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
1.2   0.8849  0.8869  0.8888  0.8907  0.8925  0.8944  0.8962  0.8980  0.8997  0.9015
1.3   0.9032  0.9049  0.9066  0.9082  0.9099  0.9115  0.9131  0.9147  0.9162  0.9177
1.4   0.9192  0.9207  0.9222  0.9236  0.9251  0.9265  0.9279  0.9292  0.9306  0.9319
1.5   0.9332  0.9345  0.9357  0.9370  0.9382  0.9394  0.9406  0.9418  0.9429  0.9441
1.6   0.9452  0.9463  0.9474  0.9484  0.9495  0.9505  0.9515  0.9525  0.9535  0.9545
1.7   0.9554  0.9564  0.9573  0.9582  0.9591  0.9599  0.9608  0.9616  0.9625  0.9633
1.8   0.9641  0.9649  0.9656  0.9664  0.9671  0.9678  0.9686  0.9693  0.9699  0.9706
1.9   0.9713  0.9719  0.9726  0.9732  0.9738  0.9744  0.9750  0.9756  0.9761  0.9767
2     0.9773  0.9778  0.9783  0.9788  0.9793  0.9798  0.9803  0.9808  0.9812  0.9817
2.1   0.9821  0.9826  0.9830  0.9834  0.9838  0.9842  0.9846  0.9850  0.9854  0.9857
2.2   0.9861  0.9864  0.9868  0.9871  0.9875  0.9878  0.9881  0.9884  0.9887  0.9890
2.3   0.9893  0.9896  0.9898  0.9901  0.9904  0.9906  0.9909  0.9911  0.9913  0.9916
2.4   0.9918  0.9920  0.9922  0.9925  0.9927  0.9929  0.9931  0.9932  0.9934  0.9936
2.5   0.9938  0.9940  0.9941  0.9943  0.9945  0.9946  0.9948  0.9949  0.9951  0.9952
2.6   0.9953  0.9955  0.9956  0.9957  0.9959  0.9960  0.9961  0.9962  0.9963  0.9964
2.7   0.9965  0.9966  0.9967  0.9968  0.9969  0.9970  0.9971  0.9972  0.9973  0.9974
2.8   0.9974  0.9975  0.9976  0.9977  0.9977  0.9978  0.9979  0.9979  0.9980  0.9981
2.9   0.9981  0.9982  0.9983  0.9983  0.9984  0.9984  0.9985  0.9985  0.9986  0.9986
3     0.9987  0.9987  0.9987  0.9988  0.9988  0.9989  0.9989  0.9989  0.9990  0.9990
3.1   0.9990  0.9991  0.9991  0.9991  0.9992  0.9992  0.9992  0.9992  0.9993  0.9993
3.2   0.9993  0.9993  0.9994  0.9994  0.9994  0.9994  0.9994  0.9995  0.9995  0.9995
3.3   0.9995  0.9995  0.9995  0.9996  0.9996  0.9996  0.9996  0.9996  0.9996  0.9997
3.4   0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9998
ttable
The Shape of the Student’s t distribution is determined by the degrees of freedom. As shown in the animation above, its shape changes as the degrees of freedom increases. For more information on how this distribution is used in hypothesis testing, see t-test for independent samples and t-test for dependent samples in the chapter on Basic Statistics and Tables. See also, Student’s t Distribution. As indicated by the chart below, the areas given at the top of this table are the right tail areas for the t-value inside the table. To determine the 0.05 critical value from the t-distribution with 6 degrees of freedom, look in the 0.05 column at the 6 row: t(.05,6) = 1.943180.

t table with right tail probabilities
df\p  0.4       0.25      0.1       0.05      0.025     0.01      0.005     0.0005
1     0.32492   1         3.077684  6.313752  12.7062   31.82052  63.65674  636.6192
2     0.288675  0.816497  1.885618  2.919986  4.30265   6.96456   9.92484   31.5991
3     0.276671  0.764892  1.637744  2.353363  3.18245   4.5407    5.84091   12.924
4     0.270722  0.740697  1.533206  2.131847  2.77645   3.74695   4.60409   8.6103
5     0.267181  0.726687  1.475884  2.015048  2.57058   3.36493   4.03214   6.8688
6     0.264835  0.717558  1.439756  1.94318   2.44691   3.14267   3.70743   5.9588
7     0.263167  0.711142  1.414924  1.894579  2.36462   2.99795   3.49948   5.4079
8     0.261921  0.706387  1.396815  1.859548  2.306     2.89646   3.35539   5.0413
9     0.260955  0.702722  1.383029  1.833113  2.26216   2.82144   3.24984   4.7809
10    0.260185  0.699812  1.372184  1.812461  2.22814   2.76377   3.16927   4.5869
11    0.259556  0.697445  1.36343   1.795885  2.20099   2.71808   3.10581   4.437
12    0.259033  0.695483  1.356217  1.782288  2.17881   2.681     3.05454   4.3178
13    0.258591  0.693829  1.350171  1.770933  2.16037   2.65031   3.01228   4.2208
14    0.258213  0.692417  1.34503   1.76131   2.14479   2.62449   2.97684   4.1405
15    0.257885  0.691197  1.340606  1.75305   2.13145   2.60248   2.94671   4.0728
16    0.257599  0.690132  1.336757  1.745884  2.11991   2.58349   2.92078   4.015
17    0.257347  0.689195  1.333379  1.739607  2.10982   2.56693   2.89823   3.9651
18    0.257123  0.688364  1.330391  1.734064  2.10092   2.55238   2.87844   3.9216
19    0.256923  0.687621  1.327728  1.729133  2.09302   2.53948   2.86093   3.8834
20    0.256743  0.686954  1.325341  1.724718  2.08596   2.52798   2.84534   3.8495
21    0.25658   0.686352  1.323188  1.720743  2.07961   2.51765   2.83136   3.8193
22    0.256432  0.685805  1.321237  1.717144  2.07387   2.50832   2.81876   3.7921
23    0.256297  0.685306  1.31946   1.713872  2.06866   2.49987   2.80734   3.7676
24    0.256173  0.68485   1.317836  1.710882  2.0639    2.49216   2.79694   3.7454
25    0.25606   0.68443   1.316345  1.708141  2.05954   2.48511   2.78744   3.7251
26    0.255955  0.684043  1.314972  1.705618  2.05553   2.47863   2.77871   3.7066
27    0.255858  0.683685  1.313703  1.703288  2.05183   2.47266   2.77068   3.6896
28    0.255768  0.683353  1.312527  1.701131  2.04841   2.46714   2.76326   3.6739
29    0.255684  0.683044  1.311434  1.699127  2.04523   2.46202   2.75639   3.6594
30    0.255605  0.682756  1.310415  1.697261  2.04227   2.45726   2.75      3.646
inf   0.253347  0.67449   1.281552  1.644854  1.95996   2.32635   2.57583   3.2905

Calculator for p values given a t score (one- and two-tailed tests): http://faculty.vassar.edu/lowry/tabs.html#t