Introduction to Econometrics
Economics 113
UC Santa Cruz
Winter 2020
Assignment 2
1. The following table contains the minimum wage and unemployment rates for a sample of six
states.
state        unemp   minwage
Arizona        8        9
California    12       10
Nevada         5        6
New Mexico     6        5
Texas          3        5
Utah           6        7
a) Carefully graph unemployment and minimum wage, with unemployment on the vertical
axis. Draw a line that best fits the points and label the approximate intercept and slope.
b) Compute the variance of unemployment and minimum wage and compute their covariance.
c) Estimate the regression of a state’s unemployment rate on its minimum wage and write the
resulting regression equation.
d) Interpret the coefficient on minimum wage in a sentence.
e) How do your estimated intercept and slope in part c) compare to your approximations in
part a)?
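The hand calculations in Question 1 can be checked with a short script. The sketch below is in Python, which is an assumption of this example rather than a course requirement. It assumes the sample (n - 1) versions of the variance and covariance, and the function names are illustrative.

```python
# Data from the table in Question 1, in the order
# Arizona, California, Nevada, New Mexico, Texas, Utah.
unemp = [8, 12, 5, 6, 3, 6]
minwage = [9, 10, 6, 5, 5, 7]


def mean(values):
    return sum(values) / len(values)


def sample_cov(x, y):
    """Sample covariance with the n - 1 divisor; sample_cov(x, x) is the variance."""
    x_bar, y_bar = mean(x), mean(y)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (len(x) - 1)


def ols(x, y):
    """Return (intercept, slope) for the regression of y on x."""
    slope = sample_cov(x, y) / sample_cov(x, x)
    intercept = mean(y) - slope * mean(x)
    return intercept, slope


var_unemp = sample_cov(unemp, unemp)
var_minwage = sample_cov(minwage, minwage)
cov_xy = sample_cov(minwage, unemp)
intercept, slope = ols(minwage, unemp)

print("var(unemp) =", var_unemp)
print("var(minwage) =", var_minwage)
print("cov(minwage, unemp) =", cov_xy)
print("intercept =", intercept, "slope =", slope)
```

The slope is computed as the covariance divided by the variance of the regressor, so the choice of divisor (n or n - 1) cancels out and does not affect the regression coefficients.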
2. Use the unemployment and minimum wage data above to answer this question.
a) What is the predicted unemployment rate for a state with a minimum wage of 10 dollars?
How does this compare to the actual unemployment rate in California?
b) If a state increases its minimum wage from 10 dollars to 15 dollars, what is the predicted
change in the unemployment rate?
c) Compute the predicted unemployment rate and the prediction error for each state using the
regression you estimated in Question 1.
d) Compute the R-squared for your regression line. Interpret your R-squared in a sentence.
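Predicted values, prediction errors, and the R-squared in Question 2 can be checked the same way. This is a minimal Python sketch, assuming the prediction error is defined as actual minus predicted and that R-squared is computed as 1 - SSR/SST; the variable names are illustrative.

```python
# Data from the table in Question 1.
states = ["Arizona", "California", "Nevada", "New Mexico", "Texas", "Utah"]
unemp = [8, 12, 5, 6, 3, 6]
minwage = [9, 10, 6, 5, 5, 7]

n = len(unemp)
x_bar = sum(minwage) / n
y_bar = sum(unemp) / n

# OLS slope and intercept from deviations about the means.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(minwage, unemp))
sxx = sum((x - x_bar) ** 2 for x in minwage)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Fitted value and prediction error (actual minus predicted) for each state.
predicted = [intercept + slope * x for x in minwage]
residuals = [y - y_hat for y, y_hat in zip(unemp, predicted)]

# R-squared = 1 - (sum of squared residuals) / (total sum of squares).
ssr = sum(e ** 2 for e in residuals)
sst = sum((y - y_bar) ** 2 for y in unemp)
r_squared = 1 - ssr / sst

for state, y_hat, e in zip(states, predicted, residuals):
    print(f"{state:<12} predicted = {y_hat:6.2f}   error = {e:6.2f}")
print(f"R-squared = {r_squared:.3f}")
```

A useful sanity check is that the prediction errors from a regression with an intercept sum to zero, up to rounding.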
3. Use the unemployment and minimum wage data above to answer this question.
a) What assumption is required in order to use regression through the origin when regressing
unemployment on minimum wage? Do you think this is a reasonable assumption in this
case? Explain.
b) Compute the regression through the origin coefficients and write the resulting equation.
c) What is the predicted unemployment rate for a state with a minimum wage of 10 dollars?
Compare your answer to part 2 a).
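For regression through the origin, the least-squares slope has a closed form that is easy to verify numerically. The Python sketch below assumes the standard no-intercept formula, slope = sum(x*y) / sum(x*x); the names `slope_origin` and `predict_origin` are illustrative.

```python
# Data from the table in Question 1.
unemp = [8, 12, 5, 6, 3, 6]
minwage = [9, 10, 6, 5, 5, 7]

# With no intercept, least squares minimizes sum((y - b*x)**2),
# which gives b = sum(x*y) / sum(x*x).
slope_origin = sum(x * y for x, y in zip(minwage, unemp)) / sum(x * x for x in minwage)


def predict_origin(x):
    """Predicted unemployment rate from the no-intercept regression."""
    return slope_origin * x


print("slope through the origin =", slope_origin)
print("prediction at a minimum wage of 10 =", predict_origin(10))
```

Note that the formula uses raw sums rather than deviations from the means, which is what forces the fitted line through the point (0, 0).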
4. The following regression equation examines the relationship between house prices (in dollars)
and the number of parks in a town.
$\widehat{\text{houseprice}} = 180{,}000 + 2{,}200 \times \text{parks}$
a) Interpret the coefficient on the number of parks.
b) What is the predicted house price in a town with 15 parks? A town with 30 parks?
c) How many parks would there need to be in a city in order for the predicted house price to
be 500,000 dollars?
d) Suppose that a house in a town with 12 parks sells for 194,000 dollars. What is the error in
the predicted house price?
e) Pittsburgh has 1,000 parks. What is the predicted house price in Pittsburgh? Do you think
this is reasonable? If no, explain the shortcoming of the regression.
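The arithmetic in Question 4 amounts to evaluating the fitted line, inverting it, and taking a difference. A minimal Python sketch follows, assuming the equation reads predicted house price = 180,000 + 2,200 times the number of parks and that the error is actual minus predicted; the function names are illustrative.

```python
INTERCEPT = 180_000  # dollars, predicted price with zero parks
SLOPE = 2_200        # dollars per additional park


def predicted_price(parks):
    """Predicted house price (dollars) for a town with the given number of parks."""
    return INTERCEPT + SLOPE * parks


def parks_for_price(target_price):
    """Invert the fitted line: parks at which the prediction equals target_price."""
    return (target_price - INTERCEPT) / SLOPE


def prediction_error(actual_price, parks):
    """Actual price minus predicted price."""
    return actual_price - predicted_price(parks)


print(predicted_price(15), predicted_price(30))
print(parks_for_price(500_000))
print(prediction_error(194_000, 12))
```

Evaluating `predicted_price` far outside the range of the data used to fit the line is mechanically possible, which is exactly why the reasonableness of such a prediction has to be judged separately.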
5. You decide to estimate the regression using the number of parks per 1,000 households. You get
the following result.
$\widehat{\text{houseprice}} = 150{,}000 + 400{,}000 \times \text{parksperthousand}$
a) Interpret the coefficient on parks per thousand households.
b) What is the predicted house price in a city with 10 parks and 30,000 households?
c) What is the predicted house price in Pittsburgh if there are 1,000 parks and 1.5 million
people? How does this compare to your answer in part 4e)? Which estimate do you think
is more accurate?
Now you regress the natural log of house price on the number of parks.
$\widehat{\ln(\text{houseprice})} = 12.72 + 0.05 \times \text{parks}$
d) Interpret the coefficient on parks in a sentence.
e) What is the expected difference in price for a house in a city with 20 parks and a city with
25 parks?
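Two steps in Question 5 are easy to get wrong: converting a park count into parks per 1,000 households, and turning a difference in log prices into a percent difference. The Python sketch below illustrates both, assuming the coefficients 150,000 and 400,000 for the per-thousand regression and 12.72 and 0.05 for the log regression; the function names are illustrative.

```python
import math


def parks_per_thousand(parks, households):
    """Number of parks per 1,000 households."""
    return parks / (households / 1_000)


def predicted_price_per_thousand(parks, households):
    """Predicted price (dollars) from the parks-per-1,000-households regression."""
    return 150_000 + 400_000 * parks_per_thousand(parks, households)


def predicted_log_price(parks):
    """Predicted natural log of house price from the log regression."""
    return 12.72 + 0.05 * parks


# Difference in predicted log price between cities with 25 and 20 parks.
log_diff = predicted_log_price(25) - predicted_log_price(20)

# The usual approximation reads 100 * (log difference) as a percent difference.
approx_pct = 100 * log_diff

# The exact percent difference implied by the same log difference.
exact_pct = 100 * (math.exp(log_diff) - 1)

print(parks_per_thousand(10, 30_000))
print(predicted_price_per_thousand(10, 30_000))
print(approx_pct, exact_pct)
```

The approximate and exact percent differences are close for small log differences and drift apart as the difference grows, so it is worth stating which one an answer reports.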
6. The equation below presents the results of a regression of a baseball team’s wins on its
payroll (in millions of dollars) and whether or not it has a new coach (a binary variable equal to 1 if the
coach is new and 0 otherwise).
$\widehat{\text{wins}} = 52.5 + 0.22 \times \text{payroll} - 4.2 \times \text{newcoach}$
a) Interpret the coefficient on “payroll” in a sentence. Be precise.
b) Interpret the coefficient on “newcoach” in a sentence. Be precise.
c) What is the predicted win total for a team with a payroll of $110 million and
a new coach?
d) How much does a team need to increase its payroll to offset the negative effect of having
a new coach?
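A regression with one continuous and one binary regressor can be evaluated like any other linear equation. The Python sketch below assumes the coefficients 52.5, 0.22, and -4.2 apply to the intercept, payroll (in millions of dollars), and newcoach respectively; the names `predicted_wins` and `offset_payroll` are illustrative.

```python
def predicted_wins(payroll_millions, newcoach):
    """Predicted wins; newcoach is 1 for a new coach and 0 otherwise."""
    return 52.5 + 0.22 * payroll_millions - 4.2 * newcoach


# Payroll increase (in millions of dollars) whose effect on predicted wins
# is equal in size to the new-coach coefficient.
offset_payroll = 4.2 / 0.22

print(predicted_wins(110, 1))
print(offset_payroll)
```

As a check on the offset, `predicted_wins(p + offset_payroll, 1)` should equal `predicted_wins(p, 0)` for any payroll `p`, up to floating-point rounding.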
7. The Stata data set “college_gpa” has data on students’ college GPAs, high school GPAs, ACT
scores, lectures skipped during the academic year, and other characteristics. We wish to examine
the predictors of college GPAs. You must submit your do-file (commands).
a) Regress college GPA on high school GPA and write the estimated regression line.
b) Interpret the intercept of your regression in a sentence. Does this coefficient make sense?
Explain.
c) Interpret the coefficient on high school GPA in a sentence. Does this coefficient make
sense? Explain.
d) What is the predicted college GPA of a student with a high school GPA of 3.8?
e) What high school GPA is needed for a student to have a predicted college GPA of 4.0?
f) What percent of the variation in college GPAs is predicted by high school GPA?
8. Using the same data, we will examine the relationship between being male, skipping classes,
and college GPA.
a) What percent of individuals in the data are male?
b) Regress college GPA on male and write the regression line. Interpret the coefficient on
male in a sentence.
c) You think that the negative effect of being male on college GPA might be strongest for
students with weaker ACT scores. Estimate the effect of being male on college GPA for
students who have an ACT score of 22 or lower. How does the estimate compare to the
one you found in part b)?
d) Regress college GPA on the number of lectures skipped and write the resulting regression
equation. What is the predicted GPA for a student who skips 40 lectures?
e) Make a new variable called “skipmany” that equals 1 if a student skipped 50 lectures or
more and 0 otherwise. Now regress college GPA on “skipmany” and write the resulting
regression equation. Interpret the coefficient on “skipmany” in a sentence.
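The logic behind a binary regressor such as “skipmany” can be illustrated outside Stata. The Python sketch below uses made-up numbers, not the college_gpa data, and the variable names `skipped` and `colgpa` are hypothetical. It shows that when the only regressor is a 0/1 variable, the intercept equals the mean outcome of the 0 group and the slope equals the difference between the two group means.

```python
# Hypothetical values for illustration only; NOT taken from the college_gpa data set.
skipped = [5, 12, 50, 63, 20, 55]
colgpa = [3.6, 3.2, 2.7, 2.5, 3.1, 2.9]

# Binary indicator: 1 if the student skipped 50 or more lectures, 0 otherwise.
skipmany = [1 if s >= 50 else 0 for s in skipped]


def mean(values):
    return sum(values) / len(values)


# OLS of colgpa on skipmany, computed from deviations about the means.
x_bar, y_bar = mean(skipmany), mean(colgpa)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(skipmany, colgpa))
         / sum((x - x_bar) ** 2 for x in skipmany))
intercept = y_bar - slope * x_bar

# Group means for comparison with the regression coefficients.
mean_0 = mean([g for g, d in zip(colgpa, skipmany) if d == 0])
mean_1 = mean([g for g, d in zip(colgpa, skipmany) if d == 1])

print("intercept =", intercept, " mean of skipmany = 0 group =", mean_0)
print("slope =", slope, " difference in group means =", mean_1 - mean_0)
```

The assignment itself still calls for a Stata do-file; this sketch is only meant to make the interpretation of the coefficient concrete.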