paper (due in 26 hours)
Paper 1: First Draft Prompt
Use the results from the first two assignments, what you have learned in class, and what you have learned from reading the papers and textbook for the course to answer the following questions.
• Did assigning a person to get an encouraging phone call increase their probability of voting?
AND
• Can we get an estimate of the causal effect of getting a call encouraging you to vote on your probability of voting with non-experimental data by using regression to adjust for differences between people who got a call and those who didn’t (why or why not)?
[NOTA BENE: This is a paper about whether or not a non-experimental research design can be used to generate credible causal estimates. This is done by comparing and contrasting experimental estimates and non-experimental regression adjustment estimates. The context has to do with voting, but the focus of the paper should be empirical designs, not voting.]
Structure of Paper
You can use the papers on the reading list as a guide for what a paper should look like. You can include your tables in the body of the paper or put them at the end of the paper. Your paper should be 10 pages long, not counting tables, and you will want to include the following sections:
1. Title: Your paper should have an informative title.
2. Abstract: One paragraph that sums up the key elements of the paper. Write this first.
3. Introduction: 1-1.5 pages that sum the paper up in more detail than the abstract. It should tell the reader why the questions the paper answers are important and cover data, econometric methods, results, and conclusion.
4. Data: Describe the data you use in the analysis and how it was generated. You may need to do some research online.
5. Empirical Methods: Describe the statistical methods used. Include the equations for the regressions you will run. Describe each part of the regression and what the coefficients will reveal. Describe the assumptions under which the results will generate causal effects.
6. Results: Describe and interpret your statistical findings. Be detailed. Discuss the robustness of your estimates.
7. Conclusion: Interpret your findings and how they answer the motivating questions.
Q1. Treatment: 14,870; control: 86,124; received and listened: 6,874.
Q2:
Q3. The p-values are all large, so we cannot reject the hypothesis that the control group
and the treatment group have the same mean for any of the sample characteristics. This
suggests that the randomization worked well: if it was implemented correctly, there should
be no systematic difference in characteristics between the two groups.
Q4. Going from not receiving the call to receiving the call increases the probability of
voting by 1.1 percentage points, and the change is statistically significant. An increase of
about 1 percentage point in turnout is large in practical terms: if you call 100 people, there
will be about one more voter among them.
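As a rough consistency check on the 1.1 percentage point figure, the sketch below converts the column (1) coefficients of the regression table in these notes (constant 0.381, treatment 0.0466) into predicted voting probabilities. It assumes those coefficients come from a logit model, which the notes do not state; the function name is illustrative only.

```python
import math

def logistic(x):
    """Inverse logit: map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Column (1) of the regression table; ASSUMED to be logit coefficients.
constant = 0.381
treat_coef = 0.0466

p_control = logistic(constant)               # predicted P(vote) without treatment
p_treated = logistic(constant + treat_coef)  # predicted P(vote) with treatment
effect = p_treated - p_control

print(f"control: {p_control:.4f}, treated: {p_treated:.4f}")
print(f"difference: {100 * effect:.2f} percentage points")
print(f"calls per additional voter: {1 / effect:.0f}")
```

Under the logit assumption the difference comes out at about 1.1 percentage points, which is in line with the answer above and with roughly one additional voter per hundred calls.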
Q5.
Q6. Adding the contact = 1 control variable changes the coefficient on treatment
dramatically. This suggests that being assigned to the treatment group does not by itself
increase the voting rate, but being assigned and answering the call does.
Adding the other control variables did not change the coefficient much; they only lowered
its magnitude. Together, this shows that the covariates and being assigned to the treatment
group are correlated to some degree.
variable | mean of control group | mean of treatment group | mean difference | p-value
newreg   | 0.0481399             | 0.0489576               | -0.00082        | 0.667442
age      | 55.79974              | 55.76005                | 0.039688        | 0.813659
female   | 0.5631061             | 0.5565458               | 0.00656         | 0.141801
vote00   | 0.7337211             | 0.73154                 | 0.002181        | 0.578586
vote98   | 0.5710255             | 0.5741089               | -0.00308        | 0.482903
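The balance check can be roughly reproduced from the table alone for the binary characteristics, since the variance of a 0/1 variable follows from its mean. The sketch below is only an approximation under stated assumptions: it uses the Q1 group sizes for every variable (the regression table suggests some observations are missing for some covariates), it uses an unpooled normal approximation rather than whatever test produced the p-values above, and it leaves out age because no standard deviation is reported. The function and variable names are illustrative.

```python
import math

# Group sizes from Q1; ASSUMED to apply to every variable (no missing values).
N_CONTROL = 86_124
N_TREATMENT = 14_870

# (control mean, treatment mean) for the binary characteristics in the table.
MEANS = {
    "newreg": (0.0481399, 0.0489576),
    "female": (0.5631061, 0.5565458),
    "vote00": (0.7337211, 0.73154),
    "vote98": (0.5710255, 0.5741089),
}

def two_proportion_p_value(p_c, p_t, n_c, n_t):
    """Two-sided p-value for H0: equal proportions (unpooled normal approximation)."""
    se = math.sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    z = (p_c - p_t) / se
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

p_values = {
    name: two_proportion_p_value(p_c, p_t, N_CONTROL, N_TREATMENT)
    for name, (p_c, p_t) in MEANS.items()
}
for name, p in p_values.items():
    print(f"{name}: p = {p:.3f}")
```

The resulting p-values should land close to, but not exactly on, the reported ones, and none falls below 0.05, which matches the reading that the treatment and control groups are balanced.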
Q7. It won’t generate the causal effect, as the last table shows. When nearly all of the
other factors are added to the regression model, the estimated treatment effect drops
substantially. Individuals are more likely to vote if they are newly registered, regardless
of whether they received an encouraging call.
             | (1)        | (2)        | (3)        | (4)        | (5)        | (6)        | (7)
VARIABLES    | vote02     | vote02     | vote02     | vote02     | vote02     | vote02     | vote02
1.treat_real | 0.0466**   | -0.165***  | -0.161***  | -0.119***  | -0.0916*** | -0.0434    | -0.0580**
             | (0.0182)   | (0.0235)   | (0.0238)   | (0.0242)   | (0.0249)   | (0.0280)   | (0.0290)
1.contact    |            | 0.470***   | 0.466***   | 0.379***   | 0.319***   | 0.253***   | 0.280***
             |            | (0.0340)   | (0.0344)   | (0.0351)   | (0.0359)   | (0.0401)   | (0.0414)
newreg       |            |            | -1.356***  | -1.019***  | -1.009***  | 0.716***   | 0.917***
             |            |            | (0.0323)   | (0.0332)   | (0.0336)   | (0.0383)   | (0.0387)
age          |            |            |            | 0.0218***  | 0.0244***  | 0.0184***  | 0.00848***
             |            |            |            | (0.000363) | (0.000378) | (0.000422) | (0.000451)
female       |            |            |            |            | -0.112***  | -0.133***  | -0.138***
             |            |            |            |            | (0.0137)   | (0.0153)   | (0.0159)
1.vote00     |            |            |            |            |            | 2.473***   | 1.971***
             |            |            |            |            |            | (0.0203)   | (0.0214)
1.vote98     |            |            |            |            |            |            | 1.357***
             |            |            |            |            |            |            | (0.0173)
Constant     | 0.381***   | 0.381***   | 0.446***   | -0.774***  | -0.776***  | -2.379***  | -2.187***
             | (0.00694)  | (0.00694)  | (0.00714)  | (0.0212)   | (0.0225)   | (0.0301)   | (0.0306)
Observations | 100,994    | 100,994    | 100,994    | 100,994    | 98,310     | 98,310     | 98,310
Standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
variable | mean of not answering | mean of answering | mean difference | p-value
newreg   | 0.0536518             | 0.0434972         | 0.0101546       | 0.0042193
age      | 53.52564              | 58.35918          | -4.833542       | 5.50E-55
female   | 0.5480307             | 0.5662047         | -0.018174       | 0.0280706
vote00   | 0.6898449             | 0.7800407         | -0.090196       | 2.45E-35
vote98   | 0.5427714             | 0.6105615         | -0.06779        | 7.18E-17
Q2. No. Table 1 does not suggest that the control group will provide a good counterfactual for the treatment group’s voting potential outcomes, because it shows that the difference between the treatment and comparison groups is significant. It does not say anything about the control group.
However, it might provide a counterfactual to some degree if we treat the people in the treatment group who didn’t receive the phone call as a control group.
Q3. Yes, it is different. The mean for those not answering in Table 1 is not much different from the previous table, but the mean for those answering and the mean difference are significantly different. The p-values are much smaller.
The reason is that answering the phone call will change voters’ behavior. The experiment might be successful.
Q4.
Q5. Adding covariates affects the estimated treatment effect because it increases the statistical significance of the estimate by accounting for variance in the outcome. As for the relationship between the covariates, it indicates an increase in the precision of the estimates. As for the outcome, it creates a predictive equation that is used in estimating the coefficients. Lastly, as for the treatment, the estimated effect is likely to be positive in general when the adjustment is applied in the right way.
Q6. a) Adding covariates in an RCT design increases the precision of the estimates, which improves their significance, while the regression adjustment design produces highly predictive results.
b) The point estimates for the two designs are positive and statistically significant because of the strong covariates added.
c) The two tables differ because of the possibility of bias in the results. Moreover, in Table, the level of bias is low compared to Table 1.
Q7. Adding covariates to the regression in Table 2 reduced the bias because it caused the estimates to become more similar to the correct causal estimate. I think it did so because the added covariates are significant, implying that the estimates achieved a better predictive value for the outcome. Additionally, the variable that reduced the bias the most is vote00, because it increased the precision of the regression equation and model.
Q8. I think that not all of the bias in the estimates was eliminated by adding the covariates to the regression, because some of the estimates were not statistically significant; hence the covariates could not eliminate the bias in general.
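To illustrate why adjusting for observed covariates can shrink bias without removing it, here is a minimal sketch on a purely hypothetical population. All numbers are invented for illustration and are not taken from the assignment data. Each person has an observed trait v (for example, past voting) and an unobserved trait u (for example, political interest); both make a person more likely to be contacted and more likely to vote. The sketch stratifies on v as a simple stand-in for regression adjustment and computes expected values exactly rather than simulating.

```python
from itertools import product

TRUE_EFFECT = 0.02  # hypothetical true effect of contact on P(vote)

def p_contact(u, v):
    """Hypothetical probability of being contacted; rises with u and v."""
    return 0.3 + 0.2 * u + 0.2 * v

def p_vote_untreated(u, v):
    """Hypothetical probability of voting without a call."""
    return 0.3 + 0.2 * u + 0.3 * v

def diff_in_means(cells):
    """E[Y | contact] - E[Y | no contact] over equally likely (u, v) cells."""
    w1 = sum(p_contact(u, v) for u, v in cells)
    w0 = sum(1 - p_contact(u, v) for u, v in cells)
    y1 = sum(p_contact(u, v) * (p_vote_untreated(u, v) + TRUE_EFFECT)
             for u, v in cells) / w1
    y0 = sum((1 - p_contact(u, v)) * p_vote_untreated(u, v)
             for u, v in cells) / w0
    return y1 - y0

# Four equally likely cells (u, v): u is unobserved, v is observed.
all_cells = list(product([0, 1], repeat=2))
naive = diff_in_means(all_cells)

# Adjust for the observed covariate: compare within each v stratum, then
# average the two strata (each holds half of the population).
adjusted = sum(diff_in_means([(u, v) for u in (0, 1)]) for v in (0, 1)) / 2

print(f"true: {TRUE_EFFECT:.3f}, naive: {naive:.3f}, adjusted for v: {adjusted:.3f}")
```

In this made-up example the unadjusted gap is 0.12, adjusting for v brings it down to about 0.062, and the true effect is 0.02. The remaining gap comes entirely from the unobserved trait u, which no amount of adjustment for observed covariates can remove.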