Statistic

While this is older news, the 2016 election was an interesting one because all of the election polls favored Hillary Clinton. The result went the other way, and Trump was declared the winner by some very narrow margins.

Read at least two of the articles and write about why the election polls went wrong and what (if anything) could be done to have more accurate election polls. Do you think the election would have turned out the same if voters had known that Trump was in the lead? Will this information change how you view polls and statistics you hear and read about in the news? Make sure to cite which sources you use.

https://www.telegraph.co.uk/news/2016/11/09/how-wrong-were-the-polls-in-predicting-the-us-election/

https://www.chronicle.com/article/academic-pollsters-didnt-see-all-those-trump-voters-coming-either-why-not/

https://www.wpr.org/polls-missed-mark-2016-experts-say-things-are-different-2020

The Polls Missed Trump. We Asked Pollsters Why.

https://www.usatoday.com/story/news/politics/elections/2016/2016/11/09/pollsters-donald-trump-hillary-clinton-2016-presidential-election/93523012/

https://www.huffpost.com/entry/pollsters-and-forecasters-had-a-rough-night_n_5822c343e4b0e80b02cdee13

https://www.pewresearch.org/fact-tank/2016/11/09/why-2016-election-polls-missed-their-mark/

PART 2

Also provided is a video on how to do linear regression in Excel. It is IMPERATIVE that you have Microsoft Excel for this assignment and the remainder of this course, as we will be using it exclusively to solve the problems.

Linear Regression in Excel

6. (13.5) Zagat publishes restaurant ratings for various locations in the United States. The file RESTAURANT contains the Zagat ratings for food, decor, service, and the cost per person for a sample of 100 restaurants located in New York City and in a suburb of New York City. Develop a regression model to predict the cost per person, based on a variable that represents the sum of the ratings for food, decor, and service.

a. Construct a scatter plot in Excel.

b. Assuming a linear relationship, find b0 and b1.

c. Interpret the meaning of the Y intercept b0 and the slope b1 in this problem.

d. Predict the mean cost per person for a restaurant with a summated rating of 50.

e. What should you tell the owner of a group of restaurants in this geographical area about the relationship between summated rating and the cost of a meal?

Sheet2

House Price    Square feet
245            1400
312            1600
279            1700
308            1875
199            1100
219            1550
405            2350
324            2450
319            1425
255            1700

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.7621137132
R Square             0.5808173119
Adjusted R Square    0.5284194759
Standard Error       41.3303236503
Observations         10

ANOVA
              df    SS                 MS                 F                Significance F
Regression    1     18934.934775692    18934.934775692    11.0847576166    0.0103940164
Residual      8     13665.565224308    1708.1956530385
Total         9     32600.5

               Coefficients    Standard Error    t Stat          P-value         Lower 95%         Upper 95%         Lower 95.0%       Upper 95.0%
Intercept      98.2483296214   58.0334785847     1.6929595126    0.1289188121    -35.5771118647    232.0737711075    -35.5771118647    232.0737711075
Square feet    0.1097677378    0.0329694433      3.3293779624    0.0103940164    0.0337400654      0.1857954103      0.0337400654      0.1857954103

RESIDUAL OUTPUT
Observation    Predicted House Price    Residuals
1              251.9231625835           -6.9231625835
2              273.8767101495           38.1232898505
3              284.8534839325           -5.8534839325
4              304.0628380528           3.9371619472
5              218.9928412345           -19.9928412345
6              268.388323258            -49.388323258
7              356.2025135221           48.7974864779
8              367.1792873051           -43.1792873051
9              254.6673560293           64.3326439707
10             284.8534839325           -29.8534839325

Square feet Line Fit Plot
[Chart: House Price and Predicted House Price plotted against Square feet]

Basic Business Statistics, 10e © 2006 Prentice-Hall, Inc.
Chapter 13
Regression Analysis:
PART 1: Simple Linear Regression
Basic Business Statistics
10th Edition

Learning Objectives
In this chapter, you learn:
How to use regression analysis to predict the value of a dependent variable (Y) based on an independent variable (X), where changes in X are assumed to cause changes in Y
How to evaluate the assumptions of regression analysis and know what to do if the assumptions are violated
How to make inferences about the slope in a linear regression (the linear relationship between X and Y)
How to estimate mean values and predict individual values

Analysis when Two Variables are Related
A scatter diagram can be used to show the relationship between two variables
Scatter diagrams were first presented in Ch. 2
Correlation analysis is used to measure strength of the association (linear relationship) between two variables (Ch. 3)
Correlation is only concerned with strength of the relationship
No causal effect is implied with correlation
Regression analysis is used to model an assumed causal relationship
Changes in X are assumed to cause changes in Y

Correlation Coefficient (r)
-1 ≤ r ≤ 1
The closer to -1, the stronger the negative linear relationship
The closer to 1, the stronger the positive linear relationship
The closer to 0, the weaker the linear relationship

Scatter Plots of Data with Various Correlation Coefficients
[Figure: six scatter plots of Y against X, with r = -1, r = -.6, r = 0, r = +.3, r = +1, and r = 0]

Introduction to Regression Analysis
Regression analysis is used to:
Explain the impact of changes in an independent variable on changes in the dependent variable
Predict the value of a dependent variable based on the value of one or more independent variables
Dependent variable (Y): the variable we wish to predict or explain
Independent variable (X): the variable used to explain the dependent variable

Simple Linear Regression Example
A real estate agent wishes to examine the relationship between the selling price of a home and its size (measured in square feet)
Dependent variable (Y) = house price
Independent variable (X) = square feet

Sample Data for House Price Model
House Price (Y)    Square Feet (X)
245                1400
312                1600
279                1700
308                1875
199                1100
219                1550
405                2350
324                2450
319                1425
255                1700

Types of Relationships
[Figure: four scatter plots of Y against X, two showing linear relationships and two showing non-linear relationships]
Simple Linear Regression Model
Only one independent variable, X
Relationship between X and Y is described by a linear function: Y = intercept + slope(X), that is, Y = a + bX
Changes in Y are assumed to be caused by changes in X

Linear Relationships
[Figure: four scatter plots of Y against X, two showing strong relationships and two showing weak relationships]

Linear Relationships (continued)
[Figure: two scatter plots of Y against X showing no relationship]

Simple Linear Regression Model
\[ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \]
Population parameters:
\( Y_i \): dependent variable
\( X_i \): independent variable
\( \beta_0 \): population Y intercept
\( \beta_1 \): population slope coefficient
\( \varepsilon_i \): random error term
\( \beta_0 + \beta_1 X_i \) is the linear component and \( \varepsilon_i \) is the random error component

Sample Statistics: Regression Equation
The simple linear regression equation provides an estimate of the population regression line
\[ \hat{Y}_i = b_0 + b_1 X_i \]
\( \hat{Y}_i \): estimated (or predicted) Y value for observation i
\( b_0 \): estimate of the regression intercept
\( b_1 \): estimate of the regression slope
\( X_i \): value of X for observation i

Formulas: Slope and Intercept
Slope: \( b_1 = \dfrac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} \)
Intercept: \( b_0 = \bar{Y} - b_1 \bar{X} \)

Interpretation of the Slope and the Intercept
INTERCEPT: b0 is the estimated average value of Y when the value of X is zero
SLOPE: b1 is the estimated change in the average value of Y as a result of a one-unit change in X
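The slope and intercept formulas can be checked by hand against the house price sample from the slides. The following is a minimal Python sketch, not part of the course material (which uses Excel). The function name `fit_simple_regression` is a hypothetical helper introduced only for illustration.

```python
# House price sample from the slides: price in $1000s (Y), size in square feet (X).
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
house_price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]


def fit_simple_regression(x, y):
    """Return (b0, b1) using the least-squares slope and intercept formulas.

    Hypothetical helper for illustration; Excel's Regression tool does the same job.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator of b1: sum of cross-products of deviations from the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator of b1: sum of squared deviations of X (SSX).
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / ssx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_simple_regression(square_feet, house_price)
print(f"b0 = {b0:.5f}, b1 = {b1:.5f}")
```

The printed values should agree with the Intercept and Square Feet coefficients in the Excel output (98.24833 and 0.10977).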
Simple Linear Regression Example
A real estate agent wishes to examine the relationship between the selling price of a home and its size (measured in square feet)
A random sample of 10 houses is selected
Dependent variable (Y) = house price in $1000s
Independent variable (X) = square feet

Sample Data for House Price Model
The same 10 observations listed earlier, with House Price in $1000s (Y) and Square Feet (X)

Graphical Presentation
House price model: scatter plot
[Figure: scatter plot of House Price ($1000s) against Square Feet]

Regression Using Excel
Data / Data Analysis / Regression

Excel Output
The regression equation is: house price = 98.24833 + 0.10977 (square feet)

Regression Statistics
Multiple R           0.76211
R Square             0.58082
Adjusted R Square    0.52842
Standard Error       41.33032
Observations         10

ANOVA
              df    SS            MS            F          Significance F
Regression    1     18934.9348    18934.9348    11.0848    0.01039
Residual      8     13665.5652    1708.1957
Total         9     32600.5000

               Coefficients    Standard Error    t Stat     P-value    Lower 95%    Upper 95%
Intercept      98.24833        58.03348          1.69296    0.12892    -35.57720    232.07386
Square Feet    0.10977         0.03297           3.32938    0.01039    0.03374      0.18580

Graphical Presentation
House price model: scatter plot and regression line
Slope = 0.10977
Intercept = 98.248
[Figure: scatter plot of House Price ($1000s) against Square Feet with the fitted line house price = 98.24833 + 0.10977 (square feet)]
Interpretation of the Intercept, b0
b0 is the estimated average value of Y when the value of X is zero (if X = 0 is in the range of observed X values)
Here, no houses had 0 square feet, so b0 = 98.24833 just indicates that, for houses within the range of sizes observed, $98,248.33 is the portion of the house price not explained by square feet
Interpretation of the Slope Coefficient, b1
b1 measures the estimated change in the average value of Y as a result of a one-unit change in X
Here, b1 = .10977 tells us that the average value of a house increases by .10977($1000) = $109.77 for each additional square foot of size

Predictions using Regression Analysis
Predict the price for a house with 2000 square feet:
house price = 98.25 + 0.1098 (sq.ft.) = 98.25 + 0.1098(2000) = 317.85
The predicted price for a house with 2000 square feet is 317.85($1,000s) = $317,850

Coefficient of Determination (r2)
The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable
The coefficient of determination is also called R-squared and is denoted as r2 (also, R2)
\[ r^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}} \]
note: \( 0 \le r^2 \le 1 \)

Examples of Approximate r2 Values
r2 = 1
Perfect linear relationship between X and Y: 100% of the variation in Y is explained by variation in X
[Figure: two scatter plots of Y against X, each with r2 = 1]

Examples of Approximate r2 Values
0 < r2 < 1
Weaker linear relationships between X and Y: Some but not all of the variation in Y is explained by variation in X
[Figure: two scatter plots of Y against X]

Examples of Approximate r2 Values
r2 = 0
No linear relationship between X and Y: The value of Y does not depend on X. (None of the variation in Y is explained by variation in X)
[Figure: scatter plot of Y against X with r2 = 0]
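The prediction and the coefficient of determination can be reproduced outside Excel. This is a rough Python sketch under the assumption that the slide formulas are applied directly to the house price sample. The helper names `fit`, `predict`, and `r_squared` are hypothetical. Note that the slide rounds the coefficients to 98.25 and 0.1098 and so reports 317.85, while the unrounded coefficients give a prediction of about 317.78.

```python
# House price sample from the slides: price in $1000s (Y), size in square feet (X).
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
house_price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]


def fit(x, y):
    """Least-squares intercept b0 and slope b1 (hypothetical helper)."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - b1 * x_bar, b1


def predict(b0, b1, x_new):
    """Predicted mean Y at x_new; only sensible inside the observed X range."""
    return b0 + b1 * x_new


def r_squared(x, y, b0, b1):
    """r^2 = SSR / SST, the share of variation in Y explained by X."""
    y_bar = sum(y) / len(y)
    ssr = sum((b0 + b1 * xi - y_bar) ** 2 for xi in x)  # regression sum of squares
    sst = sum((yi - y_bar) ** 2 for yi in y)            # total sum of squares
    return ssr / sst


b0, b1 = fit(square_feet, house_price)
price_2000 = predict(b0, b1, 2000)  # in $1000s
r2 = r_squared(square_feet, house_price, b0, b1)
print(f"predicted price at 2000 sq ft: {price_2000:.2f} ($1000s)")
print(f"r^2 = {r2:.5f}")
```

The 2000 square foot house lies inside the observed range of 1100 to 2450 square feet, so the prediction does not extrapolate beyond the relevant range.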
R-squared from Excel Output
R Square = 0.58082
58.08% of the variation in house prices is explained by variation in square feet

Assumptions of Regression
Use the acronym LINE:
Linearity: The underlying relationship between X and Y is linear
Independence of Errors: Error values are statistically independent
Normality of Error: Error values (ε) are normally distributed for any given value of X
Equal Variance (Homoscedasticity): The probability distribution of the errors has constant variance

Hypothesis testing about the Slope: t Test
t test for a population slope: Is there a linear relationship between X and Y?
Null and alternative hypotheses
H0: β1 = 0 (no linear relationship)
H1: β1 ≠ 0 (linear relationship does exist)
Test statistic
\[ t = \frac{b_1 - \beta_1}{S_{b_1}}, \qquad \text{d.f.} = n - 2 \]
where:
b1 = regression slope coefficient
β1 = hypothesized slope
\( S_{b_1} \) = standard error of the slope

Standard Error of the Slope
The standard error of the regression slope coefficient (b1) is estimated by
\[ S_{b_1} = \frac{S_{YX}}{\sqrt{SSX}} = \frac{S_{YX}}{\sqrt{\sum (X_i - \bar{X})^2}} \]
where:
\( S_{b_1} \) = Estimate of the standard error of the least squares slope
\( S_{YX} = \sqrt{\dfrac{SSE}{n - 2}} \) = Standard error of the estimate
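The two standard errors defined above can be computed directly from the sample. The sketch below is an illustration in Python rather than the Excel workflow the course uses, and the variable names are assumptions chosen to mirror the slide symbols.

```python
import math

# House price sample from the slides: price in $1000s (Y), size in square feet (X).
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
house_price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(house_price) / n

# Least-squares fit (same formulas as the slope and intercept slide).
ssx = sum((x - x_bar) ** 2 for x in square_feet)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, house_price)) / ssx
b0 = y_bar - b1 * x_bar

# SSE: sum of squared residuals (observed minus predicted).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(square_feet, house_price))

# Standard error of the estimate, with n - 2 degrees of freedom.
s_yx = math.sqrt(sse / (n - 2))

# Standard error of the slope.
s_b1 = s_yx / math.sqrt(ssx)

print(f"SSE = {sse:.4f}, S_YX = {s_yx:.5f}, S_b1 = {s_b1:.5f}")
```

These should line up with the Residual SS (13665.5652), the Standard Error under Regression Statistics (41.33032), and the Standard Error of the Square Feet coefficient (0.03297) in the Excel output.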
Standard Error from Excel Output
In the Excel output, the Standard Error under Regression Statistics is \( S_{YX} = 41.33032 \), and the Standard Error of the Square Feet coefficient is \( S_{b_1} = 0.03297 \)

Hypothesis testing about the Slope: t Test
Simple Linear Regression Equation: house price = 98.25 + 0.1098 (sq.ft.)
The slope of this model is 0.1098
Does square footage of the house affect its sales price at the 95% confidence level (CL)?

Example (continued)
H0: β1 = 0
H1: β1 ≠ 0
From Excel output:
               Coefficients    Standard Error    t Stat     P-value
Intercept      98.24833        58.03348          1.69296    0.12892
Square Feet    0.10977         0.03297           3.32938    0.01039
\[ t = \frac{b_1 - \beta_1}{S_{b_1}} = \frac{0.10977 - 0}{0.03297} = 3.32938 \]

Critical Value for t statistic
Use the "t" table on pages 814-815
Depends on:
degrees of freedom: df = n - 2
Significance level: α
95% CL has α = 0.05
90% CL has α = 0.10

t Table (p. 814-815)
        Upper Tail Area
df      .25       …    .025
1       1.000     …    12.7062
…       …         …    …
8       0.7064    …    2.3060
The body of the table contains t values, not probabilities
Let: n = 10, df = n - 2 = 8, 95% CL, α = 0.05, α/2 = 0.025
Critical values: ±2.3060

Result (continued)
H0: β1 = 0
H1: β1 ≠ 0
d.f. = 10 - 2 = 8, α/2 = .025 in each tail
Reject H0 if t < -2.3060 or t > 2.3060; otherwise do not reject H0
Test Statistic: t = 3.329 (from Excel output)
Decision: Reject H0
Conclusion: There is sufficient evidence that square footage affects house price

Confidence Interval Estimate for the Slope
Confidence Interval Estimate of the Slope:
\[ b_1 \pm t_{n-2} S_{b_1}, \qquad \text{d.f.} = n - 2 \]
Excel Printout for House Prices:
               Coefficients    Standard Error    t Stat     P-value    Lower 95%    Upper 95%
Intercept      98.24833        58.03348          1.69296    0.12892    -35.57720    232.07386
Square Feet    0.10977         0.03297           3.32938    0.01039    0.03374      0.18580
At 95% level of confidence, the confidence interval for the slope is (0.0337, 0.1858)

Confidence Interval of Slope
Since the confidence interval above does not contain 0, we can reject the null hypothesis
CONCLUSION: We are 95% confident that there is a positive linear relationship between the size of a house and its price.
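The t test and the confidence interval for the slope can be sketched in a few lines of Python. This is an illustration only: the inputs are the unrounded slope and standard error from the Excel output, the critical value 2.3060 is the one read from the t table for 8 degrees of freedom, and the function name `slope_t_test` is hypothetical.

```python
def slope_t_test(b1, s_b1, t_crit, beta1_null=0.0):
    """Two-tailed t test and confidence interval for a regression slope.

    b1: estimated slope, s_b1: its standard error,
    t_crit: critical t value for n - 2 degrees of freedom,
    beta1_null: hypothesized slope under H0 (0 means no linear relationship).
    Hypothetical helper for illustration.
    """
    t_stat = (b1 - beta1_null) / s_b1
    reject_h0 = abs(t_stat) > t_crit          # two-tailed decision rule
    margin = t_crit * s_b1                    # half-width of the interval
    interval = (b1 - margin, b1 + margin)
    return t_stat, reject_h0, interval


# Values taken from the Excel output for the house price model.
t_stat, reject_h0, (lower, upper) = slope_t_test(
    b1=0.1097677378, s_b1=0.0329694433, t_crit=2.3060
)
print(f"t = {t_stat:.5f}, reject H0: {reject_h0}")
print(f"95% CI for slope: ({lower:.4f}, {upper:.4f})")
```

Because the interval excludes 0, the interval and the t test lead to the same decision, which is the point the slides make.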
Pitfalls of Regression Analysis
Lacking an awareness of the assumptions underlying regression methodology
Not knowing how to evaluate the assumptions
Not knowing the alternatives to regression if a particular assumption is violated
Using a regression model without knowledge of the subject matter
Extrapolating outside the relevant range

Strategies for Avoiding the Pitfalls of Regression
Start with a scatter diagram of X vs. Y to observe possible relationship
Perform residual analysis to check the assumptions
Plot the residuals vs. X to check for violations of assumptions such as homoscedasticity
Use a histogram, stem-and-leaf display, box-and-whisker plot, or normal probability plot of the residuals to uncover possible non-normality

Strategies for Avoiding the Pitfalls of Regression (continued)
If there is violation of any assumption, use alternative methods or models
If there is no evidence of assumption violation, then test for the significance of the regression coefficients and construct confidence intervals
Avoid making predictions or forecasts outside the relevant range

Equations shown on the slides:
\[ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \]
\[ \hat{Y}_i = b_0 + b_1 X_i \]
\[ b_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}, \qquad b_0 = \bar{Y} - b_1 \bar{X} \]
\[ \text{house price} = 98.24833 + 0.10977\,(\text{square feet}) \]
\[ \text{house price} = 98.25 + 0.1098\,(\text{sq.ft.}) = 98.25 + 0.1098(2000) = 317.85 \]
\[ r^2 = \frac{SSR}{SST} = \frac{18934.9348}{32600.5000} = 0.58082, \qquad 0 \le r^2 \le 1 \]
\[ t = \frac{b_1 - \beta_1}{S_{b_1}} = \frac{0.10977 - 0}{0.03297} = 3.32938, \qquad \text{d.f.} = n - 2 \]
\[ S_{b_1} = \frac{S_{YX}}{\sqrt{SSX}} = \frac{S_{YX}}{\sqrt{\sum (X_i - \bar{X})^2}} = 0.03297, \qquad S_{YX} = \sqrt{\frac{SSE}{n - 2}} \]
\[ b_1 \pm t_{n-2} S_{b_1} \]

Regression Analysis using Excel 2007
MTH 305 Statistics

Data Needed in Regression Analysis
At least two variables that have information about several observations
Only one variable will be defined as the Y variable. There can be one or more X variables in regression analysis.
[Table layout: columns Observation ID, Variable 1, Variable 2, with rows 1, 2, 3]

Data Example
For example, we are interested in analyzing the linear relationship between the amount of sugar and the calories in a box of cereal. We are testing whether the amount of sugar affects the amount of calories.
In Excel the dataset will look like…see next slide

Ways to Check Linear Relationship
Scatter Plot between Y and X
Correlation Value
Regression Analysis

SCATTER PLOT
Scatter-plot of Two Variables
Select the data of the two variables you wish to analyze.
Under the Insert tab, choose Chart and select "scatter plot"
Example from the data above: it looks like there is no linear relationship!

CORRELATION COEFFICIENT
Correlation Value in Excel
In any Excel cell, type: =CORREL(range of Y data, range of X data)
For example, for the dataset above (cereal data), where the Y data are in cells B2 through B19 and the X data are in cells C2 through C19, we will type: =CORREL(B2:B19, C2:C19)
The result of 0.2296 shows that there is a weak relationship between those variables.
REGRESSION ANALYSIS
Regression Analysis
Under Data, choose Data Analysis, then Regression

Excel Output: Intercept and Slope
The regression equation is: house price = 98.24833 + 0.10977 (square feet)
               Coefficients    Standard Error    t Stat     P-value    Lower 95%    Upper 95%
Intercept      98.24833        58.03348          1.69296    0.12892    -35.57720    232.07386
Square Feet    0.10977         0.03297           3.32938    0.01039    0.03374      0.18580

Excel Output: R-squared
R Square = 0.58082, from SSR / SST = 18934.9348 / 32600.5000
58.08% of the variation in house prices is explained by variation in square feet
Excel Output: Standard Error
The Standard Error in the Square Feet row of the coefficients table is the standard error of the slope: \( S_{b_1} = 0.03297 \)

To Plot Regression Line
In the Regression window, check the "Line Fit Plots" box

Regression Line
House price model: scatter plot and regression line
Slope = 0.10977
Intercept = 98.248
[Figure: scatter plot of House Price ($1000s) against Square Feet with the fitted line house price = 98.24833 + 0.10977 (square feet)]

MULTIVARIATE REGRESSION
Multivariate Regression
Multivariate = two or more X variables that influence Y
Scatter-Plot: get them separately for each pair of X and Y.
Correlation Coefficient: compute them separately for each pair of X and Y.
Regression Analysis: If we want to analyze how two or more X variables have an impact on Y, then we will do the same as above for the case of one X but select the data in all the X variables at the same time.

Cereal data (Data Example)
Product                                           Calories    Sugar (grams)
Kellogg's                                         200         18
Sam's Choice Extra raisin (Wal-Mart)              210         23
Kountry Fresh (Winn-Dixie)                        170         17
Post Premium                                      190         20
American Fare (kmart)                             170         17
America's Choice (A&P)                            200         18
Safeway                                           200         18
Kroger                                            200         18
General Mills Total                               180         19
Post The Original Shredded Wheat 'N Bran          200         1
Post The Original Shredded Wheat, Spoon Size      170         0
Kellogg's Raisin Squares Mini-Wheats              180         12
Healthy Choice Toasted Brown Sugar Squares        190         9
Kountry Fresh Frosted Bite Size (Winn-Dixie)      200         11
Post Frosted Bite Size                            190         12
Kroger Frosted Bite Size                          190         11
Kellogg's Frosted Mini-Wheats                     200         12
Safeway Frosted Bite Size                         190         11
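The correlation value quoted for the cereal data (0.2296 from =CORREL) can be reproduced from the 18 rows of calories and sugar. The following is a small Python sketch of the Pearson correlation formula, given as an illustration alongside the Excel approach. The function name `pearson_r` is hypothetical.

```python
import math

# Cereal data from the slides, in row order: calories (Y) and sugar in grams (X).
calories = [200, 210, 170, 190, 170, 200, 200, 200, 180,
            200, 170, 180, 190, 200, 190, 190, 200, 190]
sugar = [18, 23, 17, 20, 17, 18, 18, 18, 19,
         1, 0, 12, 9, 11, 12, 11, 12, 11]


def pearson_r(x, y):
    """Pearson correlation coefficient, the quantity Excel's CORREL returns."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(sugar, calories)
print(f"r = {r:.4f}")
```

The coefficient is symmetric, so the order of the two ranges does not change the result. A value this close to 0 matches the slide's reading of a weak linear relationship.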

DATA

0

City

24

0

City

14

0

City

23 24

0

City 20

19

0

City 19

0

City

23 20

0

City 19

19

0

City 21

19

0

City 16

17 48 0 43

City 20

19

0

City 23 15 17 55 0

City 22 23 21

0

City 21 16 20

0 56

City 19 16 18

0 32

City

22 22

0 56

City 22

17

0 23

City 21 12 16

0

City 22 19 20

0

City 17 15 19 51 0 44
City 23 18 21 62 0 40
City 21 17 20

0

City 23 23 21 67 0 57
City 19 17 17 53 0 43
City 22 16 19 57 0 49
City 21 20 20 61 0

City 19 16 16 51 0

City 24 20 24 68 0 79
City 19 18 17

0

City 19 11 12 42 0 21
City 23 16 18 57 0 40
City 19 20 23 62 0 49
City 19 18 18 55 0 45
City 23 20 21 64 0 54
City 25 21 22 68 0 64
City 20 20 17 57 0 48
City 18 14 17 49 0

City 24 19 20

0

City 22 24 21 67 0 53
City 18 15 17 50 0 27
City 22 17 21 60 0 44
City 23 20 22 65 0 58
City 21 19 21 61 0 68
City 22 26 20 68 0 59
City 18 18 18 54 0 61
City 23 17 20 60 0 59
City 22 14 18 54 0 48
City 24 24 25

0

City 19 21 18 58 0 65
City 20 15 19 54 0 42

22 17 21 60 1 53

Suburban 22 18 21 61 1 45
Suburban 20 13 17 50 1 39
Suburban 21 16 20 57 1 43
Suburban 24 19 20 63 1 44
Suburban 19 16 16 51 1 29
Suburban 22 22 21 65 1

Suburban 23 16 20 59 1 34
Suburban 18 19 19 56 1 33
Suburban 18 17 17 52 1 37
Suburban 22 17 22 61 1 54
Suburban 22 17 20 59 1

Suburban 19 22 17 58 1 49
Suburban 21 12 19 52 1 44
Suburban 15 20 16 51 1 34
Suburban 22 20 22 64 1 55
Suburban 20 18 18 56 1 48
Suburban 18 16 18 52 1

Suburban 22 16 21 59 1 29
Suburban 22 21 23 66 1 40
Suburban 19 19 19 57 1 38
Suburban 24 18 20 62 1 38
Suburban 25 21 24

1 55

Suburban 24 21 20 65 1 43
Suburban 20 13 17 50 1 33
Suburban 18 19 18 55 1 44
Suburban 22 15 19 56 1 41
Suburban 18 15 20 53 1 45
Suburban 23 25 21 69 1 41
Suburban 20 22 22 64 1 42
Suburban 20 19 17 56 1 37
Suburban 24 19 22 65 1 56
Suburban 24 27 24

1 60

Suburban 21 18 21 60 1 46
Suburban 17 14 18 49 1

Suburban 23 15 22 60 1 35
Suburban 24 21 21 66 1 68
Suburban 25 17 22 64 1 40
Suburban 21 19 20 60 1 51
Suburban 23 12 24 59 1 32
Suburban 21 15 19 55 1 28
Suburban 19 19 18 56 1 44
Suburban 26 13 18 57 1 26
Suburban 19 18 20 57 1 42
Suburban 21 11 16 48 1 37
Suburban 27 20 23 70 1 63
Suburban 24 20 20 64 1 37
Suburban 19 11 16 46 1 22
Suburban 23 21 20 64 1 53
Suburban 24 18 22 64 1 62
Location Food Décor Service Summated Rating Coded Location Cost
City 2

1 19 2

0 60 62
24 20 68 67
22 14 50 23
27 74 79
13 52 32
11 18 48 38
21 64 46
17 55 43
16 56 39
15
26 65 44
29
66 59
57
53
25 69
12 51
49 40
61 45
58 33
28
35
54 42
41
63 34
73 78
Suburban
37
30
36
70
75
31
