statistics report

  Statistics report approx 7-8 pages as per the template 


i need it asap this is urgent

MAT 243 Project Two Summary Report

[Full Name]

[SNHU Email]


Southern New Hampshire University

Note: Replace the bracketed text on page one (the cover page) with your personal information.

Introduction: Problem Statement

Discuss the statement of the problem in terms of the statistical analyses that are being performed. In your response, you should address the following questions:

· What is the problem you are going to solve?

· What data set are you using?

· What statistical methods will you be using to do the analysis for this project?

Answer the questions in a paragraph response. Remove all questions and this note before submitting! Do not include Python code in your report.

Introduction: Your Team and the Assigned Team

In the Python script, you picked the same team and years that you picked for Project One. The assigned team and its range of years will be the same as in Project One as well.

See Steps 1 and 2 in the Python script to address the following items in the table below:

· What team did you pick and what years were picked to do the analysis?

· What team and range of years were you assigned for the comparative study? (Hint: this is called the assigned team in the Python script.) Present this information in a formatted table as shown below.

Table 1. Information on the Teams

| | Name of Team | Years Picked |
| --- | --- | --- |
| 1. Yours | Team (e.g. Knicks) | XXXX-YYYY (e.g. 2013 – 2015) |
| 2. Assigned | Team (e.g. Bulls) | XXXX-YYYY (e.g. 1996 – 1998) |

Answer the questions in a paragraph response. Remove all questions and this note (but not the table) before submitting! Do not include Python code in your report.

Hypothesis Test for the Population Mean (I)

Suppose a relative skill level of 1420 represents a critically low skill level in the league. The management of your team has hypothesized that the average relative skill level of your team is greater than 1420. You tested this claim using a 5% level of significance. For this test, you assumed that the population standard deviation for relative skill level is unknown. Explain the steps you took to test this problem and interpret your results.

See Step 3 in the Python script to address the following items:

· In general, how is hypothesis testing used to test claims about a population mean?

· Summarize all important steps of the hypothesis test. This includes:

a. Null Hypothesis (statistical notation and its description in words)

b. Alternative Hypothesis (statistical notation and its description in words)

c. Level of Significance

d. Report the Test Statistic and the P-value in a formatted table as shown below:

Table 2: Hypothesis Test for the Population Mean (I)

| Statistic | Value |
| --- | --- |
| Test Statistic | X.XX *Round off to 2 decimal places. |
| P-value | X.XXXX *Round off to 4 decimal places. |

e. Conclusion of the hypothesis test and its interpretation based on the P-value

· What are the implications of your findings from this hypothesis test? What is its practical significance?

Answer the questions in a paragraph response. Remove all questions and this note (but not the table) before submitting! Do not include Python code in your report.

Hypothesis Test for the Population Mean (II)

Your team’s coach has hypothesized that the average number of points scored by your team in the team’s years is less than 110 points. For this test, you assumed that the population standard deviation for points scored is unknown. You tested the claim using a 1% level of significance. Explain the steps you took to test this problem and interpret your results.

See Step 4 in the Python script to address the following items:

· Summarize all important steps of the hypothesis test. This includes:
a. Null Hypothesis (statistical notation and its description in words)
b. Alternative Hypothesis (statistical notation and its description in words)
c. Level of Significance
d. Report the Test Statistic and the P-value in a formatted table as shown below:

Table 3: Hypothesis Test for the Population Mean (II)

| Statistic | Value |
| --- | --- |
| Test Statistic | X.XX *Round off to 2 decimal places. |
| P-value | X.XXXX *Round off to 4 decimal places. |

e. Conclusion of the hypothesis test and its interpretation based on the P-value
· What are the implications of your findings from this hypothesis test? What is its practical significance?

Answer the questions in a paragraph response. Remove all questions and this note (but not the table) before submitting! Do not include Python code in your report.

Hypothesis Test for the Population Proportion

Suppose the management claims that the proportion of games that your team wins when scoring 80 or more points is 0.50. You tested this claim using a 5% level of significance. Explain the steps you took to test this problem and interpret your results.

See Step 5 in the Python script to address the following items:

· In general, how is hypothesis testing used to test claims about a population proportion?

· Summarize all important steps of the hypothesis test. This includes:
a. Null Hypothesis (statistical notation and its description in words)
b. Alternative Hypothesis (statistical notation and its description in words)
c. Level of Significance
d. Report the Test Statistic and the P-value in a formatted table as shown below:

Table 4: Hypothesis Test for the Population Proportion

| Statistic | Value |
| --- | --- |
| Test Statistic | X.XX *Round off to 2 decimal places. |
| P-value | X.XXXX *Round off to 4 decimal places. |

e. Conclusion of the hypothesis test and its interpretation based on the P-value
· What are the implications of your findings from this hypothesis test? What is its practical significance?

Answer the questions in a paragraph response. Remove all questions and this note (but not the table) before submitting! Do not include Python code in your report.

Hypothesis Test for the Difference Between Two Population Means

You were asked to compare your team’s skill level (from its years) with the assigned team’s skill level (from the assigned time frame). You tested the claim that the skill level of your team is the same as the skill level of the assigned team, using a 1% level of significance.

See Step 6 in the Python script to address the following items:

· In general, how is hypothesis testing used to test claims about the difference between two population means?

· Summarize all important steps of the hypothesis test. This includes:
a. Null Hypothesis (statistical notation and its description in words)
b. Alternative Hypothesis (statistical notation and its description in words)
c. Level of Significance
d. Report the Test Statistic and the P-value in a formatted table as shown below:

Table 5: Hypothesis Test for the Difference Between Two Population Means

| Statistic | Value |
| --- | --- |
| Test Statistic | X.XX *Round off to 2 decimal places. |
| P-value | X.XXXX *Round off to 4 decimal places. |

e. Conclusion of the hypothesis test and its interpretation based on the P-value
· What are the implications of your findings from this hypothesis test? What is its practical significance?

Answer the questions in a paragraph response. Remove all questions and this note (but not the table) before submitting! Do not include Python code in your report.

Conclusion

Describe the results of your statistical analyses clearly, using proper descriptions of statistical terms and concepts.

· What is the practical importance of the analyses that were performed?

· Describe what these results mean for the scenario.

Answer the questions in a paragraph response. Remove all questions and this note before submitting! Do not include Python code in your report.

Citations

You were not required to use external resources for this report. If you did not use any resources, you should remove this entire section. However, if you did use any resources to help you with your interpretation, you must cite them. Use proper APA format for citations.

Insert references here in the following format:

Author’s Last Name, First Initial. Middle Initial. (Year of Publication). Title of book: Subtitle of book, edition. Place of Publication: Publisher.

Project Two Guidelines and Rubric

Scenario

You are a data analyst for a basketball team. You have found a large set of historical data, and are working to analyze and find patterns in the data set. The coach of the team and your management have requested that you perform several hypothesis tests to find the statistical significance of the claims that are being made about your team. This analysis will provide evidence to validate critical claims and get statistically valid findings that will help make key decisions to make the team better in upcoming seasons. You will use the Python programming language to perform statistical analysis and will also need to present a report of your findings to the team’s management. Since the managers are not data analysts, you will need to interpret your findings and describe their practical implications. The managers will use your report to find areas where the team can improve its performance.

Note: This data set has been “cleaned” for the purposes of this assignment.

Reference

FiveThirtyEight. (April 26, 2019). FiveThirtyEight NBA Elo dataset. Kaggle. Retrieved from https://www.kaggle.com/fivethirtyeight/fivethirtyeight-nba-elo-dataset/

Directions

For this project, you will submit the Python script you used to make your calculations and a summary report explaining your findings.

1. Python Script: To complete the tasks listed below, open the Project Two Jupyter Notebook link in the Assignment Information module. Your project contains the NBA data set and a Jupyter Notebook with your Python scripts. In the notebook, you will find step-by-step instructions and code blocks that will help you complete the following tasks:

· Hypothesis tests for a population parameter

· Hypothesis tests for a population mean

· Hypothesis test for a population proportion

· Hypothesis test for the difference between two population parameters

· Hypothesis test for difference between two population means

2. Summary Report: Once you have completed all the steps in your Python script, you will create a summary report to present your findings. Use the provided template to create your report. You must complete each of the following sections:

· Introduction: Set the context for your scenario and the analyses you will be performing.

· Hypothesis tests for the population mean: Discuss all steps of the hypothesis tests and interpret your results.

· Hypothesis test for the population proportion: Discuss all steps of the hypothesis test and interpret your results.

· Hypothesis test for the difference between two population means: Discuss all steps of the hypothesis test and interpret your results.

· Conclusion: Summarize your findings and explain their practical implications.

What to Submit

To complete this project, you must submit the following:

Python Script

C:\Users\kingh\Downloads\Project Two Jupyter Script.html

Summary Report Zip File

Use the provided template to create your summary report. The template contains guiding questions to help you complete each section. Be sure to remove these questions before submitting your report. Your summary report should be submitted as a 3- to 5-page Microsoft Word document. It should include an APA-style cover page and APA citations for any sources used. Use double spacing, 12-point Times New Roman font, and one-inch margins.

Project Two Rubric

| Criteria | Exemplary (100%) | Proficient (85%) | Needs Improvement (55%) | Not Evident (0%) | Value |
| --- | --- | --- | --- | --- | --- |
| Python Script: Hypothesis Testing | N/A | Accurately performs hypothesis tests for one population parameter and for the difference in two population parameters by executing the appropriate functions in the programming environment (100%) | Shows progress toward proficiency, but with errors or omissions (55%) | Does not attempt criterion (0%) | 10 |
| Summary Report: Hypothesis Tests for the Population Mean | Exceeds proficiency by writing with exceptional clarity, insight, and mastery of statistical terminology (100%) | Reports results of hypothesis tests for the population mean by discussing all steps and interpreting results in terms of statistical significance (85%) | Shows progress toward proficiency, but with errors or omissions (55%) | Does not attempt criterion (0%) | 25 |
| Summary Report: Hypothesis Test for the Population Proportion | Exceeds proficiency by writing with exceptional clarity, insight, and mastery of statistical terminology (100%) | Reports results of hypothesis test for the population proportion by discussing all steps and interpreting results in terms of statistical significance (85%) | Shows progress toward proficiency, but with errors or omissions (55%) | Does not attempt criterion (0%) | 25 |
| Summary Report: Hypothesis Test for the Difference between Two Population Means | Exceeds proficiency by writing with exceptional clarity, insight, and mastery of statistical terminology (100%) | Reports results of hypothesis test comparing two population means by discussing all steps and interpreting results in terms of statistical significance (85%) | Shows progress toward proficiency, but with errors or omissions (55%) | Does not attempt criterion (0%) | 25 |
| Summary Report: Introduction and Conclusion | Exceeds proficiency by writing with exceptional clarity, insight, and mastery of statistical terminology (100%) | Communicates all ideas by presenting context, as well as summarizing and interpreting the practical implications of the results (85%) | Shows progress toward proficiency, but with errors or omissions (55%) | Does not attempt criterion (0%) | 10 |
| Articulation of Response | Exceeds proficiency in an exceptionally clear, insightful, sophisticated, or creative manner (100%) | Clearly conveys meaning with correct grammar, sentence structure, and spelling, demonstrating an understanding of audience and purpose (85%) | Shows progress toward proficiency, but with errors in grammar, sentence structure, and spelling, negatively impacting readability (55%) | Submission has critical errors in grammar, sentence structure, and spelling, preventing understanding of ideas (0%) | 5 |
| Total: | | | | | 100% |

Project Two: Hypothesis Testing

This notebook contains the step-by-step directions for Project Two. It is very important to run through the steps in order. Some steps depend on the outputs of earlier steps. Once you have completed the steps in this notebook, be sure to write your summary report.

You are a data analyst for a basketball team and have access to a large set of historical data that you can use to analyze performance patterns. The coach of the team and your management have requested that you perform several hypothesis tests to statistically validate claims about your team’s performance. This analysis will provide evidence for these claims and help make key decisions to improve the performance of the team. You will use the Python programming language to perform the statistical analyses and then prepare a report of your findings for the team’s management. Since the managers are not data analysts, you will need to interpret your findings and describe their practical implications.

There are four important variables in the data set that you will study in Project Two.

| Variable | What does it represent? |
| --- | --- |
| pts | Points scored by the team in a game |
| elo_n | A measure of relative skill level of the team in the league |
| year_id | Year when the team played the games |
| fran_id | Name of the NBA team |

The ELO rating, represented by the variable elo_n, is used as a measure of the relative skill of a team. This measure is inferred based on the final score of a game, the game location, and the outcome of the game relative to the probability of that outcome. The higher the number, the higher the relative skill of a team.
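The notebook does not state how the rating is computed. As a rough illustration only, the sketch below uses the textbook Elo formulas for the win probability and the rating update. The step size `k` and the ratings are hypothetical, and FiveThirtyEight's actual adjustments for game location and final score are not reproduced here.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Textbook Elo win probability for team A against team B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating: float, expected: float, actual: float, k: float = 20.0) -> float:
    """Move a rating toward the observed result (1 = win, 0 = loss).

    k is an assumed step size, not a value taken from the data set.
    """
    return rating + k * (actual - expected)


# Hypothetical ratings on roughly the same scale as the elo_n column.
p_win = elo_expected_score(1600.0, 1500.0)
new_rating = elo_update(1600.0, p_win, actual=1.0)
print("Win probability for the stronger team:", round(p_win, 2))
print("Rating after a win:", round(new_rating, 1))
```

A win that was already likely moves the rating only slightly, which is what "the outcome of the game relative to the probability of that outcome" refers to.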

In addition to studying data on your own team, your management has also assigned you a second team so that you can compare its performance with your own team’s.

| Team | What does it represent |
| --- | --- |
| Your Team | This is the team that has hired you as an analyst. This is the team that you will pick below. See Step 2. |
| Assigned Team | This is the team that the management has assigned to you to compare against your team. See Step 1. |

Reminder: It may be beneficial to review the summary report template for Project Two prior to starting this Python script. That will give you an idea of the questions you will need to answer with the outputs of this script.

Step 1: Data Preparation & the Assigned Team

This step uploads the data set from a CSV file. It also selects the Assigned Team for this analysis. Do not make any changes to the code block below.

  1. The Assigned Team is the Chicago Bulls from the years 1996 – 1998

Click the block of code below and hit the Run button above.

In [1]:

import numpy as np
import pandas as pd
import scipy.stats as st
import matplotlib.pyplot as plt
from IPython.display import display, HTML
nba_orig_df = pd.read_csv('nbaallelo.csv')
nba_orig_df = nba_orig_df[(nba_orig_df['lg_id']=='NBA') & (nba_orig_df['is_playoffs']==0)]
columns_to_keep = ['game_id','year_id','fran_id','pts','opp_pts','elo_n','opp_elo_n', 'game_location', 'game_result']
nba_orig_df = nba_orig_df[columns_to_keep]
# The dataframe for the assigned team is called assigned_team_df.
# The assigned team is the Bulls from 1996-1998.
assigned_years_league_df = nba_orig_df[(nba_orig_df['year_id'].between(1996, 1998))]
assigned_team_df = assigned_years_league_df[(assigned_years_league_df['fran_id']=='Bulls')]
assigned_team_df = assigned_team_df.reset_index(drop=True)
display(HTML(assigned_team_df.head().to_html()))
print("printed only the first five observations...")
print("Number of rows in the dataset =", len(assigned_team_df))

game_id year_id fran_id pts opp_pts elo_n opp_elo_n game_location game_result
[Table output: first five rows of assigned_team_df (Bulls, year_id 1996)]

printed only the first five observations…
Number of rows in the dataset = 246
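The filtering in Step 1 can be checked on a small scale. The sketch below applies the same three filters to a made-up data frame; `toy_df` and its rows are hypothetical stand-ins for the contents of the CSV file.

```python
import pandas as pd

# Toy stand-in for nbaallelo.csv; every row here is invented for illustration.
toy_df = pd.DataFrame({
    "year_id": [1995, 1996, 1997, 1998, 1998, 1999],
    "fran_id": ["Bulls", "Bulls", "Knicks", "Bulls", "Bulls", "Bulls"],
    "lg_id": ["NBA"] * 6,
    "is_playoffs": [0, 0, 0, 0, 1, 0],
    "elo_n": [1590.0, 1600.0, 1500.0, 1700.0, 1710.0, 1400.0],
})

# Keep regular-season NBA games only, as the Step 1 cell does.
regular = toy_df[(toy_df["lg_id"] == "NBA") & (toy_df["is_playoffs"] == 0)]

# between() is inclusive on both ends, so 1996 and 1998 are both kept.
in_years = regular[regular["year_id"].between(1996, 1998)]

# Select one franchise and renumber the rows from zero.
bulls = in_years[in_years["fran_id"] == "Bulls"].reset_index(drop=True)
print("Number of rows in the toy subset =", len(bulls))
```

Resetting the index matters only for display and positional access; the hypothesis tests in later steps depend on the column values, not the row labels.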

Step 2: Pick Your Team

In this step, you will pick your team. The range of years that you will study for your team is 2013-2015. Make the following edits to the code block below:

  1. Replace ??TEAM?? with your choice of team from one of the following team names.
    *Bucks, Bulls, Cavaliers, Celtics, Clippers, Grizzlies, Hawks, Heat, Jazz, Kings, Knicks, Lakers, Magic, Mavericks, Nets, Nuggets, Pacers, Pelicans, Pistons, Raptors, Rockets, Sixers, Spurs, Suns, Thunder, Timberwolves, Trailblazers, Warriors, Wizards*
    Remember to enter the team name within single quotes. For example, if you picked the Suns, then ??TEAM?? should be replaced with 'Suns'.

After you are done with your edits, click the block of code below and hit the Run button above.

In [2]:

# Range of years: 2013-2015 (Note: The line below selects all teams within the three-year period 2013-2015. This is not your team's dataframe.)
your_years_leagues_df = nba_orig_df[(nba_orig_df['year_id'].between(2013, 2015))]
# The dataframe for your team is called your_team_df.
# ---- TODO: make your edits here ----
your_team_df = your_years_leagues_df[(your_years_leagues_df['fran_id']=='Hawks')]
your_team_df = your_team_df.reset_index(drop=True)
display(HTML(your_team_df.head().to_html()))
print("printed only the first five observations...")
print("Number of rows in the dataset =", len(your_team_df))

game_id year_id fran_id pts opp_pts elo_n opp_elo_n game_location game_result
[Table output: first five rows of your_team_df (Hawks, year_id 2013)]

printed only the first five observations…
Number of rows in the dataset = 246

Step 3: Hypothesis Test for the Population Mean (I)

A relative skill level of 1420 represents a critically low skill level in the league. The management of your team has hypothesized that the average relative skill level of your team in the years 2013-2015 is greater than 1420. Test this claim using a 5% level of significance. For this test, assume that the population standard deviation for relative skill level is unknown. Make the following edits to the code block below:

  1. Replace ??DATAFRAME_YOUR_TEAM?? with the name of your team's dataframe. See Step 2 for the name of your team's dataframe.
  2. Replace ??RELATIVE_SKILL?? with the name of the variable for relative skill. See the table included in the Project Two instructions above to pick the variable name. Enclose this variable in single quotes. For example, if the variable name is var2 then replace ??RELATIVE_SKILL?? with 'var2'.
  3. Replace ??NULL_HYPOTHESIS_VALUE?? with the mean value of the relative skill under the null hypothesis.

After you are done with your edits, click the block of code below and hit the Run button above.

In [3]:

import scipy.stats as st
# Mean relative skill level of your team
mean_elo_your_team = your_team_df['elo_n'].mean()
print("Mean Relative Skill of your team in the years 2013 to 2015 =", round(mean_elo_your_team,2))

# Hypothesis Test
# ---- TODO: make your edits here ----
test_statistic, p_value = st.ttest_1samp(your_team_df['elo_n'], 1420)

print("Hypothesis Test for the Population Mean")
print("Test Statistic =", round(test_statistic,2))
print("P-value =", round(p_value,4))

Mean Relative Skill of your team in the years 2013 to 2015 = 1539.22
Hypothesis Test for the Population Mean
Test Statistic = 26.67
P-value = 0.0
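The claim in this step is directional (greater than 1420), while `st.ttest_1samp` called with two arguments reports a two-sided P-value. The sketch below, which is not part of the notebook, computes the t statistic by hand and lets the tail be chosen explicitly. The helper name `one_sample_t` and the synthetic `fake_elo` sample are assumptions made for illustration.

```python
import numpy as np
import scipy.stats as st


def one_sample_t(sample, mu0, alternative="two-sided"):
    """One-sample t-test with an explicit tail; returns (t statistic, P-value)."""
    x = np.asarray(sample, dtype=float)
    n = x.size
    # t = (sample mean - hypothesized mean) / (sample std / sqrt(n))
    t_stat = (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(n))
    df = n - 1
    if alternative == "greater":      # Ha: mean > mu0
        p = st.t.sf(t_stat, df)
    elif alternative == "less":       # Ha: mean < mu0
        p = st.t.cdf(t_stat, df)
    else:                             # Ha: mean != mu0
        p = 2 * st.t.sf(abs(t_stat), df)
    return t_stat, p


# Synthetic ratings, NOT the real elo_n column.
rng = np.random.default_rng(0)
fake_elo = rng.normal(loc=1540, scale=70, size=246)

t_two, p_two = one_sample_t(fake_elo, 1420)
t_gt, p_gt = one_sample_t(fake_elo, 1420, alternative="greater")
print("t =", round(t_two, 2))
print("two-sided P-value =", p_two, " right-tailed P-value =", p_gt)
```

When the test statistic is positive, the right-tailed P-value is half of the two-sided one, so a P-value that rounds to 0.0 leads to the same decision under either alternative.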

Step 4: Hypothesis Test for the Population Mean (II)

A team averaging 110 points is likely to do very well during the regular season. The coach of your team has hypothesized that your team scored at an average of less than 110 points in the years 2013-2015. Test this claim at a 1% level of significance. For this test, assume that the population standard deviation for points scored is unknown.

You are to write this code block yourself.

Use Step 3 to help you write this code block. Here is some information that will help you write this code block. Reach out to your instructor if you need help.

  1. The dataframe for your team is called your_team_df.
  2. The variable ‘pts’ represents the points scored by your team.
  3. Calculate and print the mean points scored by your team during the years you picked.
  4. Identify the mean score under the null hypothesis. You only have to identify this value and do not have to print it. (Hint: this is given in the problem statement)
  5. Assuming that the population standard deviation is unknown, use Python methods to carry out the hypothesis test.
  6. Calculate and print the test statistic rounded to two decimal places.
  7. Calculate and print the P-value rounded to four decimal places.

Write your code in the code block section below. After you are done, click this block of code and hit the Run button above. Reach out to your instructor if you need more help with this step.

In [5]:

from scipy.stats import ttest_1samp
import numpy as np
# Mean points scored by your team in 2013-2015
mean_pts = your_team_df['pts'].mean()
print("Mean Points =", mean_pts)
# One-sample t-test against the null-hypothesis mean of 110 points
tstat, pval = ttest_1samp(your_team_df['pts'], 110)
print('T Stat = %.2f, P Value = %.4f' % (tstat, pval))
# Compare the P-value with the 1% level of significance
if pval < 0.01:
    print("Reject the null hypothesis")
else:
    print("Fail to reject the null hypothesis")

Mean Points = 100.5
T Stat = -12.85, P Value = 0.0000
Reject the null hypothesis
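The same decision can be reached with the critical-value approach instead of the P-value. The sketch below is an illustration rather than part of the assignment: it takes the test statistic of -12.85 printed by the Step 4 cell and the 246 rows reported in Step 2, and compares the statistic with the left-tail cutoff of the t distribution.

```python
import scipy.stats as st

alpha = 0.01          # level of significance from the problem statement
n = 246               # rows in your_team_df, as reported in Step 2
t_observed = -12.85   # test statistic printed by the Step 4 cell

# Left-tailed test (Ha: mean < 110): reject H0 when the statistic falls
# below the alpha-quantile of the t distribution with n - 1 degrees of freedom.
t_critical = st.t.ppf(alpha, df=n - 1)
reject_h0 = t_observed < t_critical

print("Critical value =", round(t_critical, 2))
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Because the observed statistic lies far below the cutoff, the conclusion does not depend on whether the reported P-value is read as one-sided or two-sided.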

Step 5: Hypothesis Test for the Population Proportion

Suppose the management claims that the proportion of games that your team wins when scoring 80 or more points is 0.50. Test this claim using a 5% level of significance. Make the following edits to the code block below:

  1. Replace ??COUNT_VAR?? with the variable name that represents the number of games won when your team scores over 80 points. (Hint: this variable is in the code block below).
  2. Replace ??NOBS_VAR?? with the variable name that represents the total number of games when your team scores over 80 points. (Hint: this variable is in the code block below).
  3. Replace ??NULL_HYPOTHESIS_VALUE?? with the proportion under the null hypothesis.

After you are done with your edits, click the block of code below and hit the Run button above.

In [6]:

from statsmodels.stats.proportion import proportions_ztest
your_team_gt_80_df = your_team_df[(your_team_df['pts'] > 80)]
# Number of games won when your team scores over 80 points
counts = (your_team_gt_80_df['game_result'] == 'W').sum()
# Total number of games when your team scores over 80 points
nobs = len(your_team_gt_80_df['game_result'])
p = counts*1.0/nobs
print("Proportion of games won by your team when scoring more than 80 points in the years 2013 to 2015 =", round(p,4))

# Hypothesis Test
# ---- TODO: make your edits here ----
# The third argument is the proportion under the null hypothesis (0.50), not the points threshold.
test_statistic, p_value = proportions_ztest(counts,nobs,0.50)
print("Hypothesis Test for the Population Proportion")
print("Test Statistic =", round(test_statistic,2))
print("P-value =", round(p_value,4))

Proportion of games won by your team when scoring more than 80 points in the years 2013 to 2015 = 0.6017
Hypothesis Test for the Population Proportion
Test Statistic = 3.19
P-value = 0.0014
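To make the calculation behind the proportion test visible, the sketch below computes a two-sided z-test by hand with the standard library only. It assumes, based on the statsmodels documentation, that `proportions_ztest` uses the sample proportion in the standard error unless told otherwise; the `use_null_variance` flag is a hypothetical switch to the textbook form that uses the null proportion instead. The counts (60 wins in 100 games) are invented.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ztest(count, nobs, p0, use_null_variance=False):
    """Two-sided one-sample z-test for a proportion; returns (z, P-value)."""
    p_hat = count / nobs
    # Proportion used inside the standard error: sample value or null value.
    p_var = p0 if use_null_variance else p_hat
    se = sqrt(p_var * (1 - p_var) / nobs)
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 60 wins in 100 games, tested against p0 = 0.50.
z_sample, p_sample = proportion_ztest(60, 100, 0.50)
z_null, p_null = proportion_ztest(60, 100, 0.50, use_null_variance=True)
print("z with sample-proportion variance =", round(z_sample, 2))
print("z with null-proportion variance   =", round(z_null, 2))
```

The two versions usually agree on the decision; they differ noticeably only when the sample proportion is far from the null value or the sample is small. Note that the value under test must be a proportion between 0 and 1.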

Step 6: Hypothesis Test for the Difference Between Two Population Means

The management of your team wants to compare the team with the assigned team (the Bulls in 1996-1998). They claim that the skill level of your team in 2013-2015 is the same as the skill level of the Bulls in 1996 to 1998. In other words, the mean relative skill level of your team in 2013 to 2015 is the same as the mean relative skill level of the Bulls in 1996-1998. Test this claim using a 1% level of significance. Assume that the population standard deviation is unknown. Make the following edits to the code block below:

  1. Replace ??DATAFRAME_ASSIGNED_TEAM?? with the name of assigned team's dataframe. See Step 1 for the name of assigned team's dataframe.
  2. Replace ??DATAFRAME_YOUR_TEAM?? with the name of your team's dataframe. See Step 2 for the name of your team's dataframe.
  3. Replace ??RELATIVE_SKILL?? with the name of the variable for relative skill. See the table included in Project Two instructions above to pick the variable name. Enclose this variable in single quotes. For example, if the variable name is var2 then replace ??RELATIVE_SKILL?? with 'var2'.

After you are done with your edits, click the block of code below and hit the Run button above.

In [7]:

import scipy.stats as st
mean_elo_n_project_team = assigned_team_df['elo_n'].mean()
print("Mean Relative Skill of the assigned team in the years 1996 to 1998 =", round(mean_elo_n_project_team,2))
mean_elo_n_your_team = your_team_df['elo_n'].mean()
print("Mean Relative Skill of your team in the years 2013 to 2015 =", round(mean_elo_n_your_team,2))

# Hypothesis Test
# ---- TODO: make your edits here ----
test_statistic, p_value = st.ttest_ind(assigned_team_df['elo_n'],your_team_df['elo_n'])
print("Hypothesis Test for the Difference Between Two Population Means")
print("Test Statistic =", round(test_statistic,2))
print("P-value =", round(p_value,4))

Mean Relative Skill of the assigned team in the years 1996 to 1998 = 1739.8
Mean Relative Skill of your team in the years 2013 to 2015 = 1539.22
Hypothesis Test for the Difference Between Two Population Means
Test Statistic = 36.16
P-value = 0.0
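By default `st.ttest_ind` pools the two sample variances (`equal_var=True`); passing `equal_var=False` gives Welch's test, which does not assume equal variances. The sketch below compares the two on synthetic samples. The arrays `team_a` and `team_b` are made up and only mimic the sample sizes and rough means seen above.

```python
import numpy as np
import scipy.stats as st

# Synthetic ratings for two teams, NOT the real elo_n columns.
rng = np.random.default_rng(1)
team_a = rng.normal(loc=1740, scale=50, size=246)
team_b = rng.normal(loc=1540, scale=70, size=246)

# Pooled-variance t-test (the default used in the Step 6 cell).
t_pooled, p_pooled = st.ttest_ind(team_a, team_b)

# Welch's t-test: no equal-variance assumption.
t_welch, p_welch = st.ttest_ind(team_a, team_b, equal_var=False)

# Swapping the argument order only flips the sign of the statistic.
t_swapped, _ = st.ttest_ind(team_b, team_a)

print("pooled t =", round(t_pooled, 2), " Welch t =", round(t_welch, 2))
print("swapped-order t =", round(t_swapped, 2))
```

With equal sample sizes the two statistics coincide and only the degrees of freedom differ, so the choice matters most when the groups have different sizes and spreads. The sign of the statistic reflects which sample is passed first, which is why the Step 6 output is positive when the assigned team's ratings come first.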

End of Project Two

Download the HTML output and submit it with your summary report for Project Two. The HTML output can be downloaded by clicking File, then Download as, then HTML. Do not include the Python code within your summary report.

# coding: utf-8
# # Project Two: Hypothesis Testing
#
# This notebook contains the step-by-step directions for Project Two. It is very important to run through the steps in order. Some steps depend on the outputs of earlier steps. Once you have completed the steps in this notebook, be sure to write your summary report.
#
# You are a data analyst for a basketball team and have access to a large set of historical data that you can use to analyze performance patterns. The coach of the team and your management have requested that you perform several hypothesis tests to statistically validate claims about your team’s performance. This analysis will provide evidence for these claims and help make key decisions to improve the performance of the team. You will use the Python programming language to perform the statistical analyses and then prepare a report of your findings for the team’s management. Since the managers are not data analysts, you will need to interpret your findings and describe their practical implications.
#
#
# There are four important variables in the data set that you will study in Project Two.
#
#
# | Variable | What does it represent? |
# | --- | --- |
# | pts | Points scored by the team in a game |
# | elo_n | A measure of relative skill level of the team in the league |
# | year_id | Year when the team played the games |
# | fran_id | Name of the NBA team |
#
#
# The ELO rating, represented by the variable **elo_n**, is used as a measure of the relative skill of a team. This measure is inferred based on the final score of a game, the game location, and the outcome of the game relative to the probability of that outcome. The higher the number, the higher the relative skill of a team.
#
#
# In addition to studying data on your own team, your management has also assigned you a second team so that you can compare its performance with your own team’s.
#
#
# | Team | What does it represent |
# | --- | --- |
# | Your Team | This is the team that has hired you as an analyst. This is the team that you will pick below. See Step 2. |
# | Assigned Team | This is the team that the management has assigned to you to compare against your team. See Step 1. |
#
#
# Reminder: It may be beneficial to review the summary report template for Project Two prior to starting this Python script. That will give you an idea of the questions you will need to answer with the outputs of this script.
#
#
# **——————————————————————————————————————————————————————————————————–**
# ## Step 1: Data Preparation & the Assigned Team
# This step uploads the data set from a CSV file. It also selects the Assigned Team for this analysis. Do not make any changes to the code block below.
#
# 1. The **Assigned Team** is Chicago Bulls from the years 1996 – 1998
#
# Click the block of code below and hit the **Run** button above.
# In[1]:

import numpy as np
import pandas as pd
import scipy.stats as st
import matplotlib.pyplot as plt
from IPython.display import display, HTML
nba_orig_df = pd.read_csv('nbaallelo.csv')
nba_orig_df = nba_orig_df[(nba_orig_df['lg_id']=='NBA') & (nba_orig_df['is_playoffs']==0)]
columns_to_keep = ['game_id','year_id','fran_id','pts','opp_pts','elo_n','opp_elo_n', 'game_location', 'game_result']
nba_orig_df = nba_orig_df[columns_to_keep]
# The dataframe for the assigned team is called assigned_team_df.
# The assigned team is the Bulls from 1996-1998.
assigned_years_league_df = nba_orig_df[(nba_orig_df['year_id'].between(1996, 1998))]
assigned_team_df = assigned_years_league_df[(assigned_years_league_df['fran_id']=='Bulls')]
assigned_team_df = assigned_team_df.reset_index(drop=True)
display(HTML(assigned_team_df.head().to_html()))
print("printed only the first five observations...")
print("Number of rows in the dataset =", len(assigned_team_df))

# ## Step 2: Pick Your Team
# In this step, you will pick your team. The range of years that you will study for your team is 2013-2015. Make the following edits to the code block below:
#
# 1. Replace ??TEAM?? with your choice of team from one of the following team names.
# *Bucks, Bulls, Cavaliers, Celtics, Clippers, Grizzlies, Hawks, Heat, Jazz, Kings, Knicks, Lakers, Magic, Mavericks, Nets, Nuggets, Pacers, Pelicans, Pistons, Raptors, Rockets, Sixers, Spurs, Suns, Thunder, Timberwolves, Trailblazers, Warriors, Wizards*
# Remember to enter the team name within single quotes. For example, if you picked the Suns, then ??TEAM?? should be replaced with 'Suns'.
#
# After you are done with your edits, click the block of code below and hit the **Run** button above.
# In[2]:

# Range of years: 2013-2015 (Note: The line below selects all teams within the three-year period 2013-2015. This is not your team’s dataframe.)
your_years_leagues_df = nba_orig_df[(nba_orig_df['year_id'].between(2013, 2015))]
# The dataframe for your team is called your_team_df.
# ---- TODO: make your edits here ----
your_team_df = your_years_leagues_df[(your_years_leagues_df['fran_id']=='Hawks')]
your_team_df = your_team_df.reset_index(drop=True)
display(HTML(your_team_df.head().to_html()))
print("printed only the first five observations...")
print("Number of rows in the dataset =", len(your_team_df))

# ## Step 3: Hypothesis Test for the Population Mean (I)
# A relative skill level of 1420 represents a critically low skill level in the league. The management of your team has hypothesized that the average relative skill level of your team in the years 2013-2015 is greater than 1420. Test this claim using a 5% level of significance. For this test, assume that the population standard deviation for relative skill level is unknown. Make the following edits to the code block below:
#
# 1. Replace ??DATAFRAME_YOUR_TEAM?? with the name of your team’s dataframe. See Step 2 for the name of your team’s dataframe.
#
#
# 2. Replace ??RELATIVE_SKILL?? with the name of the variable for relative skill. See the table included in the Project Two instructions above to pick the variable name. Enclose this variable in single quotes. For example, if the variable name is **var2** then replace ??RELATIVE_SKILL?? with 'var2'.
#
#
# 3. Replace ??NULL_HYPOTHESIS_VALUE?? with the mean value of the relative skill under the null hypothesis.
#
# After you are done with your edits, click the block of code below and hit the **Run** button above.
# In[3]:

import scipy.stats as st
# Mean relative skill level of your team
mean_elo_your_team = your_team_df['elo_n'].mean()
print("Mean Relative Skill of your team in the years 2013 to 2015 =", round(mean_elo_your_team,2))

# Hypothesis Test
# ---- TODO: make your edits here ----
# Note: ttest_1samp reports a two-sided P-value by default, while the claim
# being tested (mean greater than 1420) is right-tailed.
test_statistic, p_value = st.ttest_1samp(your_team_df['elo_n'], 1420)

print("Hypothesis Test for the Population Mean")
print("Test Statistic =", round(test_statistic,2))
print("P-value =", round(p_value,4))
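
# The claim in Step 3 is one-sided (mean greater than 1420), whereas ttest_1samp returns a two-sided P-value unless told otherwise. The sketch below shows one way to obtain a right-tailed P-value from the t distribution. The sample values are made up for illustration and the names sample, null_mean, and p_right are assumptions, not part of the project script.

```python
import numpy as np
import scipy.stats as st

# Hypothetical sample of relative skill values (made up for illustration)
sample = np.array([1450.0, 1435.0, 1410.0, 1470.0, 1442.0, 1428.0, 1465.0, 1455.0])
null_mean = 1420
alpha = 0.05

# ttest_1samp returns a two-sided P-value by default
t_stat, p_two_sided = st.ttest_1samp(sample, null_mean)

# Right-tailed P-value: P(T >= t_stat) with n - 1 degrees of freedom
dof = len(sample) - 1
p_right = st.t.sf(t_stat, dof)

print("Test Statistic =", round(float(t_stat), 2))
print("Two-sided P-value =", round(float(p_two_sided), 4))
print("Right-tailed P-value =", round(float(p_right), 4))
print("Reject H0" if p_right < alpha else "Fail to reject H0")
```

# When the test statistic is positive, the right-tailed P-value is half of the two-sided one. For a left-tailed claim, such as the one in Step 4, st.t.cdf(t_stat, dof) plays the same role.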

#
# ## Step 4: Hypothesis Test for the Population Mean (II)
#
# A team averaging 110 points is likely to do very well during the regular season. The coach of your team has hypothesized that your team scored at an average of less than 110 points in the years 2013-2015. Test this claim at a 1% level of significance. For this test, assume that the population standard deviation for points scored is unknown.
#
#
# You are to write this code block yourself.
#
# Use Step 3 to help you write this code block. Here is some information that will help you write this code block. Reach out to your instructor if you need help.
# 1. The dataframe for your team is called your_team_df.
# 2. The variable ‘pts’ represents the points scored by your team.
# 3. Calculate and print the mean points scored by your team during the years you picked.
# 4. Identify the mean score under the null hypothesis. You only have to identify this value and do not have to print it. (Hint: this is given in the problem statement)
# 5. Assuming that the population standard deviation is unknown, use Python methods to carry out the hypothesis test.
# 6. Calculate and print the test statistic rounded to two decimal places.
# 7. Calculate and print the P-value rounded to four decimal places.
#
# Write your code in the code block section below. After you are done, click this block of code and hit the **Run** button above. Reach out to your instructor if you need more help with this step.
# In[5]:

from scipy.stats import ttest_1samp
# Mean points scored by your team
mean_pts = your_team_df['pts'].mean()
print("Mean Points =", mean_pts)
# One-sample t-test against the null hypothesis value of 110 points
tstat, pval = ttest_1samp(your_team_df['pts'], 110)
print('T Stat = %.2f, P Value = %.4f' % (tstat, pval))
# ttest_1samp reports a two-sided P-value; the claim (mean < 110) is left-tailed,
# so convert it before comparing with the 1% level of significance.
pval_left = pval / 2 if tstat < 0 else 1 - pval / 2
print('Left-tailed P Value = %.4f' % pval_left)
if pval_left < 0.01:
    print("Reject the null hypothesis")
else:
    print("Fail to reject the null hypothesis")

# ## Step 5: Hypothesis Test for the Population Proportion
# Suppose the management claims that the proportion of games that your team wins when scoring 80 or more points is 0.50. Test this claim using a 5% level of significance. Make the following edits to the code block below:
#
# 1. Replace ??COUNT_VAR?? with the variable name that represents the number of games won when your team scores over 80 points. (Hint: this variable is in the code block below).
#
#
# 2. Replace ??NOBS_VAR?? with the variable name that represents the total number of games when your team scores over 80 points. (Hint: this variable is in the code block below).
#
#
# 3. Replace ??NULL_HYPOTHESIS_VALUE?? with the proportion under the null hypothesis.
#
# After you are done with your edits, click the block of code below and hit the **Run** button above.
# In[6]:

from statsmodels.stats.proportion import proportions_ztest
your_team_gt_80_df = your_team_df[(your_team_df['pts'] > 80)]
# Number of games won when your team scores over 80 points
counts = (your_team_gt_80_df['game_result'] == 'W').sum()
# Total number of games when your team scores over 80 points
nobs = len(your_team_gt_80_df['game_result'])
p = counts*1.0/nobs
print("Proportion of games won by your team when scoring more than 80 points in the years 2013 to 2015 =", round(p,4))

# Hypothesis Test
# ---- TODO: make your edits here ----
# The third argument is the proportion under the null hypothesis (0.50),
# not the 80-point threshold used to filter the games.
test_statistic, p_value = proportions_ztest(counts,nobs,0.50)
print("Hypothesis Test for the Population Proportion")
print("Test Statistic =", round(test_statistic,2))
print("P-value =", round(p_value,4))
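
# To see what a z-test for a proportion computes, the statistic can be worked out by hand. The sketch below uses made-up counts (wins and games are hypothetical, not results from the data set). It shows two common choices for the standard error: the null proportion, as in many textbooks, and the sample proportion. Which one a given library function uses by default should be checked in its documentation.

```python
import math
import scipy.stats as st

# Hypothetical counts (made up for illustration)
wins = 130    # games won when scoring more than 80 points
games = 240   # total games when scoring more than 80 points
p0 = 0.50     # proportion under the null hypothesis

p_hat = wins / games

# z statistic with the standard error based on the null proportion p0
z_null_se = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / games)

# z statistic with the standard error based on the sample proportion p_hat
z_sample_se = (p_hat - p0) / math.sqrt(p_hat * (1 - p_hat) / games)

# Two-sided P-values from the standard normal distribution
p_null_se = 2 * st.norm.sf(abs(z_null_se))
p_sample_se = 2 * st.norm.sf(abs(z_sample_se))

print("Sample proportion =", round(p_hat, 4))
print("z (null SE) =", round(z_null_se, 2), " P-value =", round(p_null_se, 4))
print("z (sample SE) =", round(z_sample_se, 2), " P-value =", round(p_sample_se, 4))
```

# The two versions usually agree closely when the sample proportion is near the null value. Note that the null value is a proportion between 0 and 1.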

# ## Step 6: Hypothesis Test for the Difference Between Two Population Means
# The management of your team wants to compare the team with the assigned team (the Bulls in 1996-1998). They claim that the skill level of your team in 2013-2015 is the same as the skill level of the Bulls in 1996 to 1998. In other words, the mean relative skill level of your team in 2013 to 2015 is the same as the mean relative skill level of the Bulls in 1996-1998. Test this claim using a 1% level of significance. Assume that the population standard deviation is unknown. Make the following edits to the code block below:
#
# 1. Replace ??DATAFRAME_ASSIGNED_TEAM?? with the name of assigned team’s dataframe. See Step 1 for the name of assigned team’s dataframe.
#
#
# 2. Replace ??DATAFRAME_YOUR_TEAM?? with the name of your team’s dataframe. See Step 2 for the name of your team’s dataframe.
#
#
# 3. Replace ??RELATIVE_SKILL?? with the name of the variable for relative skill. See the table included in Project Two instructions above to pick the variable name. Enclose this variable in single quotes. For example, if the variable name is **var2** then replace ??RELATIVE_SKILL?? with 'var2'.
#
#
# After you are done with your edits, click the block of code below and hit the **Run** button above.
# In[7]:

import scipy.stats as st
mean_elo_n_project_team = assigned_team_df['elo_n'].mean()
print("Mean Relative Skill of the assigned team in the years 1996 to 1998 =", round(mean_elo_n_project_team,2))
mean_elo_n_your_team = your_team_df['elo_n'].mean()
print("Mean Relative Skill of your team in the years 2013 to 2015 =", round(mean_elo_n_your_team,2))

# Hypothesis Test
# ---- TODO: make your edits here ----
test_statistic, p_value = st.ttest_ind(assigned_team_df['elo_n'],your_team_df['elo_n'])
print("Hypothesis Test for the Difference Between Two Population Means")
print("Test Statistic =", round(test_statistic,2))
print("P-value =", round(p_value,4))
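
# By default, ttest_ind pools the two sample variances, which assumes that both populations have the same variance. If that assumption is in doubt, passing equal_var=False runs Welch's t-test instead. The sketch below compares the two on simulated data. The samples team_a and team_b are randomly generated stand-ins, not values from the data set.

```python
import numpy as np
import scipy.stats as st

# Simulated stand-in samples (hypothetical, for illustration only)
rng = np.random.default_rng(0)
team_a = rng.normal(loc=1700, scale=40, size=200)
team_b = rng.normal(loc=1500, scale=60, size=150)
alpha = 0.01

# Pooled-variance t-test (the default, equal_var=True)
t_pooled, p_pooled = st.ttest_ind(team_a, team_b)

# Welch's t-test, which does not assume equal population variances
t_welch, p_welch = st.ttest_ind(team_a, team_b, equal_var=False)

print("Pooled: Test Statistic =", round(float(t_pooled), 2), " P-value =", round(float(p_pooled), 4))
print("Welch:  Test Statistic =", round(float(t_welch), 2), " P-value =", round(float(p_welch), 4))

# The claim of equal means is two-sided, so the P-value is compared with alpha directly
print("Reject H0" if p_welch < alpha else "Fail to reject H0")
```

# Because the null hypothesis here is that the two means are equal, the two-sided P-value is the appropriate one and no one-sided conversion is needed.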

#
# ## End of Project Two
# Download the HTML output and submit it with your summary report for Project Two. The HTML output can be downloaded by clicking **File**, then **Download as**, then **HTML**. Do not include the Python code within your summary report.
