---
title: "Regression, Mediation, Moderation"
author: "Enter Your Name"
date: "`r Sys.Date()`"
output: word_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

*Title*: The influence of cognitive and affective based job satisfaction measures on the relationship between satisfaction and organizational citizenship behavior

*Abstract*: One of the most widely believed maxims of management is that a happy worker is a productive worker. However, most research on the nature of the relationship between job satisfaction and job performance has not yielded convincing evidence that such a relationship exists to the degree most managers believe. One reason for this might lie in the way in which job performance is measured. Numerous studies have been published that showed that using Organizational Citizenship Behavior (OCB) to supplant more traditional measures of job performance has resulted in a more robust relationship between job satisfaction and job performance. Yet, recent work has suggested that the relationship between job satisfaction and citizenship may be more complex than originally reported. This study investigated whether the relationship between job satisfaction and citizenship could depend upon the nature of the job satisfaction measure used. Specifically, it was hypothesized that job satisfaction measures that reflect a cognitive basis would be more strongly related to OCB than measures that reflect an affective basis. Results from data collected in two midwestern companies show support for the relative importance of cognition-based satisfaction over affect-based satisfaction. Implications for research on the causes of citizenship are discussed.


# Dataset:

- Dependent variable (Y): OCB, the organizational citizenship behavior measure
- Independent variables (X):
    - Affective: job satisfaction measures that measure emotion
    - Cognitive: job satisfaction measures that measure cognitions (thinking)
    - Years: years on the job
    - Type_work: type of employee measured (secretary, assistant, manager, boss)

# Data Screening:
Assume the data is accurate with no missing values. You will want to screen the dataset using all the predictor variables to predict the outcome in a simultaneous multiple regression (all the variables at once). This analysis will let you screen for outliers and assumptions across all subsequent analyses/steps. Be sure to factor type_work.

```{r starting}

```

## Outliers

a. Leverage:
    i. What is your leverage cut off score?
    ii. How many leverage outliers did you have?

```{r leverage}

```
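
A minimal sketch of the leverage step, assuming the common rule-of-thumb cutoff of (2k + 2)/N, where k is the number of predictor terms and N is the sample size. The data frame `dat` is simulated purely for illustration; only the column names come from the dataset description above, and whether the dummy codes for `Type_work` count toward k is an assumption.

```r
# Simulated stand-in for the course dataset (all values are hypothetical).
set.seed(42)
n <- 120
dat <- data.frame(
  Years     = round(runif(n, 1, 20)),
  Type_work = factor(rep(c("secretary", "assistant", "manager", "boss"),
                         length.out = n)),
  Affective = rnorm(n, mean = 50, sd = 10),
  Cognitive = rnorm(n, mean = 50, sd = 10)
)
dat$OCB <- 20 + 0.3 * dat$Cognitive + 0.1 * dat$Affective + rnorm(n, sd = 5)

# Simultaneous regression with every predictor entered at once.
model <- lm(OCB ~ Years + Type_work + Affective + Cognitive, data = dat)

# k counts predictor terms; here the three dummy codes for Type_work
# are counted separately (an assumption).
k <- length(coef(model)) - 1
leverage <- hatvalues(model)
cutleverage <- (2 * k + 2) / nrow(dat)

# 1 = flagged as a leverage outlier, 0 = not flagged.
badleverage <- as.numeric(leverage > cutleverage)
table(badleverage)
```

Because the hat values sum to k + 1, this cutoff is equivalent to flagging cases with more than twice the average leverage.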

b. Cook’s:
    i. What is your Cook’s cut off score?
    ii. How many Cook’s outliers did you have?

```{r cooks}

```
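
A hedged sketch of the Cook's distance step, assuming the rule-of-thumb cutoff 4/(N - k - 1). As before, `dat` is simulated and every value in it is hypothetical.

```r
# Simulated stand-in for the course dataset (all values are hypothetical).
set.seed(42)
n <- 120
dat <- data.frame(
  Years     = round(runif(n, 1, 20)),
  Type_work = factor(rep(c("secretary", "assistant", "manager", "boss"),
                         length.out = n)),
  Affective = rnorm(n, mean = 50, sd = 10),
  Cognitive = rnorm(n, mean = 50, sd = 10)
)
dat$OCB <- 20 + 0.3 * dat$Cognitive + 0.1 * dat$Affective + rnorm(n, sd = 5)

model <- lm(OCB ~ Years + Type_work + Affective + Cognitive, data = dat)
k <- length(coef(model)) - 1  # number of predictor terms (assumed definition)

# Cook's distance measures each case's influence on the fitted coefficients.
cooks <- cooks.distance(model)
cutcooks <- 4 / (nrow(dat) - k - 1)

# 1 = flagged as an influential case, 0 = not flagged.
badcooks <- as.numeric(cooks > cutcooks)
table(badcooks)
```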

c. Mahalanobis:
    i. What is your Mahalanobis df?
    ii. What is your Mahalanobis cut off score?
    iii. How many outliers did you have for Mahalanobis?

```{r mahal}

```
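
One way to sketch the Mahalanobis step, assuming the distance is computed on the continuous columns only, with df equal to the number of those columns and a chi-square cutoff at p < .001. These choices are conventions rather than anything stated above, and `dat` is simulated.

```r
# Simulated stand-in for the course dataset (all values are hypothetical).
set.seed(42)
n <- 120
dat <- data.frame(
  Years     = round(runif(n, 1, 20)),
  Affective = rnorm(n, mean = 50, sd = 10),
  Cognitive = rnorm(n, mean = 50, sd = 10)
)
dat$OCB <- 20 + 0.3 * dat$Cognitive + 0.1 * dat$Affective + rnorm(n, sd = 5)

# Continuous variables only; the factor Type_work would be left out.
numeric_cols <- dat[, c("Years", "Affective", "Cognitive", "OCB")]

# Squared Mahalanobis distance of each row from the multivariate centroid.
mahal <- mahalanobis(numeric_cols, colMeans(numeric_cols), cov(numeric_cols))

# df = number of variables in the distance; cutoff at p < .001.
df_mahal <- ncol(numeric_cols)
cutmahal <- qchisq(1 - 0.001, df = df_mahal)

badmahal <- as.numeric(mahal > cutmahal)
table(badmahal)
```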

d. Overall:
    i. How many total outliers did you have across all variables?
    ii. Delete them!

```{r overall}

```

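
The three indicators can be combined as in the sketch below. It assumes a case is treated as an outlier when at least two of the three indicators flag it; that threshold is a convention, not something the text above specifies. The data and the name `noout` are hypothetical.

```r
# Simulated stand-in for the course dataset (all values are hypothetical).
set.seed(42)
n <- 120
dat <- data.frame(
  Years     = round(runif(n, 1, 20)),
  Type_work = factor(rep(c("secretary", "assistant", "manager", "boss"),
                         length.out = n)),
  Affective = rnorm(n, mean = 50, sd = 10),
  Cognitive = rnorm(n, mean = 50, sd = 10)
)
dat$OCB <- 20 + 0.3 * dat$Cognitive + 0.1 * dat$Affective + rnorm(n, sd = 5)

model <- lm(OCB ~ Years + Type_work + Affective + Cognitive, data = dat)
k <- length(coef(model)) - 1

# One 0/1 flag per indicator, using the rule-of-thumb cutoffs.
badleverage <- as.numeric(hatvalues(model) > (2 * k + 2) / n)
badcooks    <- as.numeric(cooks.distance(model) > 4 / (n - k - 1))
num         <- dat[, c("Years", "Affective", "Cognitive", "OCB")]
badmahal    <- as.numeric(
  mahalanobis(num, colMeans(num), cov(num)) > qchisq(0.999, ncol(num))
)

# Count how many indicators flag each case, then keep cases flagged fewer
# than two times (assumed threshold).
totalout <- badleverage + badcooks + badmahal
table(totalout)
noout <- dat[totalout < 2, ]
```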
# Assumptions:
## Additivity:
a. Include a correlation table of your independent variables.
b. Do your correlations meet the assumption for additivity (i.e. do you have multicollinearity)?

```{r additivity}

```

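
A small sketch of a correlation table for the continuous predictors, using simulated data. The |r| > .90 flag for multicollinearity is a rule of thumb assumed here, and the factor `Type_work` is left out because `cor()` needs numeric columns.

```r
# Simulated predictors (all values are hypothetical).
set.seed(42)
n <- 120
predictors <- data.frame(
  Years     = round(runif(n, 1, 20)),
  Affective = rnorm(n, mean = 50, sd = 10),
  Cognitive = rnorm(n, mean = 50, sd = 10)
)

# Pearson correlations among the continuous predictors.
cor_table <- cor(predictors)
round(cor_table, 2)

# Flag any pair above the assumed multicollinearity threshold.
too_high <- abs(cor_table) > 0.90 & upper.tri(cor_table)
any(too_high)
```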
## Linearity:
a. Include a picture that shows how you might assess multivariate linearity.
b. Do you think you’ve met the assumption for linearity?

```{r linearity}

```

## Normality:
a. Include a picture that shows how you might assess multivariate normality.
b. Do you think you’ve met the assumption for normality?

```{r normality}

```

## Homogeneity and Homoscedasticity:
a. Include a picture that shows how you might assess multivariate homogeneity.
b. Do you think you’ve met the assumption for homogeneity?
c. Do you think you’ve met the assumption for homoscedasticity?

```{r homogs}

```

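
The three assumption pictures can be sketched from the residuals of the screening regression. This assumes a common convention: a Q-Q plot of studentized residuals for linearity, a histogram of the same residuals for normality, and a residual-by-fitted scatterplot for homogeneity and homoscedasticity. The data are simulated and hypothetical.

```r
# Simulated stand-in for the course dataset (all values are hypothetical).
set.seed(42)
n <- 120
dat <- data.frame(
  Years     = round(runif(n, 1, 20)),
  Type_work = factor(rep(c("secretary", "assistant", "manager", "boss"),
                         length.out = n)),
  Affective = rnorm(n, mean = 50, sd = 10),
  Cognitive = rnorm(n, mean = 50, sd = 10)
)
dat$OCB <- 20 + 0.3 * dat$Cognitive + 0.1 * dat$Affective + rnorm(n, sd = 5)
model <- lm(OCB ~ Years + Type_work + Affective + Cognitive, data = dat)

# Studentized residuals and z-scored fitted values.
standardized <- rstudent(model)
fitted_z <- as.numeric(scale(model$fitted.values))

# Linearity: points should follow the reference line.
qqnorm(standardized)
abline(0, 1)

# Normality: the histogram should look roughly bell-shaped around zero.
hist(standardized)

# Homogeneity and homoscedasticity: an even, shapeless cloud around (0, 0).
plot(fitted_z, standardized)
abline(h = 0)
abline(v = 0)
```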
# Hierarchical Regression:
a. First, control for years on the job in the first step of the regression analysis.
b. Then use the factor coded type of job variable to determine if it has an effect on organizational citizenship behavior.
c. Last, test if cognitive and affect measures of job satisfaction are predictors of organizational citizenship behavior.
d. Include the summaries of each step, along with the ANOVA of the change between each step.

```{r hierarchical}

```

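
The steps above can be sketched as three nested `lm()` models compared with `anova()`. The data are simulated, so the coefficients mean nothing; the model names `step1` to `step3` are hypothetical.

```r
# Simulated stand-in for the course dataset (all values are hypothetical).
set.seed(42)
n <- 120
dat <- data.frame(
  Years     = round(runif(n, 1, 20)),
  Type_work = factor(rep(c("secretary", "assistant", "manager", "boss"),
                         length.out = n)),
  Affective = rnorm(n, mean = 50, sd = 10),
  Cognitive = rnorm(n, mean = 50, sd = 10)
)
dat$OCB <- 20 + 0.3 * dat$Cognitive + 0.1 * dat$Affective + rnorm(n, sd = 5)

# Step 1: control variable only.
step1 <- lm(OCB ~ Years, data = dat)
# Step 2: add the factor-coded job type (dummy coded by R).
step2 <- lm(OCB ~ Years + Type_work, data = dat)
# Step 3: add the two satisfaction measures.
step3 <- lm(OCB ~ Years + Type_work + Affective + Cognitive, data = dat)

summary(step1)
summary(step2)
summary(step3)

# F-tests for the change in fit between consecutive steps.
change <- anova(step1, step2, step3)
change
```

The `Df` column of the comparison shows how many terms each step added: three dummy codes for job type, then two satisfaction measures.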
# Mediation
a. Calculate a mediation model wherein the number of years mediates the relationship between affective measurements and OCB.
b. Include each path and summaries of those models.
c. Include the Sobel test.
d. Include the bootstrapped indirect effect.

```{r mediation}

```

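
A sketch of the mediation steps in base R, with Affective as X, Years as the mediator M, and OCB as Y. The Sobel z is computed by hand from the path estimates, and the bootstrap is a simple percentile bootstrap built on `sample()`; the course may expect a packaged helper instead. All data and object names are hypothetical.

```r
# Simulated data in which Years partly carries the Affective effect
# (all values are hypothetical).
set.seed(42)
n <- 150
dat <- data.frame(Affective = rnorm(n, mean = 50, sd = 10))
dat$Years <- 2 + 0.15 * dat$Affective + rnorm(n, sd = 3)
dat$OCB   <- 20 + 0.20 * dat$Affective + 0.80 * dat$Years + rnorm(n, sd = 5)

# c path: total effect of X on Y.
model_c <- lm(OCB ~ Affective, data = dat)
# a path: X predicting the mediator.
model_a <- lm(Years ~ Affective, data = dat)
# b path and c' path: mediator and X predicting Y together.
model_b <- lm(OCB ~ Affective + Years, data = dat)

a   <- unname(coef(model_a)["Affective"])
b   <- unname(coef(model_b)["Years"])
SEa <- summary(model_a)$coefficients["Affective", "Std. Error"]
SEb <- summary(model_b)$coefficients["Years", "Std. Error"]

# Sobel test for the indirect effect a * b.
indirect <- a * b
sobel_z  <- indirect / sqrt(b^2 * SEa^2 + a^2 * SEb^2)
sobel_p  <- 2 * pnorm(-abs(sobel_z))

# Percentile bootstrap of the indirect effect.
boot_indirect <- replicate(1000, {
  resampled <- dat[sample(nrow(dat), replace = TRUE), ]
  a_star <- coef(lm(Years ~ Affective, data = resampled))["Affective"]
  b_star <- coef(lm(OCB ~ Affective + Years, data = resampled))["Years"]
  unname(a_star * b_star)
})
boot_ci <- quantile(boot_indirect, c(0.025, 0.975))

c(indirect = indirect, sobel_z = sobel_z, sobel_p = sobel_p)
boot_ci
```

If the bootstrap interval excludes zero, that is usually read as evidence for an indirect effect.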
# Write up:

Hierarchical regression only!
a. Include a brief description of the experiment, variables, and order entered into steps.
b. Include a brief section on the data screening/assumptions.
c. Include the all F-values for each step of the model – you can reference the above table.
d. Include all the b or beta values for variables in the step they were entered. So, you will not have double b values for any predictor – you can reference the above table.
e. Include an interpretation of the results (dummy coding, do our results match the study results, etc.).

---
title: "t-Tests"
author: "Enter Your Name"
date: "`r Sys.Date()`"
output: word_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

*Title*: Estimation of physical activity levels using cell phone questionnaires: A comparison with accelerometry for evaluation of between-subject and within-subject variations

*Abstract*: Physical activity promotes health and longevity. From a business perspective, healthier employees are more likely to report to work, miss fewer days, and cost less for health insurance. Your business wants to encourage healthy lifestyles in a cheap and affordable way through health care incentive programs. The use of telecommunication technologies such as cell phones is highly interesting in this respect. In an earlier report, we showed that physical activity level (PAL) assessed using a cell phone procedure agreed well with corresponding estimates obtained using the doubly labeled water method. However, our earlier study indicated high within-subject variation in relation to between-subject variations in PAL using cell phones, but we could not assess if this was a true variation of PAL or an artifact of the cell phone technique. Objective: Our objective was to compare within- and between-subject variations in PAL by means of cell phones with corresponding estimates using an accelerometer. In addition, we compared the agreement of daily PAL values obtained using the cell phone questionnaire with corresponding data obtained using an accelerometer.


# Dataset:

- Gender: male and female subjects were examined in this experiment.
- PAL_cell: average physical activity values for the cell phone accelerometer (range 0-100).
- PAL_acc: average physical activity values for the handheld accelerometer (range 0-100).

APA write ups should include means, standard deviation/error, t-values, p-values, effect size, and a brief description of what happened in plain English.

```{r starting}

```

# Data screening:
## Accuracy:
a) Include output and indicate how the data are not accurate.
b) Include output to show how you fixed the accuracy errors, and describe what you did.

```{r accuracy}

```

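
A sketch of an accuracy check on a tiny made-up data frame. The specific errors shown (an inconsistent gender label and scores outside the documented 0-100 range) are invented for illustration and may not be the errors in the real dataset.

```r
# Tiny hypothetical data frame with deliberate accuracy problems.
dat <- data.frame(
  gender   = c("male", "Male", "female", "female", "male"),
  PAL_cell = c(72.4, 176.2, 51.6, -5.0, 65.2),
  PAL_acc  = c(85.1, 74.7, 58.7, 57.7, 66.8),
  stringsAsFactors = FALSE
)

# Look for impossible values and stray category labels.
summary(dat)
table(dat$gender)

# Fix 1: make the gender labels consistent and store them as a factor.
dat$gender <- factor(tolower(dat$gender), levels = c("male", "female"))

# Fix 2: scores outside the documented 0-100 range become missing.
for (col in c("PAL_cell", "PAL_acc")) {
  out_of_range <- dat[[col]] < 0 | dat[[col]] > 100
  dat[[col]][which(out_of_range)] <- NA
}

summary(dat)
```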
## Missing data:
a) Include output that shows you have missing data.
b) Include output and a description that shows what you did with the missing data.

```{r missing}

```

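
A sketch of how missing data might be counted by column and by row, on a made-up data frame. Dropping incomplete rows is shown only as one simple option; the assignment may expect a different treatment, such as imputation.

```r
# Tiny hypothetical data frame with missing scores.
dat <- data.frame(
  gender   = factor(c("male", "male", "female", "female", "female")),
  PAL_cell = c(72.4, NA, 51.6, 65.2, NA),
  PAL_acc  = c(85.1, 52.4, NA, 57.7, 35.8)
)

# Missing values per column.
missing_by_column <- colSums(is.na(dat))
missing_by_column

# Percent of variables missing for each participant.
percent_missing_by_row <- rowMeans(is.na(dat)) * 100
table(percent_missing_by_row)

# One simple option: keep only participants with complete data.
complete <- dat[complete.cases(dat), ]
nrow(complete)
```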
## Outliers:
a) Include a summary of your mahal scores that are greater than the cutoff.
b) What are the df for your Mahalanobis cutoff?
c) What is the cut off score for your Mahalanobis measure?
d) How many outliers did you have?
e) Delete all outliers.

```{r outliers}

```

# Assumptions:
## Additivity:
a) We won’t need to calculate a correlation table. Why not?
## Linearity:
a) Include a picture that shows how you might assess multivariate linearity.
b) Do you think you’ve met the assumption for linearity?

```{r linearity}

```

## Normality:
a) Include a picture that shows how you might assess multivariate normality.
b) Do you think you’ve met the assumption for normality?

```{r normality}

```

## Homogeneity/Homoscedasticity:
a) Include a picture that shows how you might assess multivariate homogeneity.
b) Do you think you’ve met the assumption for homogeneity?
c) Do you think you’ve met the assumption for homoscedasticity?

```{r homog-s}

```

# Independent t-test:
1) Run an independent t-test to determine if there are differences in gender for the cell phone measurement of physical activity level.
    a. Use the equal variances option to adjust for problems with homogeneity (if necessary).
    b. Include means and sds for your groups.
    c. Is there a significant difference in the ratings?

```{r ind1}

```

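
A minimal sketch of the independent t-test with group means and SDs, run on simulated scores (hypothetical values). Setting `var.equal = TRUE` gives the pooled-variance test; switching it to `FALSE` gives the Welch correction, which is the usual adjustment when homogeneity is in doubt.

```r
# Simulated scores for two groups (all values are hypothetical).
set.seed(7)
dat <- data.frame(
  gender   = factor(rep(c("male", "female"), each = 40)),
  PAL_cell = c(rnorm(40, mean = 75, sd = 7), rnorm(40, mean = 57, sd = 7))
)

# Descriptive statistics per group.
M  <- tapply(dat$PAL_cell, dat$gender, mean)
SD <- tapply(dat$PAL_cell, dat$gender, sd)
N  <- tapply(dat$PAL_cell, dat$gender, length)
rbind(M, SD, N)

# Independent-samples t-test; var.equal = FALSE would request Welch's test.
result <- t.test(PAL_cell ~ gender, data = dat, var.equal = TRUE)
result
```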
2) Effect size: What is the effect size for this difference? Be sure to list which effect size you are using.

```{r effect1}

```

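
One common choice of effect size here is Cohen's d with a pooled standard deviation, sketched below on simulated scores. The choice of d over other effect sizes is an assumption, and all values are hypothetical.

```r
# Simulated scores for two groups (all values are hypothetical).
set.seed(7)
dat <- data.frame(
  gender   = factor(rep(c("male", "female"), each = 40)),
  PAL_cell = c(rnorm(40, mean = 75, sd = 7), rnorm(40, mean = 57, sd = 7))
)

M  <- tapply(dat$PAL_cell, dat$gender, mean)
SD <- tapply(dat$PAL_cell, dat$gender, sd)
N  <- tapply(dat$PAL_cell, dat$gender, length)

# Pooled SD weights each group's variance by its degrees of freedom.
sd_pooled <- sqrt(((N[1] - 1) * SD[1]^2 + (N[2] - 1) * SD[2]^2) /
                    (N[1] + N[2] - 2))

# Cohen's d: mean difference in pooled-SD units (first level minus second).
d <- unname((M[1] - M[2]) / sd_pooled)
d
```

With a pooled-variance t-test, the same d can be recovered as t multiplied by sqrt(1/n1 + 1/n2), which is a handy cross-check.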
3) Power: Determine the number of participants you should have used in this experiment given the effect size you found above.

```{r power1}

```

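
A sketch of the sample-size calculation using `power.t.test()` from base R's `stats` package. The effect size of 0.5, the 80% power target, and the .05 alpha level are placeholders; substitute the d you actually found.

```r
# Placeholder effect size (hypothetical); replace with the d you computed.
d <- 0.5

# With sd = 1, delta is interpreted in standard-deviation units (Cohen's d).
needed <- power.t.test(
  n           = NULL,          # solve for sample size
  delta       = d,
  sd          = 1,
  sig.level   = 0.05,
  power       = 0.80,
  type        = "two.sample",
  alternative = "two.sided"
)
needed

# n is reported per group, so round up and double for the total.
n_per_group <- ceiling(needed$n)
n_total <- 2 * n_per_group
c(per_group = n_per_group, total = n_total)
```

For the dependent t-test, the same function accepts `type = "paired"`, in which case `n` is the number of pairs.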
4) Graphs: Include a bar graph of these results.

```{r graph1}

```

5) Write up: include an APA style results section for this analysis (just the t-test not all the data screening).
# Dependent t-test:
6) Run a dependent t-test to tell if there are differences in the cell phone and handheld accelerometer results.
    a. Include means and sds for your groups.
    b. Is there a significant difference in the ratings?

```{r dep1}

```

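
A minimal sketch of the dependent (paired) t-test on simulated scores, where each participant has both a cell phone and an accelerometer value. All values are hypothetical.

```r
# Simulated paired scores (all values are hypothetical).
set.seed(11)
n <- 60
PAL_cell <- rnorm(n, mean = 68, sd = 10)
PAL_acc  <- PAL_cell - 4 + rnorm(n, mean = 0, sd = 8)
dat <- data.frame(PAL_cell, PAL_acc)

# Means and SDs for each measurement.
sapply(dat, mean)
sapply(dat, sd)

# Paired t-test: tests whether the mean difference score is zero.
result <- t.test(dat$PAL_cell, dat$PAL_acc, paired = TRUE)
result

# The same t by hand: mean difference over its standard error.
diff_scores <- dat$PAL_cell - dat$PAL_acc
t_by_hand <- mean(diff_scores) / (sd(diff_scores) / sqrt(n))
t_by_hand
```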
7) Effect size: What is the effect size for this difference? Be sure to list which effect size you are using.

```{r effect2}

```

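
For the paired design, one option is d based on difference scores (often written d_z), sketched below on simulated data. Other variants exist, such as d using the average of the two SDs, so state which one you report. All values are hypothetical.

```r
# Simulated paired scores (all values are hypothetical).
set.seed(11)
n <- 60
PAL_cell <- rnorm(n, mean = 68, sd = 10)
PAL_acc  <- PAL_cell - 4 + rnorm(n, mean = 0, sd = 8)

# d_z: mean of the difference scores divided by their SD.
diff_scores <- PAL_cell - PAL_acc
d_z <- mean(diff_scores) / sd(diff_scores)

# Alternative: mean difference over the average of the two SDs.
d_av <- (mean(PAL_cell) - mean(PAL_acc)) / ((sd(PAL_cell) + sd(PAL_acc)) / 2)

c(d_z = d_z, d_av = d_av)
```

Note that d_z equals the paired t divided by sqrt(n), so it can be checked against the t-test output.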
8) Power: Determine the number of participants you should have used in this experiment given the effect size you found above.

```{r power2}

```

9) Graphs: Include a bar graph of these results.

```{r graph2}

```

10) Write up: include an APA style results section for this analysis (just the t-test not all the data screening).
# Theory:
11) List the null hypothesis for the dependent t-test.
12) List the research hypothesis for the dependent t-test.
13) If the null were true, what would we expect the mean difference score to be?
14) If the null were false, what would we expect the mean difference score to be?
15) In our formula for dependent t, what is the estimation of systematic variance?
16) In our formula for dependent t, what is the estimation of unsystematic variance?

male NA
female NA
female NA
female NA
female NA

gender PAL_cell PAL_acc
male 72.4006570351277 85.1370858719345
76.2062996236089 74.6959138225082
73.9894180950249 66.7842060128155
70.8665372809887 65.4232948393563
54.8597935442081 67.4557297142323
77.9428213253685 83.2949210027534
69.106757360934 66.7267940307791
75.0644352678202 72.3854161281744
67.690800874886 51.9653490922108
73.0956931780271 74.5262699893203
NA 52.3786938575334
80.7885433804063 76.2271047043759
69.1510462374321 55.7394383985716
78.6761864584267 75.5666101223992
82.1840134797691 69.3291091417178
72.5881982207414 63.8441979148815
60.8752944426126 69.8522165528652
85.4194894102543 76.4244170597175
70.3795418254552 71.3548359622226
83.9443234031233 68.9070753560881
74.7783439061255 72.5852818708896
81.9823120573364 82.6472765986365
71.1537383216659 65.446526374528
74.1502037172106 68.2750448245837
77.6928590730127 68.5426984264035
70.8320280343964 70.7658184142912
63.3136123988265 61.3697366637545
73.0270354117649 74.1959036113411
72.0998103355967 66.9589501205765
63.9739048278267 64.3910024134725
72.5867264483627 70.3243824046032
65.1000733364443 63.2019880967467
78.4483955038249 69.2453539671429
90.2097428580619 74.2856646491723
76.2933898266054 69.6312660388654
84.4844617874705 82.241034494705
85.0554396824544 85.0694356496738
73.5877926539369 71.5126424751607
77.7323610551133 65.7271179464633
71.8797101617864 62.6315166212805
74.2292069418433 79.424840043561
79.1233476325559 77.7298876098711
77.819384036623 79.0504868020557
70.5362858869002 62.966500312717
70.2123528654554 69.7462910930523
73.0988593385123 75.3487112135503
73.0561505266601 60.7000763505234
71.4232123393147 60.549204963135
67.3508041186572 61.5220529118362
67.76346409569 59.402077669912
75.1795789090658 76.6171647013762
74.368405753655 67.9810318743127
74.7597829003647 80.6204749350757
85.8342232484146 84.7287736856981
59.0195558703319 58.8082479939406
77.9278594960994 69.730726605992
73.044817154869 64.7612654776963
72.274328833067 72.4651573922092
69.3051195146289 62.8878982708685
83.5182436761721 71.1858044277597
81.2188059597867 72.3438776551328
78.2664034073347 74.3044235344201
62.038733675014 59.7568589351898
70.1339026109148 61.7161203295751
72.745881780121 62.3704110527748
79.7551458199661 65.2151817402737
78.8219443792875 74.0659868010663
73.4362357962979 70.2232850790912
75.8394533142271 73.7603701723708
81.9345948789685 81.6726306622105
80.7780112459981 77.8168191770759
64.4389399660445 65.0891742238283
72.0851556073606 70.262435230257
78.0031596452471 78.9446639567774
83.2007915610454 76.8252997817864
70.884176513492 70.5079226605996
63.5767438349245 69.446648205887
63.1693061799022 67.0352126723534
72.5628327633008 62.170790184715
89.1763118099635 89.5578875531754
76.0563325757639 69.8494489157275
76.4530106412504 74.1872279784795
74.7441033759586 70.9552462013045
72.7007153532424 77.6801760009618
57.1097942447942 50.9140622561547
69.5381197904941 64.6277077720865
79.494275229266 77.8863848615241
76.5363085307671 65.100208652412
61.8604152881768 55.3871871381033
72.3791138110899 63.3832401383328
84.8051305473612 93.6115271073327
71.0076992718115 63.2341214060054
80.9325286599697 66.9488723062493
64.3073299954988 64.1717374509959
87.2804135659903 88.5637895094386
88.8566033134163 73.4761506320622
82.4864317997046 77.6827696888457
female 51.616168395204 58.6858930718538
65.1843694427764 57.702425741375
57.6460878839399 35.8330758962889
52.6546523844225 58.333214584317
52.7997212922793 49.2724870069073
61.226214758117 64.8385164463736
53.1153595876983 51.3802294811831
54.9720379204206 62.5531537959167
69.5866632975411 69.7654607399738
46.1836249123455 48.7005481878049
56.9946920845849 54.3556811001809
52.9654978248591 50.1938501196584
63.7205270732329 41.0057945673038
40.4447355736901 32.0332467565067
56.4387580922722 48.3211368945957
57.4474638303147 39.1965148497016
53.8853036603842 46.0202933783171
57.5615230471273 62.8306894177542
56.1749122849326 58.9976122775338
64.1352918514364 66.3472519699943
67.4802958845261 49.4521766855608
52.1348583779612 47.6347432456076
58.0591849873238 66.4055373507641
59.4533903316241 68.9282180713862
49.8982379566972 33.1053584564401
53.0529496507504 46.007294287504
64.138642177445 51.4168044511176
55.0272495005356 53.5706395032084
62.1159997505588 63.6384850714131
54.1614958447524 47.7875489495303
63.8642349286339 67.8321221810552
50.2286092147927 42.4014357623181
68.8569392117939 52.2712258484566
62.0625527496114 53.7929934466563
44.5148073957955 47.592812666847
48.9616359131003 55.8702159171738
56.8109523474601 60.9019862401383
47.6909666747157 43.8167161247407
53.1983381043173 57.4435810163839
75.2794214604864 68.5921001920408
55.910375914201 59.5351848982978
62.5187947298996 51.3387619611688
64.5380154166359 60.295703326528
51.5160392013021 44.7848896160542
58.1462923479262 60.5163145513654
56.5248903427374 45.4096392984706
54.8504714578117 51.4023461477762
53.8797239498076 57.8816954563351
59.8720985850282 60.0534226202175
53.4419227770739 35.3819751585917
59.2258722460321 44.1019980760806
61.598928543317 51.2944549607942
62.2915484062591 42.4340785961811
50.196023090885 52.9204525565029
50.0489972545898 36.1108458018804
57.4584538278174 40.1313131919599
56.1628882291369 53.3286883192153
41.3326138273037 32.9287056271615
56.1845480014142 49.9206677865009
59.3255642770734 52.4765728718719
43.6611457318824 46.5159064523431
48.475687689923 61.8942629272446
55.5744324171805 55.9475791431643
48.0797120780012 49.8802020437662
51.1149578338325 42.3755553934392
71.8395207127395 65.9105078442087
67.1551365153855 63.3566794159513
50.4912137243717 52.6298669162426
57.4535107424998 64.459078261412
53.1241348361738 46.8776245290464
78.4143751483127 72.4168679609986
63.8193489545053 46.1935591286504
65.0457959672517 66.7943293695867
57.3829445713105 51.6385184919046
54.1165704063913 58.8588059622564
62.9863042887545 53.924260267259
62.0773848345567 67.6852738372381
52.8070608618836 47.7286111634228
62.4811637151255 45.3512695918287
63.0707169540065 51.9111986070632
56.0193254790578 63.304253016477
56.4543595630912 54.2613025436027
55.0043438280189 68.8073777952618
48.8297631664403 23.6775691899542
59.7746897745555 48.8845264077475
46.994483410107 47.1793791441458
54.9882553143553 49.9612944208481
60.5237773187336 45.7733174149099
68.869835837059 82.9969108375415
54.7216388851341 52.0388453026458
48.9675338582696 57.144612047691
38.324566178386 50.865215973678

ANLY 500 – Principles of Analytics I

Final Project Guidelines

The purpose of the project is to learn how to formulate a problem statement or research question, determine how best to find a solution to the stated problem or answer the research question, carry out that work, and then develop a final written report and presentation. The project may be team-based or individual; I leave the choice to you. For team projects, individual grades will include points for how well each member contributed to the team effort.

The course project has three (3) deliverables:

1) Project proposal,

2) Presentation and,

3) Final Report

Each of these deliverables will be described in the paragraphs below. As an overview, each team will select a company, organization, or industry to target as their focus. Collect some data that will allow you to develop a case study to address a problem. Each case study presents a situation, challenge, or problem the company, organization, or industry has had or is having. The primary objective of the course project is to determine how analytics could or can help the respective company address the situation or overcome the challenge or problem it is facing.

To do this each team must review the case study, formulate a problem statement or research question as appropriate, and then identify the appropriate analytics methods or techniques to complete an analysis where possible.

Part 1:
Each team must develop a proposal as described below, worth 20% of the grade. For the proposal due date, refer to Moodle.

Part 2:
Next, the team should develop their presentation as described below, worth 40% of the grade. Final presentations are due by the Third Executive Meeting.

Part 3:
The third piece of the final project is the final written report as described below, worth 40% of the grade. For the final report due date, refer to Moodle.

1. Project Proposal

The project proposal is intended to introduce the company and its situation, problem or challenge. It should include all relevant information for that introduction. The proposal should try to answer the following questions:

· What is the problem you are trying to solve or question you are trying to answer?

· What data do you need?

· What work do you plan to do in the project?

· Which algorithms/techniques/models do you plan to use/develop? Be as specific as you can.

· How will you evaluate what you’ve done?

· What do you expect to submit/accomplish by the end of the project?

Proposal Requirements:

· 1-2 pages

· 12 pt font

· Times New Roman.

· Word or pdf.

· Double Spacing.

· APA formatting


2. Presentation

By the time your presentation is due, you should have completed at least 90% of your project work. The presentation can serve as a draft of your final report without your final analysis and results, although I do suggest having at least test results of a model. You should include at least the following in your presentation:

· What the problem is that you are trying to solve or question you are trying to answer.

· All relevant background information, including any relevant literature you have used or will use.

· The overall process you will follow for the entire project.

· A description of your data including how you obtained it.

· A description of any relevant, interesting exploratory data analyses.

· A description of the methods/techniques/tools/algorithms you have used or will use to complete the project. Include test results if applicable.

· A description of the challenges you have had working on the project.

· A discussion of the parts of the project that have been completed.

· A discussion of the parts of the project that remain to be completed.

· A discussion of how you will finish the final project report and presentation.

Presentation Requirements:

· 10 slides minimum

· ppt

· APA formatting

3. Final Project Report

The final report and presentation should cover virtually everything about the project. It should cover the situation, problem or challenge that required attention, the relevant background, related work, data, and technical details of the analysis, conclusions and possible directions for future work. It is recognized that not all of the following sections will pertain to each report. However, it is strongly recommended that these section topics be used as a guideline for your final project reports. Final presentations can follow your final report in text and graphical content.

Introduction, motivation and general description of the situation, problem or challenge.

· Following the proposal and status report, what is the situation, problem or challenge you are addressing?

· What preliminary examination leads you to believe analytics could help?

· What are the shortcomings of the current work/analysis that analytics could help with?

Related work.

· Provide a thorough background for the project; e.g. about the company, about the situation, problem or challenge, about other companies that have undergone similar situations, problems or challenges and how they handled them or did not, etc.

· How does this project relate to other work that has been done on this situation, problem or challenge?

Data

· Give a complete description of the data you use during the project, including any you reject.

· Provide the source(s) of your data.

· Provide a detailed description of your data.

· Provide any exploratory data analyses you complete.

Technical Approach

· Give a detailed description of the process for your entire project.

· Give a detailed description of your approach to the analytics you have proposed to use, including any algorithms, methods, tools, or techniques. You do not have to describe well-known approaches themselves, e.g. linear regression. You do have to describe how you applied the approach you used.

Test and evaluation

· Describe how you test your approach to ensure that it is valid.

· Discuss the validity of your approach.

· Describe how you will evaluate your results and/or conclusions including any specific metrics, output data, completed analyses, etc.

· Discuss the baseline you will use to compare your results to.

· Discuss how well your approach worked to address the situation or challenge, solve the problem or answer the research question.

· Discuss any potential future work. For example, if you were not able to resolve the situation or problem or answer the research question what will it take to do so? What else needs to be done?

· Evaluate and report whether or not someone unfamiliar with your work could accurately replicate it.

Written work and Presentation Style

· Written work will be graded using the rubric provided.

· Presentation style will be graded on comprehensiveness and inclusiveness, as well as using the rubric provided.

Final Report Requirements:

· Refer to ANLY_500_Report_Formatting

Grading Guidelines for Deliverables:



