R Code: Homework 1

Applied Multivariate Data Analysis HW2: Due Monday, Feb 9th.

All R code/output should be well commented, with relevant outputs highlighted.

(Q1) Consider the iris data available in R.

a. Construct side-by-side boxplots of the four quantitative variables (SL, SW, PL, PW) for each
species. Do not forget to label the axes properly and give each plot a suitable title.
[Hint: you will have 3 plots, one for each species. Each plot will contain four boxplots.
See the function boxplot() in R.]
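
One possible way to lay out the three plots is sketched below. It assumes the built-in iris data frame; the short labels SL, SW, PL, PW, the titles, and the shared y-axis range are illustrative choices, not requirements of the question.

```r
# One plot per species, each containing four boxplots (one per measurement).
data(iris)
measure_cols <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
short_labels <- c("SL", "SW", "PL", "PW")   # assumed abbreviations for the x-axis

# Split the four measurement columns into one data frame per species.
by_species <- split(iris[, measure_cols], iris$Species)

op <- par(mfrow = c(1, 3))                  # three panels side by side
for (sp in names(by_species)) {
  boxplot(by_species[[sp]],
          names = short_labels,
          main  = paste("Iris", sp),
          xlab  = "Variable",
          ylab  = "Measurement (cm)",
          ylim  = range(iris[, measure_cols]))  # common scale across panels
}
par(op)                                     # restore previous graphics settings
```

Using a common ylim makes the three panels directly comparable; drop it if each species should use its own scale.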

b. Construct a pairs-plot of the four variables for each of the three species.
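
A minimal sketch of one pairs plot per species, again assuming the built-in iris data frame. The plot titles are an illustrative choice.

```r
# Scatterplot matrix of the four measurements, drawn separately for each species.
data(iris)
measure_cols   <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
species_levels <- levels(iris$Species)

for (sp in species_levels) {
  pairs(iris[iris$Species == sp, measure_cols],
        main = paste("Pairs plot:", sp))
}
```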

c. Define the vector x = [Sepal.Length, Sepal.Width, Petal.Length, Petal.Width]^T.

The dataset has 50 observations of this vector (one for each flower) for each of the 3 species.
Thus, we have a sample of size 50 for each of the three species. Compute the sample mean (a
vector), the sample covariance matrix (a matrix) and the sample correlation matrix (a matrix)
for each species.
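
The three summaries can be computed per species along the following lines. This is a sketch assuming the built-in iris data frame; the list name species_summary is hypothetical.

```r
# Sample mean vector, covariance matrix and correlation matrix for each species.
data(iris)
measure_cols <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
by_species   <- split(iris[, measure_cols], iris$Species)

species_summary <- lapply(by_species, function(X) {
  list(mean = colMeans(X),   # 4-vector of sample means
       cov  = cov(X),        # 4 x 4 sample covariance matrix (divisor n - 1)
       cor  = cor(X))        # 4 x 4 sample correlation matrix
})

species_summary$setosa       # inspect the results for one species
```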

d. Looking at the pairs plot in (b) and the correlation matrices in (c), do you see any patterns or
differences among the species? Explain.

(Q2) Consider the skulls dataset in the HSAUR3 package in R. You will first need to install the
package in R to access the dataset. Use the ?skulls command to get more details on the data. A
snapshot of the data is shown below.

library(HSAUR3)
## Loading required package: tools
head(skulls)

a. Suppose we want to estimate the population mean of all the 4 variables for skulls with epoch
c4000BC. Write down the population, parameter, the sample and the statistic you will use to
answer the question above.

b. From the skulls data set, provide an estimate (numeric value) of the parameter mentioned
above. Explain how you obtained it.

c. Now suppose we want to estimate the population variance-covariance matrix. Which
estimator will you use? Provide a numeric estimate.

d. Compute the variance covariance matrix of the estimator of the population mean in part (b).
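
For a random sample of size n, the sample mean vector estimates the population mean, the sample covariance matrix S (divisor n - 1) estimates the population covariance matrix, and the covariance matrix of the sample mean is estimated by S / n. The sketch below wraps these in a hypothetical helper, mean_cov_summary(). The column names mb, bh, bl, nh and the epoch label c4000BC are taken to match ?skulls, and the skulls part only runs if HSAUR3 is installed.

```r
# Hypothetical helper: sample mean vector, sample covariance matrix S,
# and the estimated covariance matrix of the sample mean, S / n.
mean_cov_summary <- function(X) {
  X <- as.matrix(X)
  n <- nrow(X)
  S <- cov(X)                         # unbiased estimator, divisor n - 1
  list(n = n, xbar = colMeans(X), S = S, cov_xbar = S / n)
}

# Applied to the skulls from epoch c4000BC (assumes HSAUR3 is installed).
if (requireNamespace("HSAUR3", quietly = TRUE)) {
  data("skulls", package = "HSAUR3")
  early <- skulls[skulls$epoch == "c4000BC", c("mb", "bh", "bl", "nh")]
  print(mean_cov_summary(early))
}
```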

e. Provide an estimate for the parameter vector (μ_mb − μ_nh, μ_bh − μ_nh)^T and compute the
covariance matrix of the estimator.
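
A vector of differences of means is a linear transformation C μ of the mean vector, so it can be estimated by C x̄, whose covariance matrix is estimated by C S C' / n. The sketch below assumes the variable order (mb, bh, bl, nh) and that the same epoch (c4000BC) as in the earlier parts is intended; linear_combo_estimate() is a hypothetical helper name.

```r
# Hypothetical helper: estimate C %*% mu and the covariance of that estimator.
linear_combo_estimate <- function(xbar, S, n, C) {
  list(estimate = drop(C %*% xbar),
       cov      = C %*% S %*% t(C) / n)
}

# Contrast matrix for (mb - nh, bh - nh), assuming column order (mb, bh, bl, nh).
C <- rbind(c(1, 0, 0, -1),
           c(0, 1, 0, -1))

# Applied to the skulls data (assumes HSAUR3 is installed and epoch c4000BC is meant).
if (requireNamespace("HSAUR3", quietly = TRUE)) {
  data("skulls", package = "HSAUR3")
  X <- as.matrix(skulls[skulls$epoch == "c4000BC", c("mb", "bh", "bl", "nh")])
  print(linear_combo_estimate(colMeans(X), cov(X), nrow(X), C))
}
```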

(Q3) You can use R to answer this question.

Let A =

[  9 -2
  -2  6 ]

a) Is A symmetric?
b) Determine the eigenvalues and eigenvectors of A.
c) Write out the spectral decomposition A = P Λ P', i.e. find the matrices P and Λ.
d) Verify that P P' = P' P = I using P from part (c).
e) Find A^(-1).
f) Find the eigenvalues and eigenvectors of A^(-1).
g) Write out the spectral decomposition A^(-1) = P Λ^(-1) P', i.e. find the matrices P and Λ^(-1).
h) Find the matrices A^(1/2) and A^(-1/2). Verify that A^(1/2) A^(1/2) = A and
A^(-1/2) A^(-1/2) = A^(-1).
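
The parts of this question can be checked numerically with base R, as in the following sketch. It uses eigen() for the spectral decomposition and builds the inverse and the square roots from the same eigenvectors. The object names (P, Lambda, A_half, and so on) are illustrative.

```r
A <- matrix(c( 9, -2,
              -2,  6), nrow = 2, byrow = TRUE)

isSymmetric(A)                        # symmetry check

eig    <- eigen(A, symmetric = TRUE)
P      <- eig$vectors                 # columns are orthonormal eigenvectors
Lambda <- diag(eig$values)            # eigenvalues on the diagonal

all.equal(P %*% Lambda %*% t(P), A)   # spectral decomposition reproduces A
all.equal(P %*% t(P), diag(2))        # P P' = I
all.equal(t(P) %*% P, diag(2))        # P' P = I

# Same eigenvectors, transformed eigenvalues.
A_inv      <- P %*% diag(1 / eig$values)       %*% t(P)
A_half     <- P %*% diag(sqrt(eig$values))     %*% t(P)
A_neg_half <- P %*% diag(1 / sqrt(eig$values)) %*% t(P)

all.equal(A_inv, solve(A))                    # agrees with the direct inverse
all.equal(A_half %*% A_half, A)               # A^(1/2) A^(1/2) = A
all.equal(A_neg_half %*% A_neg_half, A_inv)   # A^(-1/2) A^(-1/2) = A^(-1)
```

Eigenvectors are only determined up to sign, so the columns of P may differ in sign from a hand calculation while still giving the same decomposition.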

(Q4) Consider the following 3 matrices A, B and C, and determine which of these
matrices are positive definite.

A =

[ 1    0.5
  0.5  1.25 ]

B =

[ 1    0.5
  0.5  0.26 ]

C =

[ 1    0.5
  0.5  0.01 ]
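
A symmetric matrix is positive definite exactly when all of its eigenvalues are strictly positive, which gives a simple numerical check. The helper name is_positive_definite() and the tolerance 1e-8 are assumptions made for this sketch.

```r
# Hypothetical helper: TRUE if M is symmetric with all eigenvalues above tol.
is_positive_definite <- function(M, tol = 1e-8) {
  isSymmetric(M) &&
    all(eigen(M, symmetric = TRUE, only.values = TRUE)$values > tol)
}

A <- matrix(c(1, 0.5, 0.5, 1.25), nrow = 2)
B <- matrix(c(1, 0.5, 0.5, 0.26), nrow = 2)
C <- matrix(c(1, 0.5, 0.5, 0.01), nrow = 2)

pd_check <- sapply(list(A = A, B = B, C = C), is_positive_definite)
pd_check
```

For a 2 x 2 symmetric matrix the same conclusion follows by hand from the leading principal minors: the top-left entry and the determinant must both be positive.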

 
 
 

(Q5) Suppose our multivariate data have sample covariance matrix

S =

[  2 -3  2
  -3  6  4
   2  4  3 ]

a) Based on this covariance matrix, how many columns (variables) does the original
data matrix have? Can you tell how many rows the original data matrix has?

b) Find the inverse of S.

c) Find and write the correlation matrix for this data set.
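
Parts (b) and (c) can be checked in R as sketched below. The correlation matrix is obtained by scaling S with the inverse square roots of its diagonal, R = D^(-1/2) S D^(-1/2), which is what the base function cov2cor() computes. The object names are illustrative.

```r
S <- matrix(c( 2, -3,  2,
              -3,  6,  4,
               2,  4,  3), nrow = 3, byrow = TRUE)

ncol(S)                       # number of variables in the original data matrix

S_inv <- solve(S)             # inverse of S
all.equal(S %*% S_inv, diag(3))

# Correlation matrix: divide each covariance by the two standard deviations.
D_inv_sqrt <- diag(1 / sqrt(diag(S)))
R <- D_inv_sqrt %*% S %*% D_inv_sqrt
all.equal(R, cov2cor(S))      # matches the built-in conversion
R
```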

(Q6) The air pollution data set is available on Canvas.

For this problem, we will focus only on the first 16 observations (cities).

You can read the data into R (as a data frame) with the code:

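# Read the full data set; the first row of the file holds the column names.
airpol.full <- read.table("airpoll.txt", header = TRUE)

# Names of the first 16 cities (column 1).
city.names <- as.character(airpol.full[1:16, 1])

# The seven numeric variables (columns 2 to 8) for the first 16 cities.
airpol.data.sub <- airpol.full[1:16, 2:8]

# Perform your analysis on the 'airpol.data.sub' subset.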

(a) Use R to calculate the sample covariance matrix and the sample correlation matrix
for this data subset.

(b) Identify which pairs of variables seem to be strongly associated. Write a paragraph
describing the nature (strength and direction) of the relationship between these variable
pairs.
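
The sketch below computes both matrices for the subset and lists the variable pairs whose absolute correlation exceeds a cutoff. The helper strong_pairs() and the cutoff of 0.7 are assumptions made for illustration; what counts as "strongly associated" is a judgment call. The file-reading part only runs if airpoll.txt is in the working directory.

```r
# Hypothetical helper: variable pairs with |correlation| at or above a cutoff,
# sorted from strongest to weakest.
strong_pairs <- function(R, cutoff = 0.7) {
  idx <- which(upper.tri(R) & abs(R) >= cutoff, arr.ind = TRUE)
  out <- data.frame(var1 = rownames(R)[idx[, 1]],
                    var2 = colnames(R)[idx[, 2]],
                    r    = R[idx],
                    stringsAsFactors = FALSE)
  out[order(-abs(out$r)), ]
}

if (file.exists("airpoll.txt")) {
  airpol.full     <- read.table("airpoll.txt", header = TRUE)
  airpol.data.sub <- airpol.full[1:16, 2:8]

  S_air <- cov(airpol.data.sub)   # sample covariance matrix
  R_air <- cor(airpol.data.sub)   # sample correlation matrix

  print(round(R_air, 2))
  print(strong_pairs(R_air))      # sign of r gives the direction of association
}
```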

City Rainfall Education Popden Nonwhite NOX SO2 Mortality
akronOH 36 11.4 3243 8.8 15 59 921.9
albanyNY 35 11.0 4281 3.5 10 39 997.9
allenPA 44 9.8 4260 0.8 6 33 962.4
atlantGA 47 11.1 3125 27.1 8 24 982.3
baltimMD 43 9.6 6441 24.4 38 206 1071.0
birmhmAL 53 10.2 3325 38.5 32 72 1030.0
bostonMA 43 12.1 4679 3.5 32 62 934.7
bridgeCT 45 10.6 2140 5.3 4 4 899.5
bufaloNY 36 10.5 6582 8.1 12 37 1002.0
cantonOH 36 10.7 4213 6.7 7 20 912.3
chatagTN 52 9.6 2302 22.2 8 27 1018.0
chicagIL 33 10.9 6122 16.3 63 278 1025.0
cinnciOH 40 10.2 4101 13.0 26 146 970.5
clevelOH 35 11.1 3042 14.7 21 64 986.0
colombOH 37 11.9 4259 13.1 9 15 958.8
dallasTX 35 11.8 1441 14.8 1 1 860.1
daytonOH 36 11.4 4029 12.4 4 16 936.2
denverCO 15 12.2 4824 4.7 8 28 871.8
detrotMI 31 10.8 4834 15.8 35 124 959.2
flintMI 30 10.8 3694 13.1 4 11 941.2
ftwortTX 31 11.4 1844 11.5 1 1 891.7
grndraMI 31 10.9 3226 5.1 3 10 871.3
grnborNC 42 10.4 2269 22.7 3 5 971.1
hartfdCT 43 11.5 2909 7.2 3 10 887.5
houstnTX 46 11.4 2647 21.0 5 1 952.5
indianIN 39 11.4 4412 15.6 7 33 968.7
kansasMO 35 12.0 3262 12.6 4 4 919.7
lancasPA 43 9.5 3214 2.9 7 32 844.1
losangCA 11 12.1 4700 7.8 319 130 861.8
louisvKY 30 9.9 4474 13.1 37 193 989.3
memphsTN 50 10.4 3497 36.7 18 34 1006.0
miamiFL 60 11.5 4657 13.5 1 1 861.4
milwauWI 30 11.1 2934 5.8 23 125 929.2
minnplMN 25 12.1 2095 2.0 11 26 857.6
nashvlTN 45 10.1 2082 21.0 14 78 961.0
newhvnCT 46 11.3 3327 8.8 3 8 923.2
neworlLA 54 9.7 3172 31.4 17 1 1113.0
newyrkNY 42 10.7 7462 11.3 26 108 994.6
philadPA 42 10.5 6092 17.5 32 161 1015.0
pittsbPA 36 10.6 3437 8.1 59 263 991.3
portldOR 37 12.0 3387 3.6 21 44 894.0
provdcRI 42 10.1 3508 2.2 4 18 938.5
readngPA 41 9.6 4843 2.7 11 89 946.2
richmdVA 44 11.0 3768 28.6 9 48 1026.0
rochtrNY 32 11.1 4355 5.0 4 18 874.3
stlousMO 34 9.7 5160 17.2 15 68 953.6
sandigCA 10 12.1 3033 5.9 66 20 839.7
sanfrnCA 18 12.2 4253 13.7 171 86 911.7
sanjosCA 13 12.2 2702 3.0 32 3 790.7
seatleWA 35 12.2 3626 5.7 7 20 899.3
springMA 45 11.1 1883 3.4 4 20 904.2
syracuNY 38 11.4 4923 3.8 5 25 950.7
toledoOH 31 10.7 3249 9.5 7 25 972.5
uticaNY 40 10.3 1671 2.5 2 11 912.2
washDC 41 12.3 5308 25.9 28 102 968.8
wichtaKS 28 12.1 3665 7.5 2 1 823.8
wilmtnDE 45 11.3 3152 12.1 11 42 1004.0
worctrMA 45 11.1 3678 1.0 3 8 895.7
yorkPA 42 9.0 9699 4.8 8 49 911.8
youngsOH 38 10.7 3451 11.7 13 39 954.4
