LaTeX

Could you rewrite the presentation using LaTeX?
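In answer to the question above, a minimal LaTeX sketch (the beamer class and the layout are assumptions of mine, not the lecturer's template) showing how the title page and the opening slide could be typeset:

\documentclass{beamer}
\usepackage{amsmath}  % for display math in later slides

\title{STA 811: Statistical Inference}
\author{Dr. Argwings Otieno}
\institute{Department of Mathematical Statistics}
\date{19th January 2021}

\begin{document}

\frame{\titlepage}

\begin{frame}{Introduction}
  \begin{itemize}
    \item A variable $X$ follows a distribution $f(x,\theta)$,
          where $\theta \in \Theta$ is unknown.
    \item $\theta$ is a parameter; $\Theta$ is the parameter space.
    \item Problem: estimate $\theta$ or some function $\tau(\theta)$
          from a sample $(X_1,\dots,X_n)$.
  \end{itemize}
\end{frame}

\end{document}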


LECTURE 1 (3)
STA 811

STATISTICAL INFERENCE
By
Dr. Argwings Otieno

Department of Mathematical Statistics
19th January 2021

INTRODUCTION

◼ A variable $X$ follows a particular distribution.
◼ The distribution is represented by $f(x, \theta)$,
◼ where $\theta \in \Theta$ is unknown.
◼ $\theta$ is called a parameter.
◼ $\Theta$ is called the parameter space.
◼ Problem: estimation of $\theta$ or $\tau(\theta)$, where $\tau(\theta)$ is some function of $\theta$.
◼ Estimation is based on a sample $(X_1, X_2, X_3, \dots, X_n)$.
◼ There are two approaches to estimation:


1. Point estimation: gives a single value obtained from a specific estimator $\delta(X_1, X_2, X_3, \dots, X_n)$.
2. Interval estimation: constructs an interval defined by two statistics $\delta_1(X_1, \dots, X_n)$ and $\delta_2(X_1, \dots, X_n)$, where $\delta_1 < \delta_2$; then $\theta$ or $\tau(\theta)$ will fall within the interval with some specified probability.

Some properties of good estimators:
1. Unbiasedness: the estimator $\hat\theta$ on average estimates the true parameter: $E(\hat\theta) = \theta$ for all $\theta$.
2. Sufficiency: the estimator contains all the information in the sample about the parameter.

3. Consistency: as the sample becomes large, the estimator tends to the true parameter.

◼ Simple consistency: for every $\epsilon > 0$,
$$\lim_{n \to \infty} P(\theta - \epsilon < \hat\theta_n < \theta + \epsilon) = 1, \quad \text{for all } \theta \in \Theta.$$

◼ Mean squared error (MSE) consistency:
$$\lim_{n \to \infty} E(\hat\theta_n - \theta)^2 = 0, \quad \text{for all } \theta \in \Theta.$$

◼ Note: MSE consistency implies simple consistency.
◼ $\hat\theta_n$ is MSE consistent if
1. $\hat\theta_n$ is (asymptotically) unbiased, and
2. $\operatorname{Var}(\hat\theta_n) \to 0$ as $n \to \infty$.
◼ Other properties:
◼ Minimum variance unbiased estimator (MVUE)
◼ Best asymptotically normal (BAN):
$$\sqrt{n}\,(\hat\theta_n - \theta) \xrightarrow{\ d\ } N\!\left(0, \sigma^2(\theta)\right).$$
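The two MSE conditions follow from the standard bias-variance decomposition of the MSE (a textbook identity, not from the slides):

\begin{align*}
E(\hat\theta_n - \theta)^2
  &= E\bigl[(\hat\theta_n - E\hat\theta_n) + (E\hat\theta_n - \theta)\bigr]^2 \\
  &= \operatorname{Var}(\hat\theta_n) + \bigl[\operatorname{Bias}(\hat\theta_n)\bigr]^2,
\end{align*}

since the cross term has expectation zero. Hence the MSE tends to zero exactly when both the variance and the bias tend to zero.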

◼ The class of unbiased estimators is infinite.
■ Example: let $(X_1, X_2, \dots, X_n)$ be a sample from $f(x, \theta)$.
■ If $E(X_i) = \theta$ for all $\theta$, then $X_i$ is an unbiased estimator.
■ Also, $\bar X = \frac{1}{n}\sum_{i=1}^{n} X_i$ is an unbiased estimator:
$$E(\bar X) = E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \theta.$$
■ In general, $\sum_{i=1}^{n} \kappa_i X_i$ is an unbiased estimator if
$$E\!\left(\sum_{i=1}^{n} \kappa_i X_i\right) = \sum_{i=1}^{n} \kappa_i E(X_i) = \sum_{i=1}^{n} \kappa_i \theta = \theta, \quad \forall\, \theta.$$
■ This happens if $\sum_{i=1}^{n} \kappa_i = 1$.
■ Hence unbiased estimators are infinite in number.
■ We need a further restriction on the class of unbiased estimators.
◼ Minimum Variance Unbiased Estimator (MVUE)
■ An estimator $\delta(X_1, X_2, \dots, X_n)$ is MVUE if
1. $E[\delta(X_1, X_2, \dots, X_n)] = \theta$ for all $\theta$, and
2. $\operatorname{Var}[\delta(X_1, X_2, \dots, X_n)]$ is minimum among all unbiased estimators.
◼ Likelihood Function
■ If $(X_1, X_2, \dots, X_n)$ is a random sample from $f(x, \theta)$, the likelihood function is $L(\theta) = \prod_{i=1}^{n} f(x_i, \theta)$.
◼ Sufficient Statistic
■ A statistic $T = T(X_1, X_2, \dots, X_n)$ is sufficient for $\theta$ if
$$L(\theta) = h(T, \theta)\, g(x_1, x_2, \dots, x_n),$$
■ where $g(x_1, x_2, \dots, x_n)$ is a function of $x_1, x_2, \dots, x_n$ alone,
■ or a constant.
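A quick Monte Carlo sketch in R (the true mean $\theta = 5$, $n = 20$, and the particular weights are illustrative assumptions) showing that any weighted average with weights summing to one is unbiased, while the equally weighted mean has the smaller variance:

set.seed(1)
theta <- 5      # true mean, assumed for illustration
n     <- 20
reps  <- 1e5

# Weights that sum to 1 but are not all equal to 1/n
kappa <- (1:n) / sum(1:n)

est_mean  <- replicate(reps, mean(rnorm(n, mean = theta)))
est_kappa <- replicate(reps, sum(kappa * rnorm(n, mean = theta)))

mean(est_mean)   # close to 5: the sample mean is unbiased
mean(est_kappa)  # also close to 5: the weighted version is unbiased too
var(est_mean)    # but the equal-weight mean has smaller variance ...
var(est_kappa)   # ... than the unequal-weight estimator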

LECTURE 1(9)
INTRODUCTION TO

REGRESSION ANALYSIS

COURSE LECTURER: DR. JULIUS K. KOECH, PHD

DEPARTMENT: MATHEMATICS & COMPUTER SCIENCE
■ In correlation, the two variables are treated as equals.
■ In regression analysis, one variable is considered the independent (predictor, covariate) variable, denoted by $X$, and the other the dependent (outcome, response) variable, usually denoted by $Y$.

Regression: Overview

Basic Idea: Use data to identify relationships between variables, and use these relationships to
make predictions.

What is “Linear”?
■ Remember this: $y = mx + b$, the equation of a straight line.

[Figure: functional form of a model with a line of best fit]


■ The goal of regression analysis is to express the response/outcome variable as a function of the predictor variables.
■ Hence, non-representative or improperly collected data result in a poor fit and wrong conclusions.

Effective Use of Regression Data
Thus, for effective use of regression analysis one must:
■ Investigate the data collection process and ensure a good data management plan
■ Identify any limitations in the data collected
■ Restrict conclusions accordingly, i.e. discuss only important findings

Uses of Regression Analysis
■ Making predictions
■ Model specification
■ Parameter estimation

The Linear Regression Model
■ A linear regression model expresses the conditional expectation of $y$ given $X$, $E(y \mid X = x)$, as a linear function of $X$.
■ Each sample observation is assumed to be generated by an underlying process described by the linear model.
■ We want to find the best line (linear function) $y = f(x)$ to explain the data.

[Figure: scatter plot of the data with the line of best fit]

Linear Regression Model

The relationship between the variables is a linear function:
$$y = \beta_0 + \beta_1 x + \epsilon$$
where:
$y$ is the dependent (response) variable,
$\beta_0$ is the population $y$-intercept,
$\beta_1$ is the population slope,
$x$ is the independent (explanatory) variable,
$\epsilon$ is the random error.

Important questions in regression analysis:
● What is the association between $y$ and $x$?
● How can changes in $y$ be explained by changes in $x$?
● What is the functional relationship between $y$ and $x$?
A functional relationship is symbolically written as $y = f(x)$.


[Figure: plot of the piecewise function $f(x) = x \log(x)$, whose tangent slope approaches infinity at zero]

$y = \beta_1 x$, where $\beta_1$ is the SLOPE of the line.
Example of a linear relationship: $y = \beta_0 + \beta_1 x$, where $\beta_0$ is the intercept and $\beta_1$ is the slope.
◼ Concerns:
◼ The proposed functional relationship will not fit exactly, i.e. something is either wrong with the data (errors in measurement), or the model is inadequate (errors in specification).
◼ The relationship is not truly known until we assign values to the parameters of the model.
The possibility of errors in the proposed relationship is acknowledged in the functional symbolism as follows: $y = f(x) + \epsilon$, where $\epsilon$ is a random variable representing the result of both errors in model specification


and measurement. The variance of $\epsilon$ is the background variability with respect to which we will assess the significance of the factors (explanatory variables).

Linear regression model (simple, and with two or more covariates):
$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \quad i = 1, 2, \dots, n$$
$$y_i = \alpha + \beta_1 x_{1i} + \beta_2 x_{2i} + \epsilon_i$$

Regression estimation with binary and categorical variables:
1. $y_i = \alpha + \beta_1\,\text{age} + \beta_2\,\text{gender}$, with gender coded male = 1, female = 0
2. $y_i = \alpha + \beta_1\,\text{age} + \beta_2\,\text{education}$, with education coded primary = 1, secondary = 2, university = 3
3. $y_i = \alpha + \beta_1\,\text{age} + \beta_2\,\text{gender} + \beta_3\,\text{education}$, with gender coded as above and education coded on a 1-4 scale

Interpreting Regression Coefficients
● In equation (1) above, assume $\alpha = 2.35$, $\beta_1 = 5.75$ and $\beta_2 = 8.20$.

Interpreting linear coefficients:
● Fix the values of the parameters in equation (1) and discuss the effect of these covariates on the outcome:
$$y_i = 2.35 + 5.75\,\text{age} + 8.20\,\text{gender}, \qquad \text{gender: } 1 = \text{male},\ 0 = \text{female}$$

Effects of covariates on the response variable (assume $y_i$ is salary in US dollars):
● What is the effect of age on the response variable, salary? (See the R sketch below.)
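A small R sketch using the slide's assumed coefficients (the function name predict_salary is mine, for illustration): with gender held fixed, each extra year of age raises the predicted salary by 5.75, and at any fixed age the predicted male-female gap is 8.20.

alpha <- 2.35; b_age <- 5.75; b_gender <- 8.20   # coefficients from equation (1)

predict_salary <- function(age, gender) {        # gender: 1 = male, 0 = female
  alpha + b_age * age + b_gender * gender
}

predict_salary(30, 1) - predict_salary(29, 1)    # 5.75: effect of one extra year of age
predict_salary(30, 1) - predict_salary(30, 0)    # 8.20: male vs female gap at a fixed age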
Linear Regression

The predicted value of $y$ is given by:
$$\hat y = \hat\beta_0 + \sum_{j=1}^{p} x_j \hat\beta_j$$

The vector of coefficients $\hat\beta$ is the regression model.
If $x_0 = 1$, the formula becomes a matrix product: $\hat y = x'\hat\beta$.

The error term: another way to write the model emphasizes $\epsilon = y - f(x)$, or, emphasizing that $f(x)$ depends on unknown parameters,
$$y = f(x \mid \beta_0, \beta_1) + \epsilon.$$
What if we don't know the functional form of the relationship?
Parameter estimates are given by the following equations:
$$\hat\beta_1 = \frac{S_{xy}}{S_{xx}}, \qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar x,$$
where $S_{xy} = \sum_i (x_i - \bar x)(y_i - \bar y)$ and $S_{xx} = \sum_i (x_i - \bar x)^2$.

Significance testing in linear regression:
$$H_0: \beta_1 = \beta_2 \quad \text{vs} \quad H_1: \beta_1 \neq \beta_2$$
Steps in Regression Analysis
■ Examine the scatter plot of the data.
■ Does the relationship look linear?
■ Are there points in locations they shouldn't be?
■ Do we need a transformation?
■ Assuming a linear function looks appropriate, estimate the regression parameters. How do we do this? [Use the method of least squares]
■ Test whether there really is a statistically significant linear relationship. Just because we assumed a linear function, it does not follow that the data support this assumption. How do we test this? [F-test for variances]
■ If there is a significant linear relationship, estimate the response, $y$, for the given values of $x$, and compute the residuals.
■ Examine the residuals for systematic inadequacies in the linear model as fit to the data.
■ Is there evidence that a more complicated relationship (say, a polynomial) should be considered? Are there problems with the regression assumptions? (Residual analysis.)
■ Are there specific data points which do not seem to follow the proposed relationship?
These steps are illustrated in the R sketch below.
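A minimal sketch of these steps in R, on simulated data (the true intercept 3 and slope 2 are assumptions of the simulation):

set.seed(42)
x <- runif(50, 0, 10)
y <- 3 + 2 * x + rnorm(50)   # simulated data around the line y = 3 + 2x

plot(x, y)                   # step 1: does the relationship look linear?
fit <- lm(y ~ x)             # least-squares estimates of the parameters
summary(fit)                 # t-tests and the overall F-test for the fit
plot(x, resid(fit))          # residuals: look for systematic patterns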
Linear Model Specification
■ Identify the dependent and the independent variables of interest.
■ The dependent variable is typically an outcome, such as wages earned or attendance at a post-secondary institution.
■ The independent variables are factors known to affect the outcome variable of interest.
■ Specifying a regression model involves selecting a dependent variable and the related independent variables.


■ The type of dependent variable determines the type of regression: either a linear or a logistic regression model.

THANK YOU


LECTURE 1(10)
STA 814

MULTIVARIATE ANALYSIS
By Dr. Argwings Otieno
2020/2021

MOTIVATION

◼ Univariate: a single variable is considered.
● Examples:
● Pupils' scores in a subject
● Height of individuals
● Body Mass Index (BMI)
● Weight of babies at birth
● Blood pressure

BIVARIATE DATA
■ Bivariate means two variables.
■ Examples:
■ (Age, Weight) for babies under 5 years
■ (Age, Height) for babies under 5 years
■ (Wife, Husband) ages
■ Notation: variables $(X, Y)$; values $(x, y)$
■ Joint pdf or pmf: $f_{X,Y}(x, y)$
■ Parameters: $E(X) = \mu_X$, $E(Y) = \mu_Y$, $\operatorname{Var}(X) = \sigma_X^2$, $\operatorname{Var}(Y) = \sigma_Y^2$
■ Covariance: $\operatorname{Cov}(X, Y) = \sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)]$
■ Correlation between $X$ and $Y$:
$$\rho_{XY} = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}, \qquad -1 \le \rho_{XY} \le 1$$

In vector form, with $\mathbf{X} = (X, Y)'$:
$$E(\mathbf{X}) = \begin{pmatrix} \mu_X \\ \mu_Y \end{pmatrix}, \qquad
\operatorname{Cov}(\mathbf{X}) = \begin{pmatrix} \operatorname{Var}(X) & \operatorname{Cov}(X, Y) \\ \operatorname{Cov}(Y, X) & \operatorname{Var}(Y) \end{pmatrix}
= \begin{pmatrix} \sigma_X^2 & \sigma_{XY} \\ \sigma_{XY} & \sigma_Y^2 \end{pmatrix}
= \begin{pmatrix} \sigma_X^2 & \rho_{XY}\sigma_X\sigma_Y \\ \rho_{XY}\sigma_X\sigma_Y & \sigma_Y^2 \end{pmatrix}$$
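A short R sketch computing these sample quantities on simulated (age, weight) pairs (the data-generating numbers are arbitrary assumptions):

set.seed(7)
age    <- runif(30, 0, 5)                      # ages of under-fives
weight <- 3 + 2.5 * age + rnorm(30, sd = 0.8)  # weights, roughly linear in age

cov(age, weight)         # sample covariance
cor(age, weight)         # sample correlation, always between -1 and 1
cov(cbind(age, weight))  # the 2 x 2 variance-covariance matrix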

Basic Matrix Results
■ A matrix $A_{m \times n}$ is an $m \times n$ array of elements in $m$ rows and $n$ columns.
■ Write $A_{m \times n} = (a_{ij})$, $i = 1, 2, \dots, m$, $j = 1, 2, \dots, n$.
■ The transpose of $A$ is denoted $A'$.
■ Column vector $x_{n \times 1}$; the corresponding row vector is $x' = (x_1, x_2, \dots, x_n)$, and $x'x = \sum_{i=1}^{n} x_i^2$.
■ Matrix product: $A_{n \times m} B_{m \times r} = \left(\sum_{j=1}^{m} a_{ij} b_{jk}\right)$, $i = 1, 2, \dots, n$, $k = 1, 2, \dots, r$.
■ In general $AB \neq BA$.
■ If $A$, $B$, $C$ are matrices with rows and columns compatible, then:
$$(AB)C = A(BC)$$
$$A(B + C) = AB + AC$$
$$(A + B)C = AC + BC$$
■ Also: $(A')' = A$,
$$(A + B)' = A' + B'$$
$$(AB)' = B'A'$$
■ A matrix $A_{n \times n}$ is non-singular if $|A| \neq 0$.
■ The inverse $A^{-1}$ of a matrix $A$ is such that $AA^{-1} = A^{-1}A = I$.
■ The trace of a square matrix $A_{n \times n}$ is defined as $\operatorname{tr}(A) = \sum_{i=1}^{n} a_{ii}$.
■ Properties: $\operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B)$ and $\operatorname{tr}(AB) = \operatorname{tr}(BA)$.
■ If $A_{n \times n}$ is diagonal, then $|A| = \prod_{i=1}^{n} a_{ii}$.
■ A square matrix $T_{n \times n}$ is upper triangular if $t_{ij} = 0$ for $i > j$, and lower triangular if $t_{ij} = 0$ for $i < j$.
■ If $A_{n \times n}$ is symmetric (and positive definite), then $A = TT'$ for a triangular matrix $T$.
■ For a matrix $A_{m \times n}$:
■ the rank $r(A)$ is the number of linearly independent rows (equivalently, columns);
■ $A$ is rank deficient if $r(A) = k < \min(m, n)$.
■ The characteristic equation of $A_{n \times n}$ is $|A - \lambda I| = 0$, a polynomial of degree $n$ in $\lambda$.
■ Its solutions $\lambda_1, \lambda_2, \dots, \lambda_n$ are the EIGENVALUES, or characteristic roots.
■ For a given characteristic root $\lambda_0$, the solution $x$ to the homogeneous equation $(A - \lambda_0 I)x = 0$ is called an eigenvector.
■ A matrix $A_{n \times n}$ is ORTHOGONAL if $A^{-1} = A'$.
■ For a symmetric matrix $A_{n \times n}$, the quadratic form is
$$x'Ax = \sum_{i=1}^{n} a_{ii} x_i^2 + 2\sum_{i < j} a_{ij} x_i x_j.$$
■ A symmetric matrix $A_{n \times n}$ is:
■ positive definite if $x'Ax > 0$ for all $x \neq 0$;
■ negative definite if $x'Ax < 0$ for all $x \neq 0$;
■ positive semi-definite if $x'Ax \ge 0$;
■ negative semi-definite if $x'Ax \le 0$.
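A quick R illustration of these operations on a small symmetric matrix (the entries are arbitrary):

A <- matrix(c(2, 1,
              1, 3), nrow = 2, byrow = TRUE)  # a symmetric 2 x 2 matrix

t(A)             # transpose A'
det(A)           # determinant |A| = 5, nonzero, so A is non-singular
solve(A)         # inverse A^{-1}
sum(diag(A))     # trace tr(A) = 5
eigen(A)$values  # eigenvalues; both positive, so A is positive definite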


LECTURE 2
LEAST SQUARES METHOD OF

ESTIMATION
By Dr. Julius Koech

Department of Mathematics and Computer Science
Overview of the Regression Model

■ The regression model estimates the nature of the relationship between the dependent/outcome and independent/predictor variables:
■ the effect of changes in the measured covariates on the dependent variable,
■ the strength of the relationship, and
■ the statistical significance of the relationship.

The Bivariate and Multivariate Models

Bivariate model: (Education) x ─────> y (Income)

Multivariate model: (Education) x1, (Sex) x2, (Years of Experience) x3, (Age) x4 ─────> y (Income)

Association is not causation!!

Price of Rice <─────> Quantity of Rice Produced

Regression Line
■ The regression model is $y = \beta_0 + \beta_1 x + \epsilon$.
■ Data about $x$ and $y$ are obtained from a sample.
■ From the sample values of $x$ and $y$, estimates $b_0$ of $\beta_0$ and $b_1$ of $\beta_1$ are obtained using the least squares or another method.
■ The resulting estimate of the model is $\hat y = b_0 + b_1 x$.
■ The symbol $\hat y$ is termed "y hat" and refers to the predicted values of the dependent variable $y$ that are associated with the values of $x$.

Uses of Regression
■ The amount of change in a dependent variable that results from changes in the independent variable(s); this can be used to estimate elasticities, returns on investment in human capital, etc.
■ Attempting to determine the causes of phenomena.
■ Prediction and forecasting of sales, economic growth, etc.
■ Informing policy through the use of improved theoretical models.

Challenge with determining the line of best fit
How would you draw a line through the points? How do you determine which line "fits best"?
■ The line of "best fit" is the one for which the differences between the actual $Y$ values and the predicted $Y$ values are a minimum; this acknowledges the variability involved in linear estimation.
■ The method of least squares minimizes the sum of the squared differences (errors), SSE:
$$\sum_{i=1}^{n} (Y_i - \hat Y_i)^2 = \sum_{i=1}^{n} \hat\epsilon_i^2$$
■ The general form of the linear probability model (LPM) is:
$$Y_i = B_1 + B_2 X_{2i} + B_3 X_{3i} + \dots + B_k X_{ki} + u_i$$
■ The above equation can be written in matrix form as $Y = XB + u$, where $\hat B = (X'X)^{-1}X'Y$ and $u$ is the vector of errors.
■ In $Y = XB + u$, $Y$ is the outcome variable, $X$ is the matrix of predictors (sometimes referred to as the design matrix), and $u$ is the error term.
■ $B_1$ is the intercept.
■ $B_2$ to $B_k$ are the slope coefficients.
■ Collectively, they are the regression coefficients or regression parameters.
■ Each slope coefficient measures the (partial) rate of change in the MEAN VALUE of $Y$ for a unit change in the value of the covariate.

Linear Model in Matrix Form (illustration with a sample data set)

ID   Age (x1)   Gender (x2)   Dist_km (x3)   Wt (y)
1    18         1             5              50
2    20         0             6              60
3    25         1             2              70

■ Gender: gender of respondent, 1 = Male, 0 = Female
■ Age: age in years
■ Dist_km: distance in kilometres
■ Wt: weight of respondent in kg; this is also the response variable of interest.
Linear Model Illustration

$$Y = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 50 \\ 60 \\ 70 \end{pmatrix}, \qquad
X = \begin{pmatrix} 1 & 18 & 1 & 5 \\ 1 & 20 & 0 & 6 \\ 1 & 25 & 1 & 2 \end{pmatrix}, \qquad
\beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix}$$
Familiarize yourself with finding the inverse of an $n \times p$ matrix (data structure).

If $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, then $A^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$.
■ During your free time, read up on this concept!!
■ Matrix transpose.
■ Can you try assignment 1? (A window of two weeks will be given for submission.)
■ We will later learn how to use R to solve for parameter values in a linear model; a preview is sketched below.

The General Equation for a Linear Model
■ The general linear model in matrix form, estimating coefficients by the method of least squares, is given by the following equation:
$$\hat\beta = (X'X)^{-1}X'Y$$
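A preview in R (note that the 3-row illustration above has four unknown coefficients, so its $X'X$ is singular; this sketch therefore simulates $n = 10$ observations with assumed coefficients):

set.seed(1)
n      <- 10
age    <- round(runif(n, 18, 30))
gender <- rbinom(n, 1, 0.5)
dist   <- round(runif(n, 1, 8))
wt     <- 40 + 0.8 * age + 3 * gender - 0.5 * dist + rnorm(n)

X        <- cbind(1, age, gender, dist)        # design matrix with intercept column
beta_hat <- solve(t(X) %*% X) %*% t(X) %*% wt  # (X'X)^{-1} X'Y
beta_hat
coef(lm(wt ~ age + gender + dist))             # lm() gives the same estimates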

Thank You All!!


LECTURE 2A
Sufficient Statistics

■ Factorization criterion
■ Let $X_1, X_2, \dots, X_n$ be a random sample from $f(x, \theta)$.
■ Then the statistic $T = T(X_1, X_2, \dots, X_n)$ is sufficient iff
$$f(x_1, x_2, \dots, x_n; \theta) = h(T(x_1, \dots, x_n), \theta)\, g(x_1, \dots, x_n) = h(T, \theta)\, g(x_1, \dots, x_n).$$
■ Remark: if $T = T(X_1, \dots, X_n)$ is sufficient for $\theta$, then it is also sufficient for $\psi(\theta)$, where $\psi(\cdot)$ is a one-to-one mapping.
■ Jointly sufficient statistics
■ Let $X_1, X_2, \dots, X_n$ be a sample from
■ a pdf $f(x; \theta_1, \theta_2, \dots, \theta_r)$,
■ where $\theta_i$, $i = 1, 2, \dots, r$, are unknown parameters.
■ Then
$$T_1 = T_1(X_1, \dots, X_n),\ T_2 = T_2(X_1, \dots, X_n),\ \dots,\ T_r = T_r(X_1, \dots, X_n)$$
are JOINTLY SUFFICIENT for $\theta_1, \theta_2, \dots, \theta_r$ iff
$$f(x_1, \dots, x_n; \theta_1, \dots, \theta_r) = h(T_1(x), T_2(x), \dots, T_r(x); \theta_1, \dots, \theta_r)\, g(x_1, \dots, x_n) = h(T_1, \dots, T_r; \theta_1, \dots, \theta_r)\, g(x_1, \dots, x_n).$$

■ Example 1
■ Let $X_1, X_2, \dots, X_n$ be a random sample from Bernoulli($p$).
■ Show that (1) $\sum_{i=1}^{n} x_i$ and (2) $\bar x = \frac{1}{n}\sum_{i=1}^{n} x_i$ are sufficient for $p$.
■ (1) Solution:
■ The likelihood function is
$$L(p; x_1, \dots, x_n) = \prod_{i=1}^{n} f(x_i, p) = \prod_{i=1}^{n} p^{x_i}(1 - p)^{1 - x_i} = p^{\sum_i x_i}(1 - p)^{n - \sum_i x_i} = p^{t}(1 - p)^{n - t} = h(t, p)\, g(x_1, \dots, x_n).$$
■ Hence $T = \sum_i x_i$ is SUFFICIENT for $p$.
■ (2) Solution:
■ Write $\sum_{i=1}^{n} x_i = n\bar x$. Then:
$$L(p; x_1, \dots, x_n) = p^{n\bar x}(1 - p)^{n - n\bar x} = h(\bar x, p)\, g(x_1, \dots, x_n).$$
■ Example: let $(X_1, X_2, \dots, X_n) \sim N(\mu, \sigma^2)$.
■ Then:
$$f(x_1, \dots, x_n; \mu, \sigma^2) = \prod_{i=1}^{n} f(x_i; \mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right).$$
■ Show that $\bar x$ and $\sum_{i=1}^{n}(x_i - \bar x)^2$ are JOINTLY SUFFICIENT for $(\mu, \sigma^2)$.

■ If
$$f(x_1, \dots, x_n; \theta_1, \theta_2, \dots, \theta_r) = h(T_1(x), T_2(x), \dots, T_r(x); \theta_1, \theta_2, \dots, \theta_r)\, g(x_1, \dots, x_n)$$
$$= h_1(T_1, \theta_1)\, h_2(T_2, \theta_2) \cdots h_r(T_r, \theta_r)\, g(x_1, \dots, x_n),$$
■ then $T_i$ is SUFFICIENT for $\theta_i$, $i = 1, 2, \dots, r$.
Complete Statistics
■ A statistic $T = T(X_1, X_2, \dots, X_n)$ is said to be COMPLETE if:
$$E[\psi(T)] = 0\ \forall\,\theta \implies \psi(T) = 0\ \forall\, T,$$
■ except possibly on a set whose probability measure is 0 for all $\theta$.

Completeness
■ Example:
$$X \sim \text{Bernoulli}(p), \qquad f(x; p) = p^x (1 - p)^{1 - x}, \quad x = 0, 1.$$

Likelihood:
$$L(x_1, \dots, x_n; p) = \prod_{i=1}^{n} p^{x_i}(1 - p)^{1 - x_i} = p^{n\bar x}(1 - p)^{n - n\bar x}.$$
■ Note that $\sum_{i=1}^{n} x_i = n\bar x$ is sufficient for $p$,
■ so $T = \sum_{i=1}^{n} X_i$ is also sufficient for $p$.
■ To show that $T = \sum_{i=1}^{n} X_i$ is complete, note $T \sim \text{Binomial}(n, p)$ and set
$$E[\psi(T)] = \sum_{t=0}^{n} \psi(t)\, f(t, p) = \sum_{t=0}^{n} \psi(t)\binom{n}{t} p^{t}(1 - p)^{n - t} = \sum_{t=0}^{n} a(t)\, p^{t}(1 - p)^{n - t}, \quad \text{where } a(t) = \psi(t)\binom{n}{t},$$
equal to $0$ for all $p$.
■ Thus $a(t) = 0$ for $t = 0, 1, 2, \dots, n$,
$$\implies \psi(t)\binom{n}{t} = 0\ \forall\, t \implies \psi(t) = 0\ \forall\, t, \quad \text{since } \binom{n}{t} \neq 0.$$
■ Hence $T$ is COMPLETE.
Lehmann-Scheffé Theorem
■ Let $X$ have pdf $f(x; \theta)$ and let $T = T(X_1, X_2, \dots, X_n)$ be a SUFFICIENT STATISTIC for $\theta$. Suppose $T$ is also complete. Then every estimable function $g(\theta)$ possesses an unbiased estimator with UNIFORMLY MINIMUM VARIANCE (UMVUE).
■ That is, if $T$ is sufficient and complete for $\theta$ and $E[h(T)] = g(\theta)$,
■ then $h(T)$ is UMVUE.
■ Example: $X \sim \text{Bernoulli}(p)$:
$T = \sum_{i=1}^{n} X_i$ is sufficient for $p$,
and $T$ is also complete.
■ $E\!\left[\frac{T}{n}\right] = p$ for all $p$; that is, $T/n$ is unbiased.
■ It follows that $h(T) = \bar X$ is the UMVUE of $p$.
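A Monte Carlo sketch in R (p = 0.3 and n = 25 are assumed values) comparing the UMVUE $\bar X$ with another unbiased estimator, the single observation $X_1$:

set.seed(3)
p <- 0.3; n <- 25; reps <- 1e5

xbar  <- replicate(reps, mean(rbinom(n, 1, p)))  # UMVUE, based on T = sum(x)
first <- replicate(reps, rbinom(1, 1, p))        # X1 alone: unbiased but wasteful

c(mean(xbar), mean(first))  # both close to p = 0.3: both unbiased
c(var(xbar), var(first))    # the UMVUE's variance is far smaller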


Maximum Likelihood
Let $X$ have pdf $f(x, \theta)$ and let $X_1, X_2, \dots, X_n$ be a random sample.
Define the likelihood function:
$$L(x_1, \dots, x_n; \theta) = \prod_{i=1}^{n} f(x_i, \theta).$$
Choose $\hat\theta$ such that $L(x; \theta)$ is maximum for fixed $(x_1, \dots, x_n)$.

Also, for more than one parameter:
$$L(x_1, \dots, x_n; \theta_1, \theta_2, \dots, \theta_k) = \prod_{i=1}^{n} f(x_i; \theta_1, \theta_2, \dots, \theta_k).$$
Choose $(\hat\theta_1, \hat\theta_2, \dots, \hat\theta_k)$ such that $L(x; \theta)$ is maximum.

Maximizing the likelihood is the same as maximizing the log-likelihood.
■ Log-likelihood: $\ell(\theta) = \log L(x, \theta)$.
■ Example: $(X_1, X_2, \dots, X_n)$ is a sample from $N(\mu, \sigma^2)$.
■ Likelihood:
$$L(\mu, \sigma^2) = \prod_{i=1}^{n} f(x_i; \mu, \sigma^2) = \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} \exp\!\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2\right).$$
■ Log-likelihood:
$$\ell(\mu, \sigma^2) = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2,$$
$$\frac{\partial \ell}{\partial \mu} = 0 \implies \sum_{i=1}^{n}(x_i - \mu) = 0 \implies \hat\mu = \bar x,$$
$$\frac{\partial \ell}{\partial \sigma^2} = 0 \implies -\frac{n}{2\sigma^2} + \frac{\sum_{i=1}^{n}(x_i - \mu)^2}{2\sigma^4} = 0 \implies \hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar x)^2.$$
■ Therefore the maximum likelihood estimates are
$$\hat\mu = \bar x, \qquad \hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar x)^2.$$
However, $\hat\mu = \bar x$ is UNBIASED, while $\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar x)^2$ is BIASED.

Conclusion

■ MLEs are not necessarily unbiased.
■ Exercise: find the MLE of $\theta$ if $f(x, \theta) = \frac{1}{\theta}$, for $0 < x < \theta$.
■ Show that $\hat\theta = \max(X_1, X_2, \dots, X_n)$ is the MLE and that it is biased, whereas
$$\tilde\theta = \frac{n + 1}{n}\max(X_1, X_2, \dots, X_n)$$
is unbiased.
Properties of MLE

■ 1. They are simple consistent and mean squared error consistent.
■ 2. They are functions of sufficient statistics.
■ 3. They have the property of invariance.
■ 4. They are asymptotically efficient and Best Asymptotically Normal (BAN) estimates.
Invariance Property
■ If $X$ has pdf $f(x, \theta)$ and $(X_1, X_2, \dots, X_n)$ is a sample,
■ and $\hat\theta$ is the MLE of $\theta$,
■ then if $\psi(\theta)$ is a single-valued function of $\theta$, $\psi(\hat\theta)$ is the MLE of $\psi(\theta)$.
■ Example: $(X_1, X_2, \dots, X_n)$ is Poisson($\theta$); find the MLE of $\Pr(X = 0)$.
■ Solution: $f(x, \theta) = \frac{e^{-\theta}\theta^{x}}{x!}$, $x = 0, 1, 2, 3, \dots$
■ The MLE is $\hat\theta = \bar x$.
■ But $\Pr(X = 0) = e^{-\theta}$, which is a single-valued function of $\theta$.
■ Hence, by the invariance property, the MLE of $\Pr(X = 0)$ is $e^{-\bar x}$ (see the R sketch below).
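A quick R check of the invariance result (the rate theta = 2 and the sample size 200 are assumptions):

set.seed(9)
x <- rpois(200, lambda = 2)

exp(-mean(x))  # MLE of P(X = 0) by invariance: exp(-theta_hat)
mean(x == 0)   # empirical proportion of zeros, for comparison
dpois(0, 2)    # true value exp(-2), approximately 0.135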
■ To show that MLEs are functions of sufficient statistics:
$$L(x, \theta) = \prod_{i=1}^{n} f(x_i, \theta) = h(T(x), \theta)\, k(x_1, x_2, \dots, x_n), \quad T \text{ sufficient.}$$
■ Log-likelihood:
$$\ell(\theta) = \ln h(T(x), \theta) + \ln k(x_1, x_2, \dots, x_n).$$
Solving:
$$\frac{\partial \ell}{\partial \theta} = 0 \implies \frac{\partial \ln h(T(x), \theta)}{\partial \theta} = 0.$$
Solution:
$$\hat\theta = f(T).$$
Hence MLEs are necessarily functions of sufficient statistics.

■ Asymptotic normality
■ The MLEs $(\hat\theta_1, \hat\theta_2, \dots, \hat\theta_k)$ for the parameters of the density
■ $f(x; \theta_1, \theta_2, \dots, \theta_k)$, from a sample of size $n$, are for large samples approximately distributed as
■ multivariate normal with means $(\theta_1, \theta_2, \dots, \theta_k)$
■ and matrix $R = (r_{ij})$ in the quadratic form, where
$$r_{ij} = -E\!\left[\frac{\partial^2 \log f(x; \theta_1, \theta_2, \dots, \theta_k)}{\partial\theta_i\,\partial\theta_j}\right].$$
■ The variance-covariance matrix is $\frac{1}{n}R^{-1}$.
■ Note:
$$f(\hat\theta_1, \hat\theta_2, \dots, \hat\theta_k) = \frac{1}{(2\pi)^{k/2}\left|\frac{1}{n}R^{-1}\right|^{1/2}} \exp\!\left\{-\frac{n}{2}(\hat\theta - \theta)'R(\hat\theta - \theta)\right\}.$$


LECTURE 4
INTRODUCTION TO

ESTIMABILITY

COURSE LECTURER: DR. JULIUS KOECH

REVIEW: Ordinary Least Squares Approach
■ Find $\beta$ that minimizes $\|y - X\beta\|^2 = \epsilon'\epsilon$.
■ The ordinary least squares estimates are $\hat\beta = (X'X)^{-1}X'y$.
■ Under assumptions, the ordinary least squares estimates are maximum likelihood:
$$\epsilon \sim N(0, \sigma^2 I) \implies y \sim N(X\beta, \sigma^2 I) \quad \text{and} \quad \hat\beta \sim N(\beta, \sigma^2(X'X)^{-1}),$$
$$\hat\sigma^2 = \frac{\hat\epsilon'\hat\epsilon}{N - p}.$$
■ To test a hypothesis, we construct "test statistics".
■ The null hypothesis $H_0$
■ is typically what we want to disprove (no effect).
■ ⇒ The alternative hypothesis $H_A$ expresses the outcome of interest.

Contrasts
■ We are usually not interested in the whole $\beta$ vector.
■ A contrast can select a specific effect of interest:
■ ⇒ a contrast $c$ is a vector of length $p$;
⇒ $c'\beta$ is a linear combination of the regression coefficients $\beta$.
Let $c = [1\ 0\ 0\ 0\ 0\ \dots]$; then
$$c'\beta = 1\beta_1 + 0\beta_2 + 0\beta_3 + 0\beta_4 + 0\beta_5 + \dots = \beta_1.$$
Let $c = [0\ -1\ 1\ 0\ 0\ \dots]$; then
$$c'\beta = 0\beta_1 - 1\beta_2 + 1\beta_3 + 0\beta_4 + 0\beta_5 + \dots = \beta_3 - \beta_2.$$


■ Under the assumptions: $c'\hat\beta \sim N(c'\beta,\ \sigma^2\, c'(X'X)^{-1}c)$.
■ Definition: a linear function of the parameters, $\lambda'\alpha$, is said to be an estimable function of $\alpha$ if it is identically equal to some linear function of the expected value of the vector of observations, $y$. That is, $\lambda'\alpha$ is estimable if:
$$\lambda'\alpha = t'\,E[y]$$
for some vector $t$.
■ Consider the general linear model $y_{n \times 1} = X_{n \times p}\,\beta_{p \times 1} + \epsilon_{n \times 1}$.
■ We say that $\beta$ is identifiable if knowing the mean $E(y)$ gives us $\beta$.
■ Definition: the parameterization $\beta$ is identifiable if for any $\beta_1$ and $\beta_2$, $f(\beta_1) = f(\beta_2)$ implies $\beta_1 = \beta_2$.
■ Estimability ≠ identifiability!!
■ Identifiability: an attribute of the model.
■ Estimability: an attribute of the data.
■ Identifiability addresses the question of whether (and with what degree of certainty) it is possible to uniquely estimate parameters for a given model and data set.
■ Within the framework of identifiability, one considers whether the parameters can be estimated uniquely in the best-case scenario of noise-free, perfectly measured data.
■ While this is unrealistic, it is a prerequisite to successful estimation from real-world data.
■ The term identifiability is also referred to as estimability (McLean and McAuley, 2012).
■ The existence of sampling errors or variability may hinder the ability to uniquely estimate the parameters of interest in a linear model.
■ Therefore, whether a linear model can be viewed as estimable and identifiable may depend on the nature of the design matrix: whether or not it is of FULL RANK (see the R sketch below).
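A small R sketch of a rank-deficient design (the collinearity x2 = 2*x1 is constructed deliberately):

set.seed(2)
x1 <- rnorm(10)
x2 <- 2 * x1             # perfectly collinear with x1: X is not of full rank
y  <- 1 + x1 + rnorm(10)

X <- cbind(1, x1, x2)
qr(X)$rank               # rank 2, not 3: beta is not identifiable
coef(lm(y ~ x1 + x2))    # lm() reports NA for the aliased coefficient x2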

Thank You


LECTURE 5
ESTIMATION SPACE AND ERROR SPACE

COURSE LECTURER: DR. JULIUS KOECH
DEPARTMENT: MATHEMATICS AND COMPUTER SCIENCE

THE LINEAR MODEL AND THE ERROR
■ Throughout the lectures up to this stage, we have seen that the only equation we ever really need is the linear model, given below:
$$\text{outcome}_i = \text{model}_i + \epsilon_i$$

The Normal Linear Model
■ The normal linear model may be written in matrix form as $y = X\beta + \epsilon$, where $\epsilon \sim N(0, \sigma^2 I)$.
■ The fundamental idea is that the outcome can be predicted from a model plus some error associated with that prediction, $\epsilon_i$.

ASSUMPTIONS OF THE LINEAR MODEL IN ESTIMATION
■ As mentioned, the normal linear model incorporates strong assumptions about the data. Some of these assumptions include:
■ Linearity
■ Constant variance (homoscedasticity)
■ Normality
■ Independence

The Error Component: Problems
■ Errors may be heterogeneous (unequal variance).
■ Errors may be correlated.
■ Errors may not be normally distributed.
■ The last defect is less serious than the first two because, even if the errors are not normal, the $\hat\beta$s will tend to normality due to the power of the central limit theorem. With larger datasets, non-normality of the data is not much of a problem.

Known Methods that Minimize the Error of Prediction
■ Mixed-effects models
■ Random-effects models
■ Co-integration approach: a good model for minimizing errors
■ Bayesian estimation, though one needs to specify priors for the linear parameters and make an assumption about the given distribution
■ And many other models not specified here

Errors in the Linear Model
■ Familiarize yourself with these types of linear models, as the first two will be covered in the upcoming classes.
■ Read also on how to employ either SAS or Wolfram Mathematica statistical software in analyzing data using these models.

QUESTIONS?

Thank You!!

