9.1

comparing two populations

Part 1 11/03/2020 Part 2 11/04/2020
Homework due date: 11/12/2020
Consider the data on crude oil production in California between 1981 and 2020. Compare X =
production in February and Y = production in September for the years 1981 to 1995, following our
example below. Find 99% confidence intervals. First, assume that the population variances are known
and that X and Y are independent. Second, determine whether the data supports the idea that X and Y
may be independent. Lastly, assume the population variances are unknown and that X and Y are
dependent.

Part 1
random variables
X = random variable giving class median for class/frequency data
X[1], X[2], X[3], … X[N] N experiments
EX = random variable giving the expectation values for each experiment
EX[1], EX[2], EX[3], … EX[N] expectation values for each experiment
VarX = random variable giving the variance for each experiment
VarX[1], VarX[2], VarX[3], … VarX[N] variances for each experiment
P(X) = probability distribution for X based on a fit to data or empirical frequencies
N[k] = total number of data values for experiment k = 1, 2, … N
f[k] = frequency for value X[k] of experiment k
P(X[k]) = f[k]/N[k] empirical probabilities
EX[k] = sum[X[k]] P(X[k]) X[k]    (sum over the X values from experiment k)
EX[k]^2 = sum[X[k]] P(X[k]) X[k]^2
VarX[k] = EX[k]^2 – (EX[k])^2
SDevX[k] = sqrt(VarX[k])
P(EX) = probability distribution for EX
EEX = sum[EX[k]] P(EX[k]) EX[k]
EEX^2 = sum[EX[k]] P(EX[k]) EX[k]^2
VarEX = EEX^2 – (EEX)^2
experimental values
EX[k] = (1/N[k]) sum[X[k]] X[k]
VarX[k] = (1/(N[k] – 1)) sum[X[k]] (X[k] – EX[k])^2
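If you want to check these formulas off the calculator, here is a minimal Python sketch of the empirical expectation and variance for one experiment (the class medians and frequencies are hypothetical, purely for illustration):

```python
# A sketch of the empirical formulas above for one experiment k:
# P(X) = f/N, EX = sum P(X) X, EX^2 = sum P(X) X^2, VarX = EX^2 - (EX)^2.

medians = [2.35, 2.80, 3.30, 3.80]   # hypothetical class medians X
freqs = [1, 1, 3, 4]                 # hypothetical frequencies f

N = sum(freqs)                                     # total number of data values
P = [f / N for f in freqs]                         # empirical probabilities f/N
EX = sum(p * x for p, x in zip(P, medians))        # expectation value
EX2 = sum(p * x ** 2 for p, x in zip(P, medians))  # expectation of X^2
VarX = EX2 - EX ** 2                               # population variance
SDevX = VarX ** 0.5                                # standard deviation
print(EX, VarX, SDevX)
```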


central limit theorem
The central limit theorem states that, when N is sufficiently large,
log(P(EX)/w(EX)) = – (1/2) log(VarEX) – (1/2) log(2π) – (1/2) log(e) Z^2
where w(Y) is assumed to be a small width for each class median value Y, and the z-value is given as
Z = (EX – EEX)/sqrt(VarEX)
Furthermore,
EEX = EX
VarEX = VarX/N
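The two facts EEX = EX and VarEX = VarX/N are easy to check by simulation; here is a quick sketch (our own toy setup with uniform(0, 1) data, for which EX = 0.5 and VarX = 1/12):

```python
# Simulate many experiments, each of N data values, and look at the
# distribution of the sample means EX[k].
import random

random.seed(1)
N, experiments = 30, 20000
means = [sum(random.random() for _ in range(N)) / N for _ in range(experiments)]

EEX = sum(means) / experiments
VarEX = sum((m - EEX) ** 2 for m in means) / experiments
print(EEX)     # close to EX = 0.5
print(VarEX)   # close to VarX/N = (1/12)/30, about 0.00278
```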

two independent random variables
X,Y = two independent random variables

comparison of experimental results
EX – EY
apply central limit theorem
E(EX – EY) = EX – EY
(when the number of experiments for each of X and Y is large enough)
Var(EX – EY) = VarEX + VarEY = VarX/NX + VarY/NY
NX = total experiments for X
NY = total experiments for Y
(consequence of symmetry)
P(EX – EY) = normal distribution

This indicates in outline how to find a confidence interval. We give an example problem on the next
page.

Part 2
data table (same as for Sec 8.5 on chi square distribution)

our study
An obvious problem with the above data is that probably nothing about it can be regarded as associated
with a couple of independent random variables, except maybe trivialities. Everything with this data is
totally intertwined with a big complex actual picture of evolution in California. Bear this in mind: Our
example assumes the two random variables X and Y are independent, and this, at least in this case, is
but the wildest fantasy. We also make assumptions to get the above picture applied to the comparison
of the two random variables. In this instance, this is an obviously rather meaningless fabrication.

the two random variables for comparison
Let’s make a 95% confidence interval for EX – EY, where X = data value for January and Y = data value for
July. Here is the relevant data table

data table (see above data table for more information)
year    January value, X (LIST L1)    July value, Y (LIST L2)
1981    3.0297    3.1383
1982    3.1352    3.1758
1983    3.1302    3.1880
1984    3.1894    3.2193
1985    3.2492    3.3154
1986    3.4258    3.1790
1987    3.0621    3.0897
1988    3.0547    3.0208
1989    2.8431    2.8143
1990    2.7322    2.6962

step 1 in constructing the confidence interval: find the point estimate (consult your text)
E(X – Y) = (1/N) sum[X – Y] (X – Y)
= (1/10) ((3.0297 – 3.1383) + (3.1352 – 3.1758) + …)
= (1/N) sum(L1 – L2)
= 0.00148
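The same point estimate can be reproduced in a few lines of Python (a sketch; the lists simply transcribe L1 and L2 from the table above):

```python
# Point estimate E(X - Y) = (1/N) sum(L1 - L2).
L1 = [3.0297, 3.1352, 3.1302, 3.1894, 3.2492,
      3.4258, 3.0621, 3.0547, 2.8431, 2.7322]   # January, X
L2 = [3.1383, 3.1758, 3.1880, 3.2193, 3.3154,
      3.1790, 3.0897, 3.0208, 2.8143, 2.6962]   # July, Y

N = len(L1)
point_estimate = sum(x - y for x, y in zip(L1, L2)) / N
print(point_estimate)   # 0.00148
```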

comment
We can see that because this is only about 1/2000th of the typical value of X or Y, these two variables
are HIGHLY correlated and unlikely to be independent.

computation of the population standard deviations
We assume the given data values constitute both our “sample” and our “population”. Therefore, we
KNOW the exact population standard deviation in this case, which is what the text’s setup for the
confidence interval requires.

expectation and variance
X
EX = (1/N) sum(L1) = 3.08516
EX^2 = (1/N) sum(L1^2) = 9.55282
VarX = EX^2 – (EX)^2 = 0.03460
SDevX = sqrt(VarX) = 0.1860 population standard deviation

SDevX = sqrt((1/(N – 1)) sum[X] (X – EX)^2)
= sqrt((1/9) sum((L1 – 3.08516)^2))
= 0.1961 sample standard deviation

comment
We are seeing a slight difference between population and sample standard deviation due to the N – 1 in
place of N in the sample standard deviation.

expectation and variance
Y
EY = (1/N) sum(L2) = 3.08368
EY^2 = (1/N) sum(L2^2) = 9.54210
VarY = EY^2 – (EY)^2 = 0.03302
SDevY = sqrt(VarY) = 0.1817 population standard deviation

SDevY = sqrt((1/(N – 1)) sum[Y] (Y – EY)^2)
= sqrt((1/9) sum((L2 – 3.08368)^2))
= 0.1915 sample standard deviation

the standard deviation for E(X – Y)
(via the central limit theorem)
SDevE(X – Y) = sqrt(VarX/N + VarY/N)
= sqrt((0.03460 + 0.03302)/10) = 0.0822
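A sketch reproducing the expectations and both standard deviations, plus SDevE(X – Y) under the independence assumption, using Python's statistics module (our choice of tool, with the same L1 and L2 lists as above):

```python
# pvariance/pstdev divide by N (population); variance/stdev divide by N - 1 (sample).
import statistics

L1 = [3.0297, 3.1352, 3.1302, 3.1894, 3.2492,
      3.4258, 3.0621, 3.0547, 2.8431, 2.7322]
L2 = [3.1383, 3.1758, 3.1880, 3.2193, 3.3154,
      3.1790, 3.0897, 3.0208, 2.8143, 2.6962]
N = len(L1)

VarX, VarY = statistics.pvariance(L1), statistics.pvariance(L2)
print(statistics.pstdev(L1))          # 0.1860  population SDevX
print(statistics.stdev(L1))           # 0.1961  sample SDevX
print(statistics.pstdev(L2))          # 0.1817  population SDevY
print((VarX / N + VarY / N) ** 0.5)   # 0.0822  SDevE(X - Y), assuming independence
```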

step 2 in constructing the confidence interval: find the margin of error (consult your text)
margin of error = SDevE(X – Y) x (critical z-value for the ideal normal distribution)
critical z-value for a 95% confidence interval
alpha = probability of a type I error
= probability that a valid EX[k] for some experiment k will fall outside the confidence interval
= 1 – (decimal percent for confidence interval)
= 1 – 0.95 = 0.05
alpha/2 = 0.025
upper and lower critical z-values

Probability of exceeding this z-value must be alpha/2 (and likewise the
probability of getting less than the lower critical z-value must also be alpha/2)

Upper critical z-value = – (Lower critical z-value)
algorithm for finding the lower critical z-value using the TI calculator

(we use the Solver, but you obviously also can use invNorm: see YouTube videos for
instruction.)
MATH > up arrow > 0:Solver > up arrow > 2nd VARS (DISTR) > 2:normalcdf(

> -1000, X) – 0.025 > ENTER > -2 (guess) > ALPHA ENTER
estimated lower critical z-value = – 1.95996
upper critical z-value = 1.95996
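If you are not working on a TI, the inverse normal CDF plays the role of invNorm; a minimal sketch with scipy (assumed to be installed):

```python
# The lower critical z-value solves normalcdf(-inf, z) = alpha/2; ppf is the
# inverse CDF, so no Solver iteration is needed.
from scipy.stats import norm

alpha = 0.05
z_lower = norm.ppf(alpha / 2)   # -1.95996
z_upper = -z_lower              #  1.95996, by symmetry
print(z_lower, z_upper)
```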

margin of error = (0.0822)(1.95996) = 0.1612

step 3 in constructing the confidence interval: lower and upper limits of the interval

lower limit of the interval = E(X – Y) – (margin of error) = 0.00148 – 0.1612 = -0.1597

upper limit of the interval = E(X – Y) + (margin of error) = 0.00148 + 0.1612 = 0.1627
confidence interval
-0.1597 < E(X – Y) < 0.1627

(the true value, apart from our estimate 0.00148, lies in (-0.1597, 0.1627) at a 95% confidence
level.)

comment
Since we have the actual population in this case, we can compute the actual value of the
variance of E(X – Y) and check to see if our independence assumption was reasonable
E(X – Y) = 0.00148
E(X – Y)^2 = (1/10) sum((L1 – L2)^2) = 0.00870
Var(X – Y) = E(X – Y)^2 – (E(X – Y))^2 = 0.008698
SDev(X – Y) = sqrt(Var(X – Y)) = 0.09327
The SDev for X – Y based on independence is
sqrt(VarX + VarY) = sqrt(0.03460 + 0.03302) = 0.2600
This is nearly three times the actual value. Therefore, the assumption of independence in this
case is not valid. But we cautioned you about this. This is merely a toy problem example.
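This check can be scripted: compute CovXY = EXY – (EX)(EY) and compare the actual SDev(X – Y) with the value that independence would predict (a sketch, same L1 and L2 lists as before):

```python
import statistics

L1 = [3.0297, 3.1352, 3.1302, 3.1894, 3.2492,
      3.4258, 3.0621, 3.0547, 2.8431, 2.7322]
L2 = [3.1383, 3.1758, 3.1880, 3.2193, 3.3154,
      3.1790, 3.0897, 3.0208, 2.8143, 2.6962]
N = len(L1)

EX, EY = statistics.fmean(L1), statistics.fmean(L2)
EXY = sum(x * y for x, y in zip(L1, L2)) / N
CovXY = EXY - EX * EY                     # about 0.0295, far from 0
VarX, VarY = statistics.pvariance(L1), statistics.pvariance(L2)

print((VarX + VarY - 2 * CovXY) ** 0.5)   # 0.0933  actual SDev(X - Y)
print((VarX + VarY) ** 0.5)               # 0.2600  SDev(X - Y) if independent
```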

We continue discussion of comparison of X and Y next time.

Part 3
the loss of quality in experimental data
For various reasons, some of which we have discussed, one usually cannot just construct a
confidence interval from experimental data and truly expect it to be very meaningful
without investing significantly in skill, resources and understanding to obtain ideal or
near-ideal conditions. The key problem is the philosophical background of the central limit
theorem: It is usually a very unrealistic idealization of actual experimental situations. You
typically need a great amount of extremely high quality data, and usually it must be accumulated
over a fairly short time. These conditions force us to acknowledge inherent weaknesses in
experimental data and surveys.

t distribution
We have noted that one way to accommodate errors, poor experimental design or equipment,
lack of adequate skill or understanding, just plain stupidity and ignorance, presence of
extraneous signals, outliers of unknown significance, etc. is to use a random variable with
potentially broader tails such as the hypergeometric distribution or the t-distribution. This
approach attempts to capture a fair signal strength in the presence of a non-ideal data spread
and significant noise.

steps with the t-distribution (X,Y independent random variables)
These are identical to those of the normal distribution given above.
step 1 in constructing the confidence interval: find the point estimate, namely E(X – Y)
step 2 in constructing the confidence interval: find the margin of error

margin of error = SDevE(X – Y) x (critical z-value for the ideal t distribution)
Z = (X – E(X – Y))/SDevE(X – Y)
VarE(X – Y) = VarX/NX + VarY/NY

SDevE(X – Y) = sqrt(VarE(X – Y))
critical z-value for alpha/2 = 0.025 (95% confidence interval)

algorithm for finding the lower critical z-value using the TI calculator
(we use the Solver, but you obviously also can use invNorm: see YouTube videos for
instruction.)
MATH > up arrow > 0:Solver > up arrow > 2nd VARS (DISTR) > 6:tcdf(

> -1000, X, df) – 0.025 > ENTER > -2 (guess) > ALPHA ENTER
(use df = smaller of NX – 1 and NY – 1 to represent weakness of experimental data. The
text gives a subtler formula as well that you might want to use:

df = ((NX – 1)^-1 WX + (NY – 1)^-1 WY)^-1
WX = (VarEX/VarE(X – Y))^2
WY = (VarEY/VarE(X – Y))^2
VarE(X – Y) = VarEX + VarEY

(Independence of X and Y carries over to EX and EY)
VarEX = VarX/NX (central limit theorem)
VarEY = VarY/NY

(This represents an averaging of NX – 1 and NY – 1, since WX + WY ≤ 1. We have the
following result:

(smaller of NX – 1 and NY – 1) ≤ ((NX – 1)^-1 WX + (NY – 1)^-1 WY)^-1
First, this inequality tells us that the smaller of NX – 1 and NY – 1 sets the lower limit on loss of
information in an experiment. Second, the emphasis on the particular value NX – 1 or NY – 1 results
from the weighting factors WX and WY. It is clear that WX relates to VarEX and WY to VarEY because,
as we have mentioned, much of the loss of information from data relates to how broad the
distribution is, and this is measured by SDevX, i.e. the square root of the variance. Why are we
squaring VarEX, or VarEY, and why do we divide by VarE(X – Y)? Partly, this is because VarEX/VarE(X – Y)
is less than 1, i.e. it does not magnify the loss of information. Nevertheless, the whole philosophy here,
while it has a mathematical context in some depth, is worth stating more explicitly: It IS philosophy, and
a rather arbitrary choice in the end. But we will discuss it no further here.
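The subtler formula is the Welch–Satterthwaite approximation; here is a sketch of it as a Python function (the function name and the example call, which plugs in the corrected population variances from Part 2, are ours):

```python
# df = ((NX - 1)^-1 WX + (NY - 1)^-1 WY)^-1 with WX = (VarEX/VarE(X - Y))^2.
def welch_df(var_x, n_x, var_y, n_y):
    var_ex = var_x / n_x            # VarEX = VarX/NX (central limit theorem)
    var_ey = var_y / n_y            # VarEY = VarY/NY
    var_diff = var_ex + var_ey      # VarE(X - Y), independence assumed
    wx = (var_ex / var_diff) ** 2   # weighting factor WX
    wy = (var_ey / var_diff) ** 2   # weighting factor WY
    return 1.0 / (wx / (n_x - 1) + wy / (n_y - 1))

print(welch_df(0.03460, 10, 0.03302, 10))   # about 18; never below min(NX - 1, NY - 1) = 9
```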

step 3 in constructing the confidence interval: lower and upper limits of the interval

lower limit of the interval = E(X – Y) – (margin of error)
upper limit of the interval = E(X – Y) + (margin of error)

pooled variance
There is a bit of a different perspective too, on the experimental comparison of X and Y, when EX may
differ from EY but in terms of sample or experimental variance, we suspect ideally that the variances,
VarEX and VarEY are equal. This is not an uncommon phenomenon.

Here is the usual experimental variance
VarX = (1/(NX – 1)) sum[X] (X – EX)^2
VarY = (1/(NY – 1)) sum[Y] (Y – EY)^2
The pooled variance, in the case in which we expect population variances to be equal (an ideal but
sometimes reasonable simplification), is
VarPooled = (1/((NX – 1) + (NY – 1))) (sum[X] (X – EX)^2 + sum[Y] (Y – EY)^2)
= ((NX – 1) VarX + (NY – 1) VarY)/(NX + NY – 2)
We use this pooled variance as a common approximation for VarX and VarY, since we believe the
population variances of X and Y to be equal. The central limit theorem yields
VarE(X – Y) = VarX/NX + VarY/NY = VarPooled x (1/NX + 1/NY)
Thus, in construction of a confidence interval with pooled variance we would use
margin of error = SDevE(X – Y) x (critical z-value for the ideal t distribution)
= sqrt(VarPooled) x sqrt(1/NX + 1/NY) x (critical z-value)
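A sketch of the pooled-variance margin of error as a Python function (the function name is ours; the example call uses the sample variances from our Part 2 data and the df = NX + NY – 2 = 18 critical value, about 2.1009):

```python
def pooled_margin(svar_x, n_x, svar_y, n_y, t_crit):
    # VarPooled = ((NX - 1) VarX + (NY - 1) VarY)/(NX + NY - 2)
    var_pooled = ((n_x - 1) * svar_x + (n_y - 1) * svar_y) / (n_x + n_y - 2)
    # SDevE(X - Y) = sqrt(VarPooled x (1/NX + 1/NY))
    sdev_diff = (var_pooled * (1 / n_x + 1 / n_y)) ** 0.5
    return sdev_diff * t_crit

# sample variances 0.1961^2 and 0.1915^2 from the example data
print(pooled_margin(0.03845, 10, 0.03669, 10, 2.1009))   # about 0.182
```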

the case when X and Y are not independent
It is of course necessary to address this case, as this is the situation that we are often in. The main points about
independence are
E(X – Y) = EX – EY
Var(X – Y) = VarX + VarY
i.e. we make a separation where we can consider P(X) and P(Y) separately. On the other hand, with the
case of loss of independence we have
E(X – Y) = EX – EY
E(X – Y)^2 = EX^2 – 2EXY + EY^2
Var(X – Y) = E(X – Y)^2 – (EX – EY)^2 = VarX + VarY – 2 CovXY
CovXY = EXY – (EX)(EY) covariance of X and Y = 0 when X and Y are independent
The definition of EXY is
EXY = sum[XY] P(XY) XY
We know the values of X and Y, and therefore the product XY. But, how does the probability
distribution P(XY) relate to P(X) and P(Y)? When we lack independence, there will be whole aspects of
correlations to the distribution of the random product XY that we will not be able to detect from just
P(X) and P(Y) alone. In general,
P(XY = c) = sum over the data values x of X of P(X = x and Y = c/x) = sum over x of P(Y = c/x | X = x) P(X = x), i.e. a sum of terms of the form P(Y|X) P(X)

i.e. this probability distribution depends on conditional probability. When X and Y are independent,
P(Y|X) = P(Y), and P(XY) = P(X)P(Y) reduces to knowledge of just P(X) and P(Y) separately. When we lose
independence, P(Y|X) depends on the particular values of X as well as Y.

modeling X and Y dependence for X – Y comparison
We assume that P(X – Y) exists, the probability distribution for the difference X – Y. Then, the whole
apparatus of confidence intervals we discussed above works. The sole difference is that we must use as
variance,
Var(X – Y) = VarX + VarY – 2 CovXY
In addition, the experimental variance is
Var(X – Y) = (1/(N – 1)) sum[X – Y] (X – Y – E(X – Y))^2
and this is what we use in the determination of the margin of error for confidence intervals.

So, next time, we will finish up this section and give you the homework for it.

Part 4
comment about dependence
The bottom line about independence is
P(XY) = P(X)P(Y)
This is equivalent to P(X|Y) = P(X) or P(Y|X) = P(Y). In the case of dependence we can only say
P(XY) = P(X|Y)P(Y) or P(XY) = P(Y|X)P(X)
When we previously studied independence in classical probability, its significance was unclear. Now, in
the context of fits to data distributions, we see quite clearly that it does not relate to expectation (peak) values:
E(X – Y) = EX – EY or E(X + Y) = EX + EY
But for variance
Var(X – Y) = VarX + VarY – 2 CovXY or Var(X + Y) = VarX + VarY + 2 CovXY
and CovXY is a result of dependence
CovXY = EXY – (EX)(EY) (=0 in the case that X and Y are independent)
EXY = sum[XY] P(XY) XY
Thus dependence is entirely associated with the variance, and unrelated to expectation values.

three dependent random variables W, X and Y
For this, we have P(XY), P(YW) and P(WX) to consider. For each of these we have
P(XY) = P(X|Y)P(Y)
for example, as above. Now, we can consider
P(WXY = c) = P(W = c/XY | XY) P(XY)
= P(W|XY)P(X|Y)P(Y), where WXY = c
Also,
E(W + X + Y) = EW + EX + EY
Var(W + X + Y) = VarW + VarX + VarY + 2 CovWX + 2 CovWY + 2 CovXY
The general relation is
P(WXY) = P(W|XY)P(X|Y)P(Y)
If W, X and Y are independent, then P(W|XY) = P(W), P(X|Y) = P(X) and
P(WXY) = P(X)P(Y)P(W)
Thus the framework we developed can be used for comparing any number of random variables.

confidence interval, with X and Y assumed to be dependent
We have covered a lot of ground, but for our example, we just want to repeat the analysis and this time
assume dependence of X and Y, which is MUCH better supported by the actual data.

data table (see above data table for more information)
year    January value, X (LIST L1)    July value, Y (LIST L2)
1981    3.0297    3.1383
1982    3.1352    3.1758
1983    3.1302    3.1880
1984    3.1894    3.2193
1985    3.2492    3.3154
1986    3.4258    3.1790
1987    3.0621    3.0897
1988    3.0547    3.0208
1989    2.8431    2.8143
1990    2.7322    2.6962

95% confidence interval using the t distribution (recall N = 10)
step 1 in constructing the confidence interval: find the point estimate
As remarked, the point estimate is not affected by independence vs dependence:
E(X – Y) = (1/N) sum[X – Y] (X – Y)
= (1/10) ((3.0297 – 3.1383) + (3.1352 – 3.1758) + …)
= (1/N) sum(L1 – L2)
= 0.00148
This is the same as before.

step 2 in constructing the confidence interval: find the margin of error
margin of error = SDevE(X – Y) x (critical z-value for the ideal t distribution)
Z = (X – E(X – Y))/SDevE(X – Y)
VarE(X – Y) = Var(X – Y)/N

Var(X – Y) = (1/(N – 1)) sum[X – Y] (X – Y – E(X – Y))^2
= (1/9) sum((L1 – L2 – 0.00148)^2)
= 0.009665

SDevE(X – Y) = sqrt(VarE(X – Y)) = sqrt(0.009665/10) = 0.03109
critical z-value for alpha/2 = 0.025 (95% confidence interval)

algorithm for finding the lower critical z-value using the TI calculator
(we use the Solver, but you obviously also can use invNorm: see YouTube videos for
instruction.)
MATH > up arrow > 0:Solver > up arrow > 2nd VARS (DISTR) > 6:tcdf(

> -1000, X, df) – 0.025 > ENTER > -2 (guess) > ALPHA ENTER
(use df = smaller of NX – 1 and NY – 1 to represent weakness of experimental data.
df = 9
lower critical z-value = -2.2622
upper critical z-value = 2.2622

margin of error = 0.03109 x 2.2622 = 0.0703

step 3 in constructing the confidence interval: lower and upper limits of the interval
lower limit of the interval = E(X – Y) – (margin of error)
= 0.00148 – 0.0703 = -0.0688
upper limit of the interval = E(X – Y) + (margin of error)
= 0.00148 + 0.0703 = 0.0718
95% confidence interval
-0.0688 < E(X – Y) < 0.0718
This is a more appropriate result, as we saw before that the data just does not support the idea that X and Y are independent.
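The whole dependent-case interval takes only a few lines; a sketch, with scipy used just for the critical value (our choice of tools, not the text's):

```python
# Paired 95% t confidence interval for E(X - Y): work directly with the
# list of differences, since X and Y are treated as dependent.
import statistics
from scipy.stats import t

L1 = [3.0297, 3.1352, 3.1302, 3.1894, 3.2492,
      3.4258, 3.0621, 3.0547, 2.8431, 2.7322]
L2 = [3.1383, 3.1758, 3.1880, 3.2193, 3.3154,
      3.1790, 3.0897, 3.0208, 2.8143, 2.6962]

diffs = [x - y for x, y in zip(L1, L2)]
N = len(diffs)
point = statistics.fmean(diffs)                  # 0.00148
se = (statistics.variance(diffs) / N) ** 0.5     # sqrt(Var(X - Y)/N), about 0.0311
t_crit = t.ppf(0.975, df=N - 1)                  # 2.2622 for df = 9
print(point - t_crit * se, point + t_crit * se)  # about (-0.0688, 0.0718)
```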

10.1

hypothesis testing

Part 1 11/12/2020 Part 2 11/13/2020 Part 3 11/15/2020
Homework assignment due date: 11/22/2020
Carry out a two-tailed hypothesis test for Elog absX = 3.5 at the 95% confidence level using the ideal
normal distribution, with the below data from June to September of each year, as we show in the example
below. Then, carry out a right-tailed hypothesis test at the 97.5% level, using the ideal t-distribution, as
in the later example below. Just follow my steps, or, if you wish, make up your own testing procedure,
but preferably after studying my examples.

Part 1
what is a hypothesis?
When we come to study a system via experiment or some sort of sampling procedure, we do not come
to the investigation in total absence of cognitive frameworks. The phenomena we investigate will
connect up with aspects of our knowledge of the world, or with theory, as in mathematics. In other
words, we almost always come to a data collection process with preconceptions, prejudices, hunches,
biases, and knowledge from prior work. Thus, we usually come to an experiment or sampling process
with certain expectations of outcomes or tentative guesses about what we expect. These preliminary
expectations are referred to as hypotheses. An experiment is sometimes designed partly around
exploring the extent to which empirical data can support these preliminary expectations. When we try
to assess the extent of support, this is called hypothesis testing. Thus, in a sense, hypothesis testing is a
pretty common human pursuit, and something akin to mere curiosity about the world.

However, when we refer to hypothesis testing in statistics we mean explicitly the statistical framework
first enunciated by Neyman and Pearson in 1933. They had a very formal mathematical approach, in
which we see the experiment in the perspective of the following table:
accuracy of hypothesis      evidence supports      evidence does not
(description of world)      hypothesis             support hypothesis

accurate                    correspondence         Type I error
not accurate                Type II error          correspondence

accurate and evidence supports the hypothesis
This is THE ideal. We propose a hypothesis or guess, and the experiment seems to support it; ideally,
the hypothesis is also of some broad interest with respect to your goals. This is, however, unusual for actual
research. When a researcher poses a serious hypothesis of interest it USUALLY means it is VERY difficult
research. When a researcher poses a serious hypothesis of interest it USUALLY means it is VERY difficult
to get at by current methods and resources, and may require a very large expenditure of resources, and
access to skill and knowledge well beyond the researcher’s reach. This is particularly the case even in
well-funded academic research. Needed expertise, knowledge, skill and equipment is often
ASTRONOMICALLY expensive. Thus, at least in academic research, when there are extremely
serious hypotheses that MUST be investigated from the viewpoint of funding agencies and other
sources of support, it is usual to select a location, often very near a well-regarded research university, to
set up a special lab or institute focused on the research. This allows channeling of government
resources, and gives a center for talent to be directed. As we work more online, in these times, it is
somewhat less important, but even so, it is unusual to see faculty not associated with special labs and
institutes directing research toward some serious hypothesis.

not accurate and evidence does not support the hypothesis
This would seem not to be desirable: Formulate a hypothesis that you think might not be accurate and
then accumulate data indicating it has little support. That seems inefficient superficially. Strategically,
this approach is preferred by many top academic researchers. At first, to outsiders in the academic
research world, this type of “trivial pursuit” can seem bizarrely pointless. Let’s try to flesh this out a
little. First, when a research area is initially opened up, it usually entails new perspectives, resources,
skills and knowledge outside most researchers’ specializations. This means the “hot” research is usually
“out of reach”. Still, an academic researcher in a related field with expertise knows that if he or she can

“connect” it will probably lead to a publishable result via the bandwagon effect. Furthermore, to
maintain their status and power position an academic researcher usually needs a fair number of
relatively “high quality” publications: This state of affairs has always been known as “publish or perish”.

This is the merit of high specialization, expertise, and skill level at particular research specialty niches.
The academic researcher develops an extreme familiarity with one or perhaps several very, very narrow
fields of study: Becomes deeply enmeshed and capable at its intricacies, its basic and subtle methods,
and the classic approaches to research. This is done by the PhD apprenticeship, various internships,
post-doctoral positions, various research positions, special workshop training, collaborative teams, and
in teaching and basic scholarship. In this way, an academic researcher REALLY knows certain highly
intricate and complex details and subtleties concerning certain research areas. When he or she achieves
such levels of expertise, it can be connected to the "hot" work going on by formulating a suitable,
generally accepted hypothesis accessible to the researcher's expertise; if the researcher can then adduce
evidence against it, the result is almost certainly one publication, possibly more.

There are a number of good points in favor of this strategy. First, most supported claims are associated
usually with some hidden anthropocentric biases or prejudices. Therefore, it is about 80% probable that
a typical accepted scientific claim has some incorrect aspects related to this bias. A second, perhaps
critical point about selecting a well-favored hypothesis accessible to your specialty is that you are VERY
likely to note something interesting or inconsistent, or see little ways to innovate. In other words, in the
“publish or perish” world of academia, these two characteristics alone are telling you that this kind of
“straw man” hypothesis test is usually your meal ticket in academia. Of course, to maintain quality, you
must be proficient at the very high level of standards in academia because it is hugely rule heavy, and
writing a successful paper for publication is much like playing a good game of chess. An academic must
REALLY know how to “dot the i’s” and “cross the t’s”. But training in a specialization usually gives
proficiency at this: It is all self-motivated, and you have to basically want to play the game, because the
academic publish or perish game is usually pretty sharp and brutal.

Inevitably, this “gaming the system” turns off a lot of outsiders to academic research. We can ask the
question as to whether or not it is in the long run worthwhile and beneficial to the researcher’s culture
or to some broader segments of humanity. Chasing grants and writing papers and guiding students and
other researchers in your project involves a lot of time and effort. Is this a good pursuit, in terms of
benefits to humanity? Certainly it maintains the researcher’s status and power. In addition, it helps to
sustain the university or institute with grants and status and prestige. A research/academic setting of a
college or university is a lot like an athletic team, and its performance must be sustained. Therefore, in these
immediate ways all the research is valuable. The fact that only perhaps 5 to 10 other specialists may
read the papers means the results, even if important, are not going to emerge in public immediately,
and likely never will.

Training students and other researchers is of course a valuable exercise. So the apprenticeships of the
PhD have social value. Plus, the fact that the researchers usually have to chase grant money means at
least that their research must seem of some value to experts in related fields.

accurate and evidence does not support hypothesis: Type I errors
We have to see this in the context of typical academic research: Selection of a hypothesis that seems to
be commonly accepted. We do this to get publications by finding contrary evidence, and in a way that is
a pretty good “typical and useful task” to question common assumptions. Unfortunately, in this context

it is AWFUL to get a Type I error, because your paper is likely to be published and may have the massive
effect of suppressing research in what was, until your negative results, an accepted approach.
Therefore, we need to adjust our statistics studies to minimize the chance of a Type I error: In academic
research this type can be tragic and devastating.

In constructing confidence intervals, we set the “probability” alpha of a Type I error in our examples at
5%. Why don’t we just set it at 0 if we need to minimize it??? That would mean the accepted
hypothesis we are testing is not accessible to any negative support from our experiments or surveys. In
other words: We don’t even have motivation to carry out the study and it gives no opportunity for a
publication.

Thus, we cannot afford to set alpha at the ideal value of 0, but it is not something that can have ANY
natural value. So setting the value is purely an abstract subjective decision based on what you think will
convince reviewers of your paper to allow its publication. The value NEEDS to be close to 0 but greater
than zero, and at a practical level to get successful negative results plus convince reviewers that your
paper is acceptable for publication when they do the “automated proof checking” of the “academic
chess game”. So it is basically a matter of what the “research public” charges for getting a publication:
Usually people accept a 5% or so value for alpha. Of course, it depends on prestige level. If you want to
get a Nobel prize in physics, you better set alpha around 0.000001% or so, i.e. 1/millionth of a percent
and the hypothesis being tested better be SUPER important.

setting alpha very close to zero
When we set alpha very close to zero, we start to make it more likely that we will make a Type II error:
Finding support for an inaccurate hypothesis. In context of the typical academic philosophy and
strategy, this is not too likely. We usually choose a hypothesis that is commonly accepted, and do not
try to publish unless lack of support is obtained from our data.

Nevertheless, it is quite common for academics to try to avoid this error. The probability of making this
error is called beta. The power of the hypothesis test is 1 – beta. Why do academics do this???

This too, is strategic. If we get only one negative data item not supporting the hypothesis,
THAT is not too convincing. The idea about beta is to get researchers to find at least several data items
not in support of the hypothesis. Now, the unusual data values are established via fitting the data to a
distribution, like the normal distribution or the t distribution. When we get several clustered data items,
we can set up an alternative hypothesis involving a data fit to these values. The idea is to think of the
clustered extreme values not supporting the hypothesis as heavily weighted and the values close to our
original hypothesis center we were testing as in the tail of the distribution, and given a low weight.

By doing this, we are giving people an indicator that the true value, the true center, lies somewhere in
the vicinity of the center of the alternative hypothesis probability distribution fit and the clustered
extreme values weighing against the original hypothesis.

This usually gives added weight to a research paper and makes it more likely to get published, as it
predicts what an accurate hypothesis test might look like or at least points in a good direction.

historical context
Hypothesis testing was introduced by Neyman and Pearson in a 1933 paper. As we indicate above, it
has proven extremely useful for academic researchers to get publications from their research. Many

people have objected to this strategy of turning research into a type of “chess game” and researchers
“gaming the system” for status and power.

This completes our discussion for now.

Part 2
confidence intervals vs hypothesis testing
Because both confidence intervals and hypothesis testing focus on alpha, the probability of a Type I
error, associated with how the critical z-values are selected, the “method” of statistics is essentially
identical. It is the “philosophy” that is different: “confidence intervals” are fine for many statistical
studies of a traditional nature, but it is usually questioning “a commonly held belief” and “finding
statistical support challenging it” that is the key to cheap research and getting your research published
and at a satisfactory level of quality for the academic fields utilizing statistics. Therefore, the emphasis
between confidence intervals and hypothesis testing is totally that of pragmatism: Hypothesis testing is pretty
much the bread and butter of academics involved in the publish or perish environment of academic
research. So methodologically we do not expect much difference between a hypothesis test and
constructing a confidence interval.

p-value
It is common to state the p-value for a hypothesis test. Using the probability distribution we have fit the
data to for the hypothesis test, we compute, for the random variable X associated with our study, the
probability p of a deviation from the center value at least as extreme as the one observed, relative to
the hypothesis we are testing. Strategically, this is often regarded by academics trying to get their research published as more
relevant than a critical z-value.

The strategic reason for academics’ preference for the p-value is that it avoids claims about alpha. In
fact it is not even necessary to calculate a critical z-value if you use p-values, but to enhance publication
chances it is often wise, as alpha is “traditional” and “recognized”.

The merit of the p-value is strategic: Especially in a world where we now often have access to enormous
amounts of data. With such a very large population to draw from, it is usually, by using software to
search the data, very feasible to find a large collection (but small compared to the total data population)
that is significantly not in support of the commonly accepted hypothesis. The p-value for the resulting
probability distribution fit will typically be MUCH less than 5%. This usually gives sharp researchers with
access to a lot of data enormous advantages in the publish or perish academic game.

The problem with this type of gamesmanship, of course, is that it greatly enhances chances of losing
faith in the conventional hypothesis which might actually be accurate. As we pointed out before, a
single researcher playing this kind of game will not kill convention, but if there is a “bandwagon” effect,
a beneficial and well-supported result might lose research directed in its favor, with a concomitant loss
of benefit.

hypothesis test example
Let’s test the claim that crude oil production is slowing in California at an exponential rate. This means
that log(abs((production at time t + T) – (production at time t))) is roughly constant for a particular time
T (relatively short) and t = variable time. Here, abs is the absolute value: For example, abs(5) = 5 and
abs(-5) = 5.

data (on California crude oil production: see Sec 8.5 for details)

strategy
We are just going to go through a very simple example of a test today. We will use
X[t] = (production in January + February + March + April + May from year t)
– (the like total from the prior year)
We take a sum to get a larger value in the subtraction because subtraction costs us in loss of
information, and we amplify the signal and smooth over noise when we add several months together.

table (showing lists on TI calculator and instructions to create the lists)
year    production, Y (L1)    X = Y – Y(prior year) (L2)    log abs(X) (L3)
TI commands: seq(L1(X) – L1(X – 1),X,2,11) sto L2 ; log(abs(L2)) sto L3
1981    149010
1982    153536    4526      3.6557
1983    153809    273       2.4362
1984    157064    3255      3.5126
1985    161217    4153      3.6184
1986    163369    2152      3.3328
1987    150882    -12487    4.0965
1988    149511    -1371     3.1370
1989    138432    -11079    4.0445
1990    133365    -5067     3.7048
1991    132686    -679      2.8319

comment
It does appear that the logarithm list is roughly constant, between about 2.4 and 4.1. This is somewhat
unexpected because production peaks and then decreases, so the differences X change from positive to
negative values.

hypothesis test step 1: find the point estimate Elog(abs(X)) using empirical frequencies
Elog(abs(X)) = sum[X] log(abs(X)) (f/N)
N = number of data items = 10 here
f = frequency for the particular value X (all frequencies = 1 here)
Elog(abs(X)) = (1/10) sum(L3) here (using the TI calculator)
= 3.4370
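The TI list operations translate directly to Python; a sketch building L2 and L3 and reproducing the point estimate:

```python
# L2 = year-over-year differences of L1; L3 = log10 of their absolute values.
import math

L1 = [149010, 153536, 153809, 157064, 161217, 163369,
      150882, 149511, 138432, 133365, 132686]   # production, 1981-1991
L2 = [b - a for a, b in zip(L1, L1[1:])]        # seq(L1(X) - L1(X - 1), ...)
L3 = [math.log10(abs(x)) for x in L2]           # log(abs(L2))

N = len(L3)                                     # 10
print(sum(L3) / N)                              # 3.4370
```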

hypothesis test step 2: specify alpha and type of test (two tailed; left tailed; right tailed)
Since the null hypothesis is that the population of X targets a constant, we want a two-tailed test, i.e.
with alpha/2 as the probability for each tail. For the hypothesis test, alpha is called the level of
significance. For our test we take alpha = 0.05 or 5%, and alpha/2 = 0.025. This corresponds to the
1 – alpha = 95% confidence interval, but here we are making an actual test.

hypothesis test step 3: specify the ideal data fit probability distribution for the population
We assume the population is fit by the ideal normal distribution. Since the distribution is symmetric
about Z = 0, we only need the lower critical value using Solver on the TI
MATH > Solver > normalcdf(-1000, X) – 0.025 > ALPHA ENTER
We set the guess at -1.5. Result
critical z-value = 1.9600 (take the negative of the calculator result)

hypothesis test step 4: estimate the standard deviation of the list of expectation values (assuming the
population is normally distributed)
We use the sample empirical probability estimate of the variance for the list of X:
Varlog abs(X) = (1/(N – 1)) sum[X] (log abs(X) – Elog abs(X))^2 f
(with f = 1 here).
Varlog abs(X) = (1/9) sum((L3 – 3.4370)^2) (in this case)
= 0.2701
Estimate of the variance of the list of Elog abs(X)
VarElog abs(X) = Varlog abs(X)/N = 0.02701
SDevElog abs(X) = sqrt(VarElog abs(X)) = 0.1644
(Using this for the hypothesis assumes the validity of the central limit theorem.)

hypothesis test step 5: specify a test value for the test for EX
We have a point estimate: Elog abs(X) = 3.4370, and the values range from about 2.4 to 4.1. Let’s select
as the test value Elog absXTest = 3.5, which seems to be suggested by the data median.

hypothesis test step 6: specify the lowest acceptable value for Elog absX and the highest
(we have both a lowest and a highest for a two-tailed test)
lowest acceptable value = Elog absXTest – SDevElog absX x (critical z-value)
= 3.5 – 0.1644 x 1.9600 = 3.1778
highest acceptable value = 3.5 + 0.1644 x 1.9600 = 3.8222

conclusion of the hypothesis test
As the sample Elog absX = 3.4370 lies between the highest and lowest acceptable values, we conclude
that the data here does not satisfy the conditions for finding negative support for the hypothesis. As
this is the goal of successful hypothesis testing, this hypothesis test fails at the 95% confidence level, i.e.
we cannot reject the claim of the hypothesis.
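Here is the whole two-tailed test gathered into one Python sketch (reproducing the corrected numbers above; the variable names are ours):

```python
# Two-tailed test of Elog absXTest = 3.5 at alpha = 0.05, normal fit assumed.
import statistics

L3 = [3.6557, 2.4362, 3.5126, 3.6184, 3.3328,
      4.0965, 3.1370, 4.0445, 3.7048, 2.8319]
N = len(L3)

point = statistics.fmean(L3)                    # 3.4370
sdev_e = (statistics.variance(L3) / N) ** 0.5   # SDevElog abs(X), about 0.1644
z_crit = 1.95996                                # from Solver/invNorm above
lo = 3.5 - z_crit * sdev_e                      # about 3.178
hi = 3.5 + z_crit * sdev_e                      # about 3.822
print(lo < point < hi)                          # True: cannot reject the claim
```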

Next time we will give a more complete treatment of the hypothesis test for the data here.

Part 3
comment about hypothesis testing
We have given a very simple worked example of a hypothesis test. We selected a hypothesis that seems
well accepted, about declining resources, so it totally looked like we would be able to succeed at
adducing evidence against it, even with the very limited data we used from California. Still, the simple
example shows the methodology. Of course, the sole interest, from a professional perspective of an
academic researcher, is to get a publication to maintain or further his or her career. So we failed in that
regard since getting a paper accepted in an academic journal usually implies, minimally, that you found
evidence against commonly accepted notions.

In a way, it is not tragic that our example failed. Usually, in actual academic research, the first proposals
do not work out for one reason or another. The researchers either revise their project or drop it as
unproductive. A lot often depends on the funding agency, and that means the review committees
supporting your work. Usually this is somewhat preferable to leaving it up to the individual researcher.
A committee of experts can sometimes see a bit more clearly than an individual, especially in practical
terms and terms of utilization of scarce resources and expertise and time. The downside to allowing a
committee to decide is that as specialists, their assessment of your work is like chess experts evaluating
the play of a master: They can pretty much tell you whether you are playing a good game or not. The
trouble with that kind of specialized “game” orientation is that maybe you aren’t playing chess at all,
and they are missing the importance of what you are working on.

what next?
It is important to present some other perspectives with respect to hypothesis testing. Our first test led
to failure plus a bit of a weird unexpected result. That used the normal distribution. As we know from
prior studies, the normal distribution is usually too narrow (like the binomial distribution) to fit data
well, and this is related to the usual suspects.

Obviously then, as we have done in the past, we want to go on to the t-distribution, a broader, more
flexible distribution, with df = degrees of freedom as an additional parameter. This leads to some
tolerance for picking up on other types of processes than those related to the peak of the distribution,
which may be introducing outliers. In addition, the data collecting/analysis process has inherent
tendencies toward anthropocentric bias, and the investigators are almost certain to be making some
errors when large amounts of data are collected. This entails an overall noise throughout the
distribution, more apparent in the tails, so a broader tailed distribution can be more effective in
separating signal from noise.

progressing beyond the beginning stage
The t-distribution is obviously an excellent next step, as just introducing one extra parameter, namely df,
is not likely to lead to overfitting (i.e. fitting to noise over signal) which is always a danger when
introducing additional parameters. Plus as mentioned above it has obvious advantages over the normal
distribution, including the fact that the ideal t-distribution is practically as easy to find critical z-values
for as the ideal normal distribution.

Obviously matters start to get much more subtle as we move beyond the simple normal
distribution/central limit theorem perspective of Neyman. This was one of the main hold-ups in
Neyman’s rather idealized take on statistics. There were numerous harsh practical details that needed
to be worked out over the years, costing a lot in terms of effort and resources, and accompanied with
substantial criticism from elite researchers who could see the limitations of Neyman’s introductory

work, and were troubled by the paths being pursued to turn it into a practical tool. They did not want to
rock the boat.

The method works today pretty well for the established academic researchers.

t-distribution
Because the method is so similar to the construction of confidence intervals, we are going to present
just this one more example before leaving the topic of hypothesis testing. We will change to a
right-tailed test, as opposed to the two-tailed test of our first normal distribution example, to give a
little insight into a potentially useful trick. Overall, if you want to make it in academia
in the publish or perish world right now, you really need to be savvy about hypothesis testing, but for us
at the beginning level, most of whom do not intend to become elite researchers in academia, delving
deeper into this rather specialized and tricky “academic chess game” is not of much interest.

table (showing lists on TI calculator and instructions to create the lists)
year    production, Y (L1)    X = Y – Y(prior year) (L2)    log abs(X) (L3)
TI commands: seq(L1(X) – L1(X – 1),X,2,11) sto L2 ; log(abs(L2)) sto L3
1981    149010
1982    153536    4526      3.6557
1983    153809    273       2.4362
1984    157064    3255      3.5126
1985    161217    4153      3.6184
1986    163369    2152      3.3328
1987    150882    -12487    4.0965
1988    149511    -1371     3.1370
1989    138432    -11079    4.0445
1990    133365    -5067     3.7048
1991    132686    -679      2.8319

We need to group the data in a class/frequency table. We will start at 2.15 to 2.55:

class/frequency table
class          frequency, f
2.15 – 2.55    1
2.55 – 3.05    1
3.05 – 3.55    3
3.55 – 4.05    4
4.05 – 4.55    1
(the frequencies total N = 10)

This table reveals clearly that the data is not near a symmetric distribution. This does not mean that the
hypothesis test will be unreliable. We are simply using just a few data values. Still, the skew to lower
values is pronounced. Therefore, a right-tailed hypothesis test should work okay in this case.

hypothesis test step 1: find the point estimate Elog(abs(X)) using empirical frequencies
(This step does not depend on the specific hypothesis, so we have the same result as before.)
Elog(abs(X)) = sum[X] log(abs(X)) (f/N)
N = number of data items = 10 here
f = frequency for the particular value X (all frequencies = 1 here)

Elog(abs(X)) = (1/10) sum(L3) here (using the TI calculator)
= 3.4370

hypothesis test step 2: specify alpha and type of test (two tailed; left tailed; right tailed)
It looks like a right-tailed test is likely to succeed in this case, as the actual distribution has a long left
tail. So we will test, with 97.5% confidence (alpha = 0.025), whether the value of the population Elog absX
is larger than 3.5 (before, we tested for “equality”).

hypothesis test step 3: specify the ideal data fit probability distribution for the population
We assume the population is fit by the ideal t-distribution, setting df = N – 1 = 9 (here). Since the
distribution is symmetric about Z = 0, we only need the lower critical value using Solver on the TI
MATH > Solver > tcdf(-1000, X, df) – 0.025 > ALPHA ENTER
We set the guess at -1.9600 (the old ideal normal distribution lower critical value). Result
critical z-value = 2.2622 (take the negative of the calculator result: Since we are making a right
tailed test we need the upper critical value.)

hypothesis test step 4: estimate the standard deviation of the list of expectation values (assuming the
population is normally distributed)
(This step remains unchanged from the prior case of the ideal normal distribution, as the standard
deviation, like the expectation value, does not depend on the hypothesis test, nor does it depend on the
critical z-value.)
We use the sample empirical probability estimate of the variance for the list of X:
Varlog abs(X) = (1/(N – 1)) sum[X] (log abs(X) – Elog abs(X))^2 f
(with f = 1 here).
Varlog abs(X) = (1/9) sum((L3 – 3.4370)^2) (in this case)
= 0.2701
Estimate of the variance of the list of Elog abs(X)
VarElog abs(X) = Varlog abs(X)/N = 0.02701
SDevElog abs(X) = sqrt(VarElog abs(X)) = 0.1644
(Using this for the hypothesis assumes the validity of the central limit theorem.)

hypothesis test step 5: specify a test value for the test for Elog absX
We continue to use Elog absXTest = 3.5, as before. This has nothing to do with selecting a right tailed,
left tailed or two tailed test.

hypothesis test step 6: specify the lowest acceptable value for Elog absX
(we have a lowest acceptable value for a right-tailed test. Since we have examined the data with a
class/frequency table, we have pretty much “loaded the dice” in our favor in this case.)
lowest acceptable value = Elog absXTest + SDevElog absX x (critical z-value)
= 3.5 + 0.1644 x 2.2622 = 3.8719

conclusion of the hypothesis test
As the sample Elog absX = 3.4370 lies below the lowest acceptable value, we conclude that the data
here does satisfy the conditions for finding negative support for the hypothesis. As this is the goal of

successful hypothesis testing, this hypothesis test succeeds at the 97.5% confidence level, i.e. we have
evidence in favor of rejecting the claim of the hypothesis.
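And the right-tailed version as one sketch, with scipy supplying the t critical value in place of the Solver:

```python
# Right-tailed test: the sample mean must exceed 3.5 + t_crit x SDev to support
# the claim that the population Elog abs(X) is larger than 3.5.
import statistics
from scipy.stats import t

L3 = [3.6557, 2.4362, 3.5126, 3.6184, 3.3328,
      4.0965, 3.1370, 4.0445, 3.7048, 2.8319]
N = len(L3)

point = statistics.fmean(L3)                    # 3.4370
sdev_e = (statistics.variance(L3) / N) ** 0.5   # about 0.1644
t_crit = t.ppf(0.975, df=N - 1)                 # 2.2622, alpha = 0.025
threshold = 3.5 + t_crit * sdev_e               # about 3.872
print(point < threshold)                        # True: reject the claim EX > 3.5
```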

This concludes our discussion of Chapter 10.

https://en.wikipedia.org/wiki/Statistical_hypothesis_testing#cite_note-45
