
Steps for Testing Hypotheses
- A research question is phrased; a suggested answer (the
research hypothesis) is postulated. At this point, all we
have is a theory about something and a view of what we
think the result is. Identify appropriate quantities
describing the population -- these are called parameters.
- State the proposed theory in terms of the parameter(s) of
interest. The resulting statement is known as the research
hypothesis (more generally as the alternative
hypothesis). The anti-theory is a
statement of no change, or no difference, or no effect.
This statement is known as the null hypothesis (it
should be known as the "hypothesis of no difference").
These two hypotheses are usually written side by side,
with the null preceding the alternative. The symbol for the
null hypothesis is H0. The symbol for the
alternative hypothesis is HA (or often H1).
The colon (:) written after either symbol is read as
"states." (The terminology is unfortunate. "Null" usually
indicates worthless, and "alternative" then
sounds like second best to worthless. Actually, "null"
comes from "nullify": the hope is to nullify this
hypothesis. "Alternative" is also a poor choice
of term; "research hypothesis" is much better.)
- A study is designed. A random sample (or even a number of
them) must be obtained. Often this is by far the hardest
part of conducting such a study. Determinations also must
be made regarding sample sizes. A general rule says
"Take the largest sample sizes you can." But
sometimes sampling costs money, and the answer to "What
is the best sample size?" is not that easy to arrive
at.
- Data is collected and recorded. Your first step in any
real world situation is the same as it would be for any
data you might collect: Investigate. In particular,
obtain appropriate plots, looking for irregularities,
surprises and outliers. Clean up the data if outliers are
the result of poor data entry or if data is determined to
have come from an undesired source.
- The data is summarized and a test statistic (test
stat, or TS) is computed. The form of the TS
depends on H0. When the research claim is
about a single mean, or a comparison between two means,
the TS often is a T-statistic (or Z-statistic);
it measures how many standard errors the estimate lies
from the value specified by H0. The test stat measures
compatibility between H0 and the data.
- The observed significance level (OSL; often
called a P-value, short for probability
value) is computed. The P-value is the
probability, computed assuming that H0 is true,
that the test statistic will take a value at least as
extreme as that actually observed. In other words, it is
the probability of getting an outcome as extreme as or more
extreme than the outcome actually observed. "Extreme" means
far from what we would expect if H0 were true.
The direction or directions that count as far from
what we would expect are determined by HA.
COMMIT THE FOLLOWING TO MEMORY: Small P-values
indicate strong evidence against H0.
- What do we conclude? We can compare the P-value
with a fixed value that we regard as decisive. The
decisive value is called the significance level and is
denoted by the Greek letter alpha (α).
If the P-value is as small as or smaller than α, we say
that the data are statistically significant at level α.
This is equivalent to saying we reject H0 and
conclude HA. If the data are not statistically
significant, we do not reject H0.
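
The last three steps (test statistic, P-value, decision) can be sketched
in a few lines of Python. This is only an illustration under stated
assumptions, not a prescribed method: it uses a Z-statistic for a single
mean with a known population standard deviation, and the function name
z_test and all of the numbers (hypothesized mean 100, standard deviation
15, sample size 36, sample mean 106, α = 0.05) are made up for the example.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alternative="two-sided"):
    """Return (z, p_value) for H0: mu = mu0.

    Assumes the population standard deviation sigma is known, so the
    test statistic follows a standard normal distribution under H0.
    """
    standard_error = sigma / sqrt(n)
    # Number of standard errors the estimate lies from the H0 value.
    z = (sample_mean - mu0) / standard_error

    std_normal = NormalDist()  # mean 0, standard deviation 1
    # HA determines which direction(s) count as "extreme".
    if alternative == "two-sided":      # HA: mu != mu0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    elif alternative == "greater":      # HA: mu > mu0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":         # HA: mu < mu0
        p_value = std_normal.cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z, p_value


# Hypothetical study: H0: mu = 100 versus HA: mu != 100.
z, p_value = z_test(sample_mean=106.0, mu0=100.0, sigma=15.0, n=36)

alpha = 0.05                      # significance level chosen in advance
reject_h0 = p_value <= alpha      # "as small as or smaller than alpha"

print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Here the sample mean lies 2.4 standard errors above the hypothesized
value, and the two-sided P-value falls below 0.05, so the data are
statistically significant at that level. When the standard deviation has
to be estimated from the sample, a T-statistic would normally replace the
Z-statistic.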
Textbooks often spend little time on the important issues
described in steps 1, 3 and 4 (phrasing the research
question, designing the study, and inspecting the collected
data).
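
As a small illustration of the data inspection in step 4, the sketch
below flags values that deserve a second look before any test is run.
The 1.5 x IQR fence is an assumed, commonly used screening convention,
not something the steps above prescribe; the helper name flag_outliers
and the measurements are invented for the example.

```python
from statistics import median, quantiles


def flag_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles.

    Assumption: the k = 1.5 fence is only a screening convention.
    Flagged values are candidates to check, not values to delete.
    """
    q1, _, q3 = quantiles(values, n=4)   # the three quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical measurements; 51.0 looks like a misplaced decimal point.
measurements = [4.9, 5.1, 5.0, 5.2, 4.8, 5.3, 5.0, 51.0]
suspects = flag_outliers(measurements)

print("median:", median(measurements))
print("values to check against the original records:", suspects)
```

A flagged value should be corrected or dropped only if it turns out to be
a data entry error or to come from an undesired source; a plot of the data
remains the better first look.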