Denote by m the mean for some population. In applications a random sample is selected from that population. Our objectives are among the following:

To examine the data and note properties of the observed distribution, taking particular care to identify and justify outlying values.

To estimate the population mean with a confidence interval.

To test hypotheses regarding the population mean.

If the data set is rather small, it is sufficient to simply note the values. Of course, with small data sets, we won't get a feel for much of the inherent variability of the data, so identifying true outliers is tough.

A histogram is the preferred graphical device. With smallish data sets, histograms will tend to be sparse and uninformative. Be careful in drawing specific conclusions from histograms based on small data sets: the particular features are quite likely to be due to the particular sample that's observed as well as a computer's choice of classes (bins).
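The dependence of a small-sample histogram on the choice of classes can be seen with a short sketch. The sample below is hypothetical (made up for illustration), and `bin_counts` is an illustrative helper, not a function from any particular package; it simply counts observations in equal-width classes.

```python
def bin_counts(data, n_bins):
    """Count observations in n_bins equal-width classes spanning the data range."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        # The largest value is placed in the last class rather than a new one.
        k = min(int((x - lo) / width), n_bins - 1)
        counts[k] += 1
    return counts


# Hypothetical sample of 12 observations (not real data).
sample = [4.1, 4.8, 5.0, 5.3, 5.9, 6.2, 6.4, 7.0, 7.7, 8.1, 9.5, 12.0]

# Draw a crude text histogram for several choices of the number of classes.
for n_bins in (3, 4, 6):
    counts = bin_counts(sample, n_bins)
    print(n_bins, "classes:", counts)
    for k, c in enumerate(counts):
        print("   class", k + 1, "#" * c)
```

Comparing the printed shapes for the same 12 values gives a sense of how much of the apparent shape comes from the binning rather than from the population.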

Small sample confidence intervals and *P*-values are
only valid when the sampled *population* is approximately
normally distributed. (The key word there is *population*;
note that whatever information we have is generally nothing more
than a very small subset of observations drawn from the
population, i.e. the *sample*.) The best way to assess
whether or not this assumption is reasonable is through the *normal
probability plot*. Use
software to obtain such a plot. If your assessment of the plot is
that a straight line is not the simplest fit, then the data
are *inconsistent* with what would be expected when
randomly sampling from a normal distribution.
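For readers curious about what the software does, the coordinates of a normal probability plot can be sketched with Python's standard library alone. The plotting positions (i - 0.5)/n used here are one common convention, and software packages may use slightly different ones. The sample is hypothetical, and the correlation at the end is an added numerical companion to the plot, not a replacement for looking at it.

```python
from statistics import NormalDist, mean


def normal_plot_points(data):
    """Pair each ordered observation with its expected standard normal score."""
    n = len(data)
    ordered = sorted(data)
    # Plotting positions (i - 0.5)/n: an assumed convention, others exist.
    scores = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(scores, ordered))


def correlation(points):
    """Pearson correlation between normal scores and ordered observations."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical sample (not real data).
sample = [4.1, 4.8, 5.0, 5.3, 5.9, 6.2, 6.4, 7.0, 7.7, 8.1, 9.5, 12.0]

points = normal_plot_points(sample)
for score, value in points:
    print(f"{score:7.3f}  {value:6.2f}")
print("correlation of plot points:", round(correlation(points), 3))
```

Plotting the second column against the first gives the normal probability plot; points that bend away from a straight line suggest the data are inconsistent with normal sampling.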

In all cases, the results are "valid" only if the sample is, or is equivalent to, a random sample.

For large samples, it is not necessary that the population be
normally distributed in order to correctly interpret CIs and *P*-values.
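A small simulation can make the large-sample claim concrete. The sketch below assumes a skewed (exponential) population with mean 1, chosen only for illustration, and estimates how often a nominal 95% *t* interval captures that mean. The critical values 2.776 (4 degrees of freedom) and 1.984 (99 degrees of freedom) are taken from a standard *t* table; the number of repetitions and the seed are arbitrary.

```python
import random
from math import sqrt
from statistics import mean, stdev


def coverage(n, t_crit, reps=2000, seed=1):
    """Fraction of simulated t intervals that contain the true mean."""
    rng = random.Random(seed)
    true_mean = 1.0  # mean of an exponential population with rate 1
    hits = 0
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        half_width = t_crit * stdev(sample) / sqrt(n)
        if abs(mean(sample) - true_mean) <= half_width:
            hits += 1
    return hits / reps


print("n = 5:  ", coverage(5, 2.776))    # tabled t for 4 df
print("n = 100:", coverage(100, 1.984))  # tabled t for 99 df
```

One would expect the estimated coverage for n = 100 to land close to the nominal 0.95 even though the population is far from normal, while the n = 5 figure need not.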

| Parameter | Estimate of parameter | Standard Error (SE) of estimate |
|---|---|---|
| m | $\bar{x}$ | $s/\sqrt{n}$ |

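A (1 - α) CI for m is given by

$$\bar{x} \pm t \, \frac{s}{\sqrt{n}},$$

where $\bar{x}$ is the sample mean, $s$ is the sample standard deviation, and $n$ is the sample size. *t* refers to the appropriate tabled value (the one leaving area α/2 in the upper tail) for a *t*-distribution
with (*n* - 1) degrees of freedom.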
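The interval calculation can be sketched as follows. The sample is hypothetical, `t_interval` is an illustrative helper name, and the critical value 2.262 is the tabled value for a 95% interval with 9 degrees of freedom; for other sample sizes or confidence levels a different tabled value would be needed.

```python
from math import sqrt
from statistics import mean, stdev


def t_interval(data, t_crit):
    """Return (lower, upper) limits: sample mean +/- t_crit * s / sqrt(n)."""
    n = len(data)
    xbar = mean(data)
    se = stdev(data) / sqrt(n)  # standard error of the sample mean
    return xbar - t_crit * se, xbar + t_crit * se


# Hypothetical sample of n = 10 observations (not real data).
sample = [4.1, 4.8, 5.0, 5.3, 5.9, 6.2, 6.4, 7.0, 7.7, 8.1]

T_CRIT = 2.262  # tabled t value, 9 degrees of freedom, 95% confidence

low, high = t_interval(sample, T_CRIT)
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```

Passing the critical value in as an argument mirrors the table lookup described above; statistical software would normally compute it directly.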

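To test $H_0: m = m_0$, the test statistic is

$$t = \frac{\bar{x} - m_0}{s/\sqrt{n}},$$

with $\bar{x}$, $s$, and $n$ the sample mean, sample standard deviation, and sample size.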

Find the *P*-value in the usual fashion, using the *t*-distribution
with (*n* - 1) degrees of freedom.
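As a rough illustration of the test, the sketch below computes the *t* statistic and a two-sided *P*-value using only the standard library. The *P*-value is obtained by numerically integrating the *t* density with Simpson's rule, which is a stand-in for what statistical software or a table would provide. The sample and the hypothesized value `m0` are hypothetical, and the function names are illustrative.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev


def t_statistic(data, m0):
    """(sample mean - m0) divided by the standard error s / sqrt(n)."""
    n = len(data)
    return (mean(data) - m0) / (stdev(data) / sqrt(n))


def t_density(x, df):
    """Density of the t distribution (gamma() overflows for very large df)."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t, df, steps=2000):
    """P(|T| >= |t|), using Simpson's rule for the area between 0 and |t|."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps  # steps must be even for Simpson's rule
    total = t_density(0, df) + t_density(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_half = total * h / 3  # P(0 <= T <= |t|)
    return 1 - 2 * central_half


# Hypothetical sample and hypothesized mean (not real data).
sample = [4.1, 4.8, 5.0, 5.3, 5.9, 6.2, 6.4, 7.0, 7.7, 8.1]
m0 = 5.0

t = t_statistic(sample, m0)
p = two_sided_p_value(t, df=len(sample) - 1)
print(f"t = {t:.3f}, two-sided P-value = {p:.4f}")
```

For a one-sided alternative, half of the two-sided value applies when the statistic falls on the side named by the alternative hypothesis.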

There is no difference between the small-sample and large-sample procedures. Textbooks often make it
seem as though there is, but that is not true. The only
distinction is the one addressed above: neither the CI nor the *P*-value
is correct if the sample is small and has been drawn from a
distribution that is not approximately normal.