IQs have an approximately normal distribution, N(mu, 15). For Western society, the mean is typically 100.
Ten years ago there was an accident at a nuclear power plant in Chernobyl, Ukraine (then part of the Soviet Union). In addition to the direct effects of radiation, there may be indirect effects. One worrisome effect would be the underdevelopment of children. One way to quantify this development is with IQ scores.
An SRS of 20 children born in Chernobyl is drawn, and each child is administered an IQ test. Use the data to find a 99% CI for mu, the mean IQ of all Chernobyl children.
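Our "recipe" for a confidence interval is
xbar +/- z* (sigma / sqrt(n))
where xbar is the sample mean, sigma is the known population standard deviation, n is the sample size, and z* is the standard normal critical value for the chosen confidence level.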
We're interested in a 99% confidence interval; using our standard normal tables we find z* = 2.576. The standard deviation for the (population) distribution of IQ scores is known to be 15. Our sample size is n = 20; the square root of 20 is 4.4721. Here's the data:
80 84 99 98 86 80 72 105 59 102
102 88 89 109 58 117 94 110 92 97
The sample mean is 91.05. The margin of error is (2.576*15)/4.4721 = 8.64, and our 99% confidence interval for the (population) mean IQ score of all Chernobyl children is:
91.05 +/- 8.64 or (82.41, 99.69).
(The symbol +/- is the "plus or minus" symbol.) We are 99% confident that the mean IQ of all Chernobyl children is within 8.64 of 91.05. Or, equivalently, we are 99% confident that the mean IQ of all Chernobyl children is between 82.41 and 99.69.
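The arithmetic above can be checked with a short Python sketch. It uses only the standard library; the variable names are illustrative, and the critical value is computed from the standard normal distribution rather than read from a table, so it carries a few more digits than 2.576.

```python
from math import sqrt
from statistics import NormalDist, mean

# IQ scores for the SRS of 20 children (from the data listed above)
scores = [80, 84, 99, 98, 86, 80, 72, 105, 59, 102,
          102, 88, 89, 109, 58, 117, 94, 110, 92, 97]

sigma = 15         # known population standard deviation
confidence = 0.99  # desired confidence level

n = len(scores)
xbar = mean(scores)

# z* leaves (1 - confidence)/2 of the probability in each tail
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z_star * sigma / sqrt(n)
lower, upper = xbar - margin, xbar + margin

print(f"sample mean     = {xbar:.2f}")
print(f"z*              = {z_star:.3f}")
print(f"margin of error = {margin:.2f}")
print(f"99% CI          = ({lower:.2f}, {upper:.2f})")
```

Rounded to two decimal places, the results agree with the hand computation: a margin of error of 8.64 and an interval of (82.41, 99.69).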
NOTE: The probability that mu is between the computed endpoints of this interval IS NOT 0.99. The population mean mu either IS or IS NOT between these values. The 99% refers to the accuracy of the procedure, not the accuracy of any one CI. 99% of all (properly implemented) confidence intervals are successful in the sense that they contain the desired parameter. The problem is, we never know which 99% are successful and which 1% are unsuccessful.
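One way to see what "procedural accuracy" means is a small simulation. The sketch below is only an illustration under stated assumptions: it pretends the true mean is mu = 100 (a hypothetical value chosen for the demonstration, not a claim about Chernobyl), draws many samples of size 20 from N(mu, 15), builds a 99% interval from each, and counts how often the interval contains mu. The function name coverage and the seed are arbitrary choices.

```python
import random
from math import sqrt

def coverage(mu, sigma=15, n=20, z_star=2.576, trials=10_000, seed=1):
    """Fraction of simulated 99% intervals that contain the true mean mu."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    margin = z_star * sigma / sqrt(n)  # same margin of error for every sample
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from N(mu, sigma) and take its mean
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

rate = coverage(mu=100)  # mu = 100 is an assumed "true" mean
print(f"proportion of intervals containing mu: {rate:.3f}")
```

The proportion should land close to 0.99. Each individual interval either contains mu or does not; the 99% describes the long-run behavior of the procedure.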
If IQs for Russians have the N(100, 15) distribution, then this interval serves as fairly strong evidence that the mean IQ for all children in Chernobyl is less than that for Russians.
A rephrasing: The evidence is strong that the center of the distribution of IQs for children in Chernobyl (mu) is less than the center of the distribution for Russians as a whole.
"The first step toward statistical sophistication is to resist the temptation to compute without thinking." (Found on page 6 of Introduction to the Practice of Statistics, Moore & McCabe.)
We ought to look at the data more carefully. In constructing such a confidence interval we are implicitly assuming that the data are drawn from a normally distributed population. We have a relatively small sample here, so testing this assumption might be difficult, but we ought to at least look at the data. Further, this confidence interval is based on the sample mean. Recall that the sample mean is not a resistant measure: it can be "thrown off" quite a bit by a single extreme observation (outlier). Did we check for outliers? NO! We'd better make certain. Here's a histogram and a boxplot of the data.
There are no irregularities in the data and no outliers. The data (as shown by the histogram) are not inconsistent with a normally distributed population. The confidence interval is properly implemented.
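The outlier check can also be done numerically. The following sketch assumes the common boxplot convention of flagging any value more than 1.5 x IQR beyond the quartiles (the notes do not state which rule was used), computes the quartiles as medians of the lower and upper halves of the sorted data, and prints a rough text histogram with bins of width 10.

```python
from statistics import median

scores = [80, 84, 99, 98, 86, 80, 72, 105, 59, 102,
          102, 88, 89, 109, 58, 117, 94, 110, 92, 97]

ordered = sorted(scores)
half = len(ordered) // 2

# Quartiles as the medians of the lower and upper halves of the sorted data
q1 = median(ordered[:half])
q3 = median(ordered[-half:])
iqr = q3 - q1

# Assumed rule: a value is a suspected outlier if it lies more than
# 1.5 * IQR below Q1 or above Q3
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in ordered if x < low_fence or x > high_fence]

print(f"Q1 = {q1}, median = {median(ordered)}, Q3 = {q3}, IQR = {iqr}")
print(f"fences: ({low_fence}, {high_fence}); outliers: {outliers}")

# Rough text histogram, one row per bin of width 10
for start in range(50, 120, 10):
    count = sum(start <= x < start + 10 for x in scores)
    print(f"{start:3d}-{start + 9:3d} | {'*' * count}")
```

Under this rule no observation falls outside the fences, which agrees with the conclusion that there are no outliers.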
A confidence interval is a statement about the population mean, made using the sample mean as an estimate. Notice that the confidence interval does not contain 99% of the children's scores, and it is not meant to, for one good reason: a confidence interval by its nature narrows as the sample size grows (see the formula). An ever-narrowing interval cannot possibly be a statement about the distribution of individuals in the population. The margin of error comes from the distribution of the sample mean.
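Both points can be illustrated with the numbers from this example. The sketch below counts how many of the 20 observed scores fall inside the computed interval (82.41, 99.69), then evaluates the margin of error for a few sample sizes. The sample sizes 80 and 320 are hypothetical, chosen only because each quadrupling of n should halve the margin; the function name margin_of_error is illustrative.

```python
from math import sqrt

scores = [80, 84, 99, 98, 86, 80, 72, 105, 59, 102,
          102, 88, 89, 109, 58, 117, 94, 110, 92, 97]

# Endpoints of the 99% confidence interval computed earlier
lower, upper = 82.41, 99.69

# How many individual scores lie inside the interval for the MEAN?
inside = sum(lower <= x <= upper for x in scores)
print(f"{inside} of {len(scores)} scores lie inside the interval")

def margin_of_error(n, sigma=15, z_star=2.576):
    """Margin of error z* * sigma / sqrt(n) for a sample of size n."""
    return z_star * sigma / sqrt(n)

# n = 80 and n = 320 are hypothetical sample sizes for illustration
margins = {n: round(margin_of_error(n), 2) for n in (20, 80, 320)}
for n, m in margins.items():
    print(f"n = {n:3d}: margin of error = {m}")
```

Fewer than half of the observed scores lie inside the interval, and the margin of error halves each time the sample size is quadrupled, even though the spread of individual IQ scores stays the same.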
If it's individual scores that are of interest, a confidence interval is of no use; instead you should use a histogram and/or boxplot.