Sampling Rectangles


Recall: The 3 estimation methods were

E: Eyeball. Basically a guess after you've examined the rectangles.

J: Judgment Sample. You choose your own sample of 5 and take the average.

R: Random Sample. Randomly choose 5 rectangles and take the average.
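Method R can be sketched in a few lines of Python. This is only an illustration: the function name random_sample_estimate and the placeholder areas are assumptions made up for the example, not the class's rectangle sheet or its data.

```python
import random

def random_sample_estimate(areas, n=5, seed=None):
    """Average the areas of n rectangles chosen at random, without replacement."""
    rng = random.Random(seed)
    chosen = rng.sample(areas, n)  # every rectangle is equally likely to be picked
    return sum(chosen) / n

# Placeholder areas for 100 rectangles (made up; NOT the real population).
placeholder_rng = random.Random(42)
areas = [placeholder_rng.randint(1, 18) for _ in range(100)]

estimate = random_sample_estimate(areas, n=5, seed=7)
print(estimate)
```

The point of letting the random number generator do the choosing is that no rectangle is favored because it catches the eye, which is exactly what separates R from J.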

The average area of all 100 rectangles is 7.42; the standard deviation of the rectangle areas is 5.20227. Here's a histogram of the areas of all 100 rectangles in the population.

Of course, you did not know this when you took samples and obtained estimates. In fact, you were instructed that the purpose of the activity was to estimate the average area of all the rectangles (which, only later, did you learn was 7.42). In a real study, you'd never know the population mean (for, if you did, why would you bother taking a sample?).

Results from 3 different classes are pooled here. The raw results are available in a separate file as well.


First Look

We notice two outliers: 200 and 50. The people who gave these eyeball estimates were completely out of whack. What were they thinking? Clearly they were confused about the task at the time. So, from here forward, we'll ignore these two values.
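To see why two wild values are worth setting aside, here is a small sketch of how much they can distort a summary. The list of estimates is made up for illustration (only the 50 and 200 come from the text), and the helper name summarize is an assumption.

```python
import statistics

def summarize(estimates, discard=()):
    """Mean, standard deviation, and range after dropping any values listed in discard."""
    kept = [x for x in estimates if x not in discard]
    return {
        "n": len(kept),
        "mean": statistics.mean(kept),
        "sd": statistics.stdev(kept),
        "min": min(kept),
        "max": max(kept),
    }

# Made-up eyeball estimates plus the two outliers mentioned above.
made_up = [6.0, 8.5, 10.0, 12.0, 7.5, 50.0, 200.0]

with_outliers = summarize(made_up)
without_outliers = summarize(made_up, discard=(50.0, 200.0))
print(with_outliers["mean"], without_outliers["mean"])
```

With the outliers included, the mean is pulled far above every ordinary estimate in the list; with them dropped, it lands back among the typical values.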


Second Look

Notice that the red reference line is drawn at 7.42 -- the actual mean size of all 100 rectangles.


Looking at shapes

Eyeball Estimate

Right skewed: most estimates sit toward the low end, with outliers to the right. It would be very nasty to try to work with this distribution. The average of these is 9.316; they range from 2.1 to 20.0 (ignoring the 50 and 200 we discarded). The standard deviation (again ignoring the 50 and 200) is 3.158, although this is a poor way to measure spread here, since there are outliers and the shape isn't very close to a normal distribution.

Judgment Sample Estimate

Right skewed, again with outliers to the right. Another difficult distribution to work with. The average of these is 9.143; they range from 3.8 to 17.2. The standard deviation is 2.470, although this is a poor way to measure spread here, since there are outliers and the shape isn't very close to a normal distribution.

Random Sample Estimate

Fairly symmetric (perhaps a little right skew) and very close to normal shaped. Consequently, the standard deviation is an appropriate measure of spread. The average of these is 7.157; they range from 2.4 to 14.4. The standard deviation of these estimates is 2.312.


A Little "Theory"

There is a rule that says when drawing a random (and only random) sample of size n from a large population, the standard deviation of the sample mean is the standard deviation for the population divided by the square root of the sample size. Our samples were of size n = 5; the standard deviation for the population is 5.20227 -- so we have 5.20227 / squareroot(5) = 2.33. Note that the standard deviation of the R values (2.312, reported above) is awfully close to this.
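The arithmetic, and the rule itself, can be checked with a short Python sketch. The first part uses only the numbers given in the text. The second part is a simulation on a made-up population of 100 areas (an assumption, since the real areas are not listed here); the function names are likewise just for illustration.

```python
import math
import random
import statistics

POP_SD = 5.20227  # population standard deviation of the areas, from the text
N = 5             # sample size used in the activity

def sd_of_sample_mean(pop_sd, n):
    """Rule from the text: population SD divided by the square root of n."""
    return pop_sd / math.sqrt(n)

def simulated_sd_of_means(population, n, trials, seed=0):
    """Draw many random samples of size n and return the SD of their means."""
    rng = random.Random(seed)
    means = [sum(rng.sample(population, n)) / n for _ in range(trials)]
    return statistics.pstdev(means)

theory = sd_of_sample_mean(POP_SD, N)
print(round(theory, 2))  # 2.33

# Made-up population of 100 areas (NOT the real rectangles).
pop_rng = random.Random(1)
population = [pop_rng.randint(1, 18) for _ in range(100)]

predicted = sd_of_sample_mean(statistics.pstdev(population), N)
simulated = simulated_sd_of_means(population, N, trials=20000)
print(predicted, simulated)
```

The simulated value should come out close to the predicted one. It tends to be a touch smaller, because the samples are drawn without replacement from a population of only 100, whereas the rule assumes a large population.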