Worksheet 1

My friend Jill has a collection of 240 compact discs. She keeps track of them in a spreadsheet; one variable she measures and records is the playing time. Jill (like just about every other woman) has Joni Mitchell's Hejira in her collection; Hejira has a playing time of 51:23, which is about 51.383 minutes. Here's a histogram.

  1. Note the presence of at least one outlier. How many outliers are there? (The exact height of the bar over the 3-6 class is 0.83%.)
  2. Give some likely explanations for the outlier(s).
  3. What percentage of Jill's CDs have a playing time between 33 and 39 minutes?
  4. Using the histogram it is possible to closely approximate the value of the 96th percentile. What is this value?
  5. Characterize the shape of the distribution.
  6. Dropping the outliers from the data set, Jill computes the mean playing time to be 45.03 minutes. How will the median compare to this value?
  7. If Jill were to include the outlier(s), how would the mean change? How would the median change?
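
Two bits of arithmetic in this worksheet can be checked with a short script: converting a playing time written as minutes:seconds into decimal minutes (as was done for Hejira), and turning a histogram bar's percentage height into a count of CDs. The sketch below is only an illustration; the function names to_minutes and bar_count are made up for this example and are not part of Jill's spreadsheet.

```python
def to_minutes(mmss: str) -> float:
    """Convert a playing time written as 'MM:SS' into decimal minutes."""
    minutes, seconds = mmss.split(":")
    return int(minutes) + int(seconds) / 60


def bar_count(percent: float, total: int) -> int:
    """Convert a bar's height (as a percentage of all CDs) into a whole count.

    Rounding is needed because the stated percentage is itself rounded.
    """
    return round(percent / 100 * total)


# Hejira: 51 minutes 23 seconds
print(round(to_minutes("51:23"), 3))

# The bar over the 3-6 class has height 0.83% in a collection of 240 CDs
print(bar_count(0.83, 240))
```

The same bar_count idea works for any class in the histogram: read the bar's height as a percentage and multiply by 240.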