Trimmed Means

Glad to see you're interested!


The trimmed mean is computed just like an ordinary mean, except that a pre-specified percentage of the most extreme values is omitted first. Consider the 5% trimmed mean. The lowest (left-most) 5% and the highest (right-most) 5% of the data are excluded, and the mean is found from the remaining observations. So the 10% most "extreme" data (5% on either side) are omitted before computing the mean.

For instance, consider the data set (N = 40) below:

45.8 19.8 23.1 13.8 16.3 21.3 25.2 17.1 21.4 18.6
15.2 18.8 21.0 20.4 16.9 23.7 21.4 24.6 18.6 19.6
26.9 20.7 21.6 19.6 20.8 25.2 26.3 20.5 23.7 13.3
23.2 18.7 24.1 16.0 30.2 24.6 15.8 22.6 27.0 20.5

Here's a histogram. What's notable?


Sorting the data gives the following:

13.3 13.8 15.2 15.8 16.0 16.3 16.9 17.1 18.6 18.6
18.7 18.8 19.6 19.6 19.8 20.4 20.5 20.5 20.7 20.8
21.0 21.3 21.4 21.4 21.6 22.6 23.1 23.2 23.7 23.7
24.1 24.6 24.6 25.2 25.2 26.3 26.9 27.0 30.2 45.8

5% of 40 is 2. So, in computing the trimmed mean, the two smallest and the two largest observations are ignored.

          15.2 15.8 16.0 16.3 16.9 17.1 18.6 18.6
18.7 18.8 19.6 19.6 19.8 20.4 20.5 20.5 20.7 20.8
21.0 21.3 21.4 21.4 21.6 22.6 23.1 23.2 23.7 23.7
24.1 24.6 24.6 25.2 25.2 26.3 26.9 27.0

Now, compute the mean. The mean of the remaining 36 observations is 21.133. This is the 5% trimmed mean. Contrast it with the ordinary (untrimmed) mean of all 40 observations, which is 21.598.
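
If you'd like to check the arithmetic yourself, here is a short Python sketch. The variable names are just illustrative, and rounding the cut count down when 5% of N is not a whole number is an assumption on my part (it doesn't matter here, since 5% of 40 is exactly 2).

```python
# The 40 observations from the example above.
data = [
    45.8, 19.8, 23.1, 13.8, 16.3, 21.3, 25.2, 17.1, 21.4, 18.6,
    15.2, 18.8, 21.0, 20.4, 16.9, 23.7, 21.4, 24.6, 18.6, 19.6,
    26.9, 20.7, 21.6, 19.6, 20.8, 25.2, 26.3, 20.5, 23.7, 13.3,
    23.2, 18.7, 24.1, 16.0, 30.2, 24.6, 15.8, 22.6, 27.0, 20.5,
]

ordered = sorted(data)
n = len(ordered)

# Number of observations to drop from EACH end: 5% of n.
# Assumption: round down when 5% of n is not a whole number.
k = n * 5 // 100

# Keep only the middle observations, then average them.
kept = ordered[k:n - k]
trimmed_mean = sum(kept) / len(kept)
ordinary_mean = sum(data) / n

print(f"dropped {k} from each end, kept {len(kept)}")
print(f"5% trimmed mean: {trimmed_mean:.3f}")
print(f"ordinary mean:   {ordinary_mean:.3f}")
```

The sort comes first because "extreme" is defined by rank, not by the order in which the data happened to be recorded.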


The trimmed mean has the advantage of being relatively resistant to outliers. As long as no more than 5% of the values are outliers in a given direction, the outliers are not included in the computation of the 5% trimmed mean. Here the trimmed mean is right in the center of the distribution (ignoring the outlier). The (untrimmed) mean is pulled a bit to the right by the extreme value of 45.8.
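
One way to see this resistance is to make the outlier even more extreme and watch what happens. The sketch below swaps 45.8 for a made-up value of 1000 (purely hypothetical, not part of the data) and recomputes both means. The helper names `mean` and `trimmed_mean_5` are illustrative only.

```python
# The 40 observations from the example above.
data = [
    45.8, 19.8, 23.1, 13.8, 16.3, 21.3, 25.2, 17.1, 21.4, 18.6,
    15.2, 18.8, 21.0, 20.4, 16.9, 23.7, 21.4, 24.6, 18.6, 19.6,
    26.9, 20.7, 21.6, 19.6, 20.8, 25.2, 26.3, 20.5, 23.7, 13.3,
    23.2, 18.7, 24.1, 16.0, 30.2, 24.6, 15.8, 22.6, 27.0, 20.5,
]

def mean(values):
    """Ordinary (untrimmed) mean."""
    return sum(values) / len(values)

def trimmed_mean_5(values):
    """5% trimmed mean: drop 5% of the observations from each end."""
    ordered = sorted(values)
    k = len(ordered) * 5 // 100   # assumption: round down
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical what-if: push the outlier from 45.8 out to 1000.
inflated = [1000.0 if x == 45.8 else x for x in data]

print("ordinary mean:", mean(data), "->", mean(inflated))
print("trimmed mean: ", trimmed_mean_5(data), "->", trimmed_mean_5(inflated))
```

The ordinary mean jumps because every value enters the sum. The trimmed mean does not move at all, since the largest two values are dropped no matter how large they are.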

There are other trimmed means. Pick any percentage P% and you can find the P% trimmed mean. In fact, the median is essentially the 50% trimmed mean. Think about it!
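
Here is a sketch of a general P% trimmed mean in Python, in case you want to experiment. The function name `trimmed_mean`, the two small data sets, and the rule for handling P = 50 are all my own assumptions: trimming a full 50% from each side would leave nothing, so the sketch always keeps at least the middle one (odd N) or two (even N) observations.

```python
from statistics import median

def trimmed_mean(values, percent):
    """Mean after dropping `percent`% of the observations from each end."""
    if not 0 <= percent <= 50:
        raise ValueError("percent must be between 0 and 50")
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one observation")
    # Observations to drop from each end (assumption: round down).
    k = int(n * percent / 100)
    # Assumption: never trim away everything; keep the middle 1 or 2 values.
    k = min(k, (n - 1) // 2)
    kept = ordered[k:n - k]
    return sum(kept) / len(kept)

# Small made-up data sets, one with odd N and one with even N.
odd = [2, 4, 5, 7, 9, 11, 40]
even = [2, 4, 5, 7, 9, 40]

print(trimmed_mean(odd, 0))                     # 0% trimming: the ordinary mean
print(trimmed_mean(odd, 50), median(odd))       # the two agree
print(trimmed_mean(even, 50), median(even))     # the two agree
```

With maximal trimming only the middle of the sorted data survives, which is exactly how the median is defined.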