Mean vs. median


When we introduce the mean and the median in basic statistics courses, my feeling is that the median seems a little unnatural to the students. The mean, on the other hand, is very familiar to them because it is the operation used to compute their final grade. If the mean is the "natural average", why should they bother with the median? The importance of the median lies mainly in its stability in the presence of outliers. There is a simple way of "seeing" the difference, through a well-known application to digital image filtering.

Let us start with an image. Beauty is not mandatory, but it helps to capture the audience's attention.

[Image: mini1, the original image]



Now we add "salt and pepper" noise. For a grayscale image this means that each pixel is, with a certain probability p, set to fully on or fully off, so the image becomes contaminated with black and white dots. For color images the situation is a little more complicated because there are three channels (RGB). If the noise hits each channel independently with the same parameter, the probability that a pixel is changed is $1-(1-p)^3$. In our case we have taken p=0.15, and consequently about 39% of the pixels, slightly less than 40%, are corrupted.

[Image: niki_n, the image with salt and pepper noise]
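The figure of "slightly less than 40%" can be checked with a few lines of base MATLAB. This is only a sketch under the assumption that the three channels are corrupted independently with the same probability p; the simulated data, the seed and the variable names are made up for the illustration and are not part of the original script.

```matlab
p = 0.15;                      % per-channel corruption probability, as in the text
pCorrupt = 1 - (1 - p)^3;      % probability that at least one channel is hit (0.385875)
fprintf('Expected fraction of corrupted pixels: %.4f\n', pCorrupt);

% Monte Carlo check on hypothetical data: one row per pixel, one column per channel.
rng(1);                        % fixed seed, only so that the run is repeatable
nPixels = 1e5;
hit = rand(nPixels, 3) < p;    % true where a channel value would be replaced
pSim = mean(any(hit, 2));      % fraction of pixels with at least one corrupted channel
fprintf('Simulated fraction of corrupted pixels: %.4f\n', pSim);
```

The simulated fraction should land close to the exact value, up to sampling error.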


Since fewer than 40% of the pixels are corrupted, the majority of them are still fine. With this idea in mind, one can take the average of the pixel values in each 3x3 block (a central pixel and its 8 neighbors), hoping that the majority wins. The result is shown below and it is quite disappointing: the image has merely been blurred. If we team a single very bad student with groups of good students, we still feel the presence of the bad student in each group's average result in a contest.

[Image: niki_c, the noisy image after the 3x3 mean filter]
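Why the majority does not win with the mean can be seen on a single block. The following sketch uses a hypothetical 3x3 grayscale block (the numbers are invented) with eight similar pixels and one "salt" pixel in the center.

```matlab
% Hypothetical 3x3 grayscale block: eight similar pixels and one "salt" pixel (255).
block = [100 102  98;
         101 255  99;
         100 103  97];

cleanValues = block(block ~= 255);   % the eight uncorrupted pixels
meanClean   = mean(cleanValues);     % 800/8 = 100
meanBlock   = mean(block(:));        % 1055/9, roughly 117.2

fprintf('Mean of the clean pixels: %.1f\n', meanClean);
fprintf('Mean of the whole block:  %.1f\n', meanBlock);
```

A single outlier shifts the average by about 17 gray levels, and since every block that contains the outlier is shifted, the dot is smeared over its neighborhood instead of being removed.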


The situation is completely different with the median, because it is insensitive to extreme values as long as most of the values are alike. Taking the median in each 3x3 block has an almost magical effect: the image comes out almost completely clean.

[Image: niki_m, the noisy image after the 3x3 median filter]
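The same kind of toy block shows the stability of the median. In this sketch the block is again hypothetical, now with three of its nine pixels corrupted (two "salt", one "pepper"), which is roughly the proportion of damage discussed above.

```matlab
% Hypothetical 3x3 grayscale block with three corrupted pixels (255, 0, 255).
block = [100 255  98;
           0 101  99;
         255 103  97];

medBlock  = median(block(:));   % sorted: 0 97 98 99 100 101 103 255 255 -> 100
meanBlock = mean(block(:));     % 1108/9, roughly 123.1

fprintf('Median of the block: %d\n', medBlock);
fprintf('Mean of the block:   %.1f\n', meanBlock);
```

The median stays inside the range of the clean values as long as fewer than half of the nine pixels are pushed to the same extreme; with five or more of them at 255 (or at 0) it would fail too.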


The code


The MATLAB code that produces the images from the initial one is:

imag = imread('niki.jpg');
%imag = rgb2gray(imag);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% NOISY IMAGE
imagn = imnoise(imag,'salt & pepper',0.15);
imwrite(imagn, 'niki_n.bmp')

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% MEAN FILTER
h2 = ones(3)/9;
imwrite(imfilter( imagn, h2, 'conv'), 'niki_c.bmp')

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% MEDIAN FILTER
imagm = imagn;
% simply imagm = medfilt2(imagn,[3,3]); for grayscale images
for ii = 1:3
    imagm(:,:,ii) = medfilt2(imagn(:,:,ii),[3,3]);
end
imwrite(imagm, 'niki_m.bmp')

Quite simple!
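For readers without the Image Processing Toolbox, or curious about what medfilt2 does, here is a minimal sketch of a 3x3 median filter written with base MATLAB only. It is not the toolbox implementation: the small synthetic image, the positions of the corrupted pixels and the variable names are all invented for the illustration, and the zero padding at the border is my understanding of the default behavior of medfilt2.

```matlab
% Synthetic flat gray image with three isolated corrupted pixels (hypothetical data).
clean = 120 * ones(8, 8);
noisy = clean;
noisy(2, 3) = 255;    % salt
noisy(5, 5) = 0;      % pepper
noisy(6, 2) = 255;    % salt

% Zero padding of one pixel on every side (assumed to match the medfilt2 default).
padded = zeros(size(noisy) + 2);
padded(2:end-1, 2:end-1) = noisy;

% Replace every pixel by the median of its 3x3 neighborhood.
filtered = zeros(size(noisy));
for r = 1:size(noisy, 1)
    for c = 1:size(noisy, 2)
        window = padded(r:r+2, c:c+2);       % 3x3 window centered on pixel (r, c)
        filtered(r, c) = median(window(:));
    end
end

interiorRestored = isequal(filtered(2:end-1, 2:end-1), clean(2:end-1, 2:end-1));
fprintf('Interior restored exactly: %d\n', interiorRestored);
```

In this toy case the interior is recovered exactly, while the four corner pixels come out as 0 because five of the nine values in a corner window are padding zeros.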