After applying motion detection and heuristics, we have a fairly good
indication of where a human face may be present. However, we still
need to determine whether the detected object is in fact a face. To do this, we
scan a region close to the detected location in the image at three
sizes. The scanning is
illustrated in figure 4 (nine frames at each size, for a
total of twenty-seven frames). For each frame we preprocess the subframe,
project it into face space, reconstruct it, and compute the reconstruction
error; each step is described below.
From the reconstruction error we can decide whether a frame contains a human face. A face is deemed present at the frame with the smallest reconstruction error, provided that this error is also below a threshold. The threshold is set manually, based on experience.
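The decision rule can be sketched in a few lines. This is only an illustration: the function name `select_face_frame` and the strict "less than" comparison are assumptions, and the threshold value has to be supplied by the user, since the text gives no number.

```python
import numpy as np

def select_face_frame(errors, threshold):
    """Return the index of the frame with the smallest reconstruction error,
    or None if even that smallest error is not below the threshold."""
    errors = np.asarray(errors, dtype=float)
    best = int(np.argmin(errors))          # candidate frame: minimum error
    return best if errors[best] < threshold else None
```

With twenty-seven scanned frames, `errors` would hold twenty-seven values, one per frame.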
The preprocessing step is adapted from [7] and [5]. First we subtract a best-fit linear function from the subframe to correct for extreme lighting conditions. This removes some shading and partially normalizes the intensity values. We then histogram-equalize the subframe to further normalize the intensity values and enhance features.
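A minimal sketch of the two preprocessing steps, assuming the "best fit linear function" is a least-squares plane a*x + b*y + c over the pixel grid and that equalization maps intensities through their empirical cumulative distribution. The function names and the choice of 256 levels are illustrative assumptions, not details taken from [7] or [5].

```python
import numpy as np

def subtract_best_fit_plane(img):
    """Fit a*x + b*y + c to the intensities by least squares and subtract it."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    y, x = np.mgrid[0:h, 0:w]
    A = np.column_stack([x.ravel(), y.ravel(), np.ones(h * w)])
    coeffs, *_ = np.linalg.lstsq(A, img.ravel(), rcond=None)
    return img - (A @ coeffs).reshape(h, w)

def equalize_histogram(img, levels=256):
    """Map intensities through their empirical CDF onto [0, levels - 1]."""
    img = np.asarray(img, dtype=float)
    lo, hi = img.min(), img.max()
    if hi == lo:                            # constant image: nothing to equalize
        return np.zeros_like(img)
    # Quantize to integer bins 0 .. levels - 1
    q = np.floor((img - lo) / (hi - lo) * (levels - 1)).astype(int)
    hist = np.bincount(q.ravel(), minlength=levels)
    cdf = np.cumsum(hist) / q.size
    return cdf[q] * (levels - 1)

def preprocess(subframe):
    """Lighting correction followed by histogram equalization."""
    return equalize_histogram(subtract_best_fit_plane(subframe))
```

Subtracting the plane first means a steady brightness gradient across the subframe does not dominate the histogram before it is equalized.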
The "face space" is computed by a principal components analysis of the ORL
face database (footnote 2; figure 5), which consists of 400 human face images.
This technique of representing face images
using principal components was originally suggested by [6]
and further developed by [8]. The average face is the pixel-wise mean of the
400 database images. The average face is then subtracted from each image, and
each difference image is vectorized.
Let $D$ be the matrix whose columns are these vectors, and let
$C = DD^T$. We want to find the eigenvectors
$u_i$ of
$C$. Since $C$ has one row and one column per image pixel (footnote 3),
while we have at most
400 independent vectors, we apply the following technique:
Let $L = D^T D$ and let $v_i$ be the eigenvectors of $L$. We can then obtain each $u_i$ as a linear combination of the face images [8]: $u_i = D v_i$.
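The reason the smaller matrix suffices can be written out in one line. Here $\lambda_i$ denotes the eigenvalue of $L$ belonging to $v_i$; this symbol is introduced for illustration and does not appear in the text above.

```latex
\begin{align*}
L v_i &= D^T D\, v_i = \lambda_i v_i \\
\Rightarrow\quad C\,(D v_i) &= D D^T D\, v_i = D\,(\lambda_i v_i) = \lambda_i\,(D v_i)
\end{align*}
```

So whenever $\lambda_i \neq 0$, the vector $D v_i$ is an eigenvector of $C$ with the same eigenvalue. It is not unit length in general (its squared norm is $\lambda_i$ when $v_i$ is unit length), so in practice it would typically be normalized.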
The vectors $u_i$ are called the principal components of $D$. When converted back to matrices, they can be viewed as the eigenfaces of the ORL face database. Some of these eigenfaces are shown in figure 6 (ordered by eigenvalue).
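The computation described above can be sketched with NumPy as follows. The function name `compute_eigenfaces`, the array layout, and the normalization to unit length are assumptions made for this illustration; `n_components` should be smaller than the number of images, because mean subtraction leaves at least one eigenvalue of $L$ equal to zero.

```python
import numpy as np

def compute_eigenfaces(images, n_components=20):
    """images: array of shape (M, h, w).

    Returns (mean_face, U, eigvals): the average face, a matrix U whose
    columns are unit-length eigenvectors of C = D D^T, and the matching
    eigenvalues in descending order."""
    images = np.asarray(images, dtype=float)
    M = images.shape[0]
    mean_face = images.mean(axis=0)
    # Columns of D are the vectorized, mean-subtracted images: (pixels, M)
    D = (images - mean_face).reshape(M, -1).T
    L = D.T @ D                               # small (M, M) matrix
    eigvals, V = np.linalg.eigh(L)            # eigh returns ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    U = D @ V[:, order]                       # u_i = D v_i
    U /= np.linalg.norm(U, axis=0)            # normalize each column
    return mean_face, U, eigvals[order]
```

Each column of `U` can be turned back into an image with `U[:, i].reshape(h, w)` for viewing as an eigenface.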
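The preprocessed subframe, with the average face subtracted, is
projected into face space by taking its inner product with each principal
component $u_i$.
We use only the first twenty principal components to define face space. The subframe is then reconstructed as the sum of these twenty components, each weighted by its projection coefficient.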
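In symbols, projection and reconstruction can be sketched as follows. The notation is assumed for illustration: $y$ is the vectorized, preprocessed subframe with the average face subtracted, $w_i$ are the projection coefficients, $\hat{y}$ is the reconstruction, and the $u_i$ are taken to be normalized to unit length.

```latex
\begin{align*}
w_i &= u_i^T y, \qquad i = 1, \dots, 20 \\
\hat{y} &= \sum_{i=1}^{20} w_i\, u_i
\end{align*}
```

Because the $u_i$ are mutually orthogonal, $\hat{y}$ is the closest point to $y$ within the twenty-dimensional face space.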
The reconstruction error, the distance between the subframe and its
reconstruction, is a measure of how typical a face the subframe is, and can
therefore be used for face/non-face classification.
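A minimal sketch of the per-frame error computation, assuming the error is the Euclidean norm of the residual and that the columns of `U` are orthonormal principal components. The function name and argument layout are hypothetical.

```python
import numpy as np

def reconstruction_error(subframe, mean_face, U):
    """Project the mean-subtracted subframe onto the columns of U,
    reconstruct it, and return the Euclidean norm of the residual."""
    y = (np.asarray(subframe, dtype=float).ravel()
         - np.asarray(mean_face, dtype=float).ravel())
    w = U.T @ y                  # projection coefficients
    y_hat = U @ w                # reconstruction in face space
    return float(np.linalg.norm(y - y_hat))
```

A subframe that lies entirely within face space gives an error close to zero, while one with no component in face space gives an error equal to its own distance from the average face.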