5. Face Recognition and Related Problems

Face recognition by computers received a lot of attention after 9/11, both favorable (as helpful for fighting crime and terrorism) and unfavorable (as an invasion of privacy). The task is first to find a face in a picture and then to match that face to an item in a database. We will run ahead for the moment and look at some challenges of the second problem.

An Aside: Similarity of Shapes

In order to match a picture of an object to a similar picture in a database we need to tackle the issue of similarity, a concept that is quite difficult to quantify. Consider, for example, the three shapes in Figure 1.

Figure 1: Illustration used to explain the difference between humanly perceived and mathematical similarity

If you ask a person which one of the three shapes does not belong with the other two, most likely the answer will be the middle shape (circle with a notch). But computers must use mathematical formulas to compare the positions of the pixels. It turns out that the first two shapes are identical, except for the notch. If we overlay them on top of one another (as the computer, effectively, does) they match everywhere except at the notch, so their matching gets a high score. But the shape on the right is an ellipse, so it cannot be overlaid on the leftmost circle, and the matching will get a poor score. So the computer's answer will be that the shape on the right does not belong with the rest!
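The overlay comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the grid size, the shape dimensions, the rectangular notch, and the choice of score (matching pixels divided by pixels covered by either shape) are all hypothetical and are not taken from Figure 1 or from any particular matching program.

```python
def ellipse_pixels(n, rx, ry):
    """Pixels inside an axis-aligned ellipse centred in an n x n grid."""
    c = (n - 1) / 2.0
    return {(x, y) for x in range(n) for y in range(n)
            if ((x - c) / rx) ** 2 + ((y - c) / ry) ** 2 <= 1.0}

def overlay_score(a, b):
    """Pixels shared by both shapes divided by pixels covered by either."""
    return len(a & b) / len(a | b)

N = 101                                  # assumed grid size
circle = ellipse_pixels(N, 40, 40)       # left shape: a circle

# Middle shape: the same circle with a small rectangular notch cut at the top.
notched = {(x, y) for (x, y) in circle
           if not (abs(x - 50) <= 4 and y >= 78)}

# Right shape: an ellipse of roughly comparable size.
ellipse = ellipse_pixels(N, 48, 28)

score_notch = overlay_score(circle, notched)
score_ellipse = overlay_score(circle, ellipse)

print("circle vs. notched circle:", round(score_notch, 3))
print("circle vs. ellipse:       ", round(score_ellipse, 3))
```

With these assumed dimensions the notched circle scores much closer to the plain circle than the ellipse does, which is the opposite of the grouping a person would choose.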

It is possible to write a program using different matching criteria (based on differential geometry) that will provide the same answer as people, but one can construct more complex examples where that program also fails to match human results.

Question: What do you think of using such shape tests for CAPTCHAs?

Face Recognition

Automatic face recognition seems an unlikely problem to be solved by computer for several reasons. It took over forty years to build acceptable quality machines that recognize written symbols; what makes us think that we can solve the much more complex problem of distinguishing human faces? Nobody amongst those responsible for deploying face recognition systems asked: given the difficulties we have in distinguishing, say, an "h" from a "b", how can we expect to solve the far more difficult problem of distinguishing one face from another? While mechanical face recognition might be possible in principle, it represents a drastic leap in technology from what has been achieved so far by machine vision.

It is also likely that face recognition will remain unsolved even if the recognition of pictorial patterns advances in general. Neuroscientists point out that humans have special neural circuitry for face recognition. It is well known that people have trouble telling apart the faces of people of races other than their own. There is a simple experiment that can be used to show the complexity of human face recognition. Try to find out in what way the two images of Figure 2 differ. The task becomes much easier if you look at the pictures right side up.

Figure 2: Illustration of one of the challenges in face recognition

It is instructive to repeat the experiment with the cat pictures of Figure 3. (Right side up version.)

Figure 3: The effect of the subject on face recognition

Why such a difference? Scrutinizing human faces is more valuable to people than scrutinizing cat faces.

Modern Politics

Not surprisingly, the results of face recognition systems installed after 9/11 have been dismal. An ACLU press release of May 14, 2002 stated that "interim results of a test of face-recognition surveillance technology from Palm Beach International Airport confirm previous results showing that the technology is ineffective." The release went on to say that: "Even with recent, high quality photographs and subjects who were not trying to fool the system, the face-recognition technology was less accurate than a coin toss. Under real world conditions, Osama Bin Laden himself could easily evade a face recognition system. It hardly takes a genius of disguise to trick this system. All a terrorist would have to do, it seems, is put on eyeglasses or turn his head a little to the side."

Similar conclusions appeared in a Boston Globe article of August 5, 2002. It quotes the director of a security consulting firm as saying that the "technology was not ready for prime time yet." He added that the "systems produced a high level of false positives, requiring an airport worker to visually examine each passenger and clear him for boarding." The article goes on to say: "One of the biggest deployments of the technology has occurred in England, in the London borough of Newham. Officials there claim that the installation of 300 facial-recognition cameras in public areas has led to a reduction in crime. However, they admit that the system has yet to result in a single arrest."

A recent criticism of mechanical face recognition appeared in the October 26, 2002 issue of the Economist. Ignoring the scientific evidence has resulted in a curious situation. The suppliers of the face recognition systems insist that the testers need to prove beyond reasonable doubt that their systems are faulty, instead of themselves having to prove that they are selling a valid product.

There is a web site that displays results of a program for face detection, i.e. locating the face or faces in an image, a necessary step before attempting face recognition. The results are anything but impressive. Apparently the program uses the heuristic that a face is a light round area with dark spots (for eyes, nose and mouth), which causes it to miss faces that are dark and to pick up irrelevant areas. Wearing glasses seems to cause problems because it interferes with the eyes heuristic. Figure 4 shows a blatant case of erroneous results.

Figure 4: Results of the Robotics Institute, CMU program. A green rectangle is overlaid on any face detected. A major miss is evident.

Question: How did I manage to get my picture with bin Laden? (Hint: the answer is one word.)

Why the Failures? The Scalability Issue

Rather than try to match the pictures of the faces themselves, computer programs select features and then adjust the parameters of ("train") a classification program so that faces are placed in the right class. A major problem with this method is that of scalability. Suppose we selected features and "trained" a classifier on 1000 samples from a population. What is the probability that this classifier will perform well for the whole population? We may have found a separator for two classes for the 1000 samples, but how valid is the separator for, say, ten classes and a million samples? Several systems that fared quite well in the laboratory have failed in practice for that reason.
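The scalability concern can be illustrated with a toy simulation. The sketch below is not a model of any real face recognition system: the two-dimensional Gaussian "features", the noise level, the population sizes, and the nearest-neighbor matching rule are all assumptions chosen only to show how a fixed feature set that separates a handful of people can fail once the enrolled population grows.

```python
import random

def make_gallery(rng, n_people, dims=2):
    """One synthetic feature vector per enrolled person (hypothetical features)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dims)] for _ in range(n_people)]

def nearest(gallery, probe):
    """Index of the gallery entry closest to the probe (squared Euclidean distance)."""
    return min(range(len(gallery)),
               key=lambda i: sum((g - p) ** 2 for g, p in zip(gallery[i], probe)))

def accuracy(n_people, n_probes=200, noise=0.1, seed=0):
    """Fraction of noisy probes matched to the correct enrolled person."""
    rng = random.Random(seed)
    gallery = make_gallery(rng, n_people)
    hits = 0
    for _ in range(n_probes):
        who = rng.randrange(n_people)
        # A new picture of the same person: same features plus measurement noise.
        probe = [f + rng.gauss(0.0, noise) for f in gallery[who]]
        if nearest(gallery, probe) == who:
            hits += 1
    return hits / n_probes

small = accuracy(10)      # a "laboratory" population
large = accuracy(1000)    # a larger population, same features and same noise

print("accuracy with   10 people:", small)
print("accuracy with 1000 people:", large)
```

Nothing about the features or the matching rule changes between the two runs; only the number of enrolled people does. As the feature space fills up, neighboring people fall within each other's measurement noise and the match rate drops.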

One example is "VeggieVision", developed by IBM. The idea was to recognize vegetables (which are not labeled with bar codes) at checkout counters. It was relatively easy to build a system that could tell tomatoes apart from cucumbers or apples from oranges. It is far more difficult, though, to distinguish two varieties of oranges from each other, to say nothing of distinguishing organically grown tomatoes from conventional ones. Supposedly the system was tested in Australia about six years ago and nothing has been heard about it since then.

Scalability is a key issue in face recognition. There have been several research projects with results that demonstrated successful face recognition. However, the population samples in such projects were relatively small, usually the members of a research laboratory and their friends. The samples were also diverse: they included men and women of different races with different hairstyles and, for men, different amounts of facial hair. I have never seen a study where all the subjects share the same major characteristics, for example, white blond men between the ages of 20 and 30 with long hair and beards. In addition, the subjects in such studies were cooperative. Expanding the method to a large and uncooperative population appears daunting.

Last update 3/10/07
