A Challenge to CBIR Techniques

Theo Pavlidis
© Copyright 2008

I argue that CBIR techniques that do not include object recognition are inherently inadequate for content-based image retrieval. Of the three pictures below, the one in the middle (B) is a brightened version of the picture on the left (A), and the picture on the right (C) is a landscape from Cappadocia. Even a visual inspection suggests that the colors and histograms of B may be closer to those of C than to those of A. Indeed, I tried several feature sets and B always comes out closer to C than to A. (I first came across the example when I searched my personal collection for the pictures most similar to B and C came out as the closest.) Yet I claim that for a human observer B is closer to A than to C. After all, both A and B represent exactly the same scene.
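The effect described above can be reproduced on toy data. The following is a minimal sketch, not the feature sets I actually tried: the pixel lists are synthetic stand-ins for A, B, and C, and the function names, the four-bin grey-level histogram, and the L1 distance are all assumptions made for illustration.

```python
def histogram(pixels, bins=4, max_value=256):
    """Normalized histogram of grey levels in the range [0, max_value)."""
    counts = [0] * bins
    width = max_value / bins
    for p in pixels:
        counts[min(int(p / width), bins - 1)] += 1
    return [c / len(pixels) for c in counts]


def l1_distance(h1, h2):
    """Sum of absolute bin differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def brighten(pixels, offset, max_value=256):
    """Add a constant to every pixel, clipping at the top of the range."""
    return [min(p + offset, max_value - 1) for p in pixels]


# Synthetic stand-ins (hypothetical data, not the pictures on this page).
dark_scene = [20, 30, 40, 50, 60, 70, 80, 90]            # role of A
bright_copy = brighten(dark_scene, 128)                   # role of B
other_scene = [150, 160, 170, 180, 190, 200, 210, 220]    # role of C

d_ab = l1_distance(histogram(dark_scene), histogram(bright_copy))
d_bc = l1_distance(histogram(bright_copy), histogram(other_scene))
print("distance A-B:", d_ab)
print("distance B-C:", d_bc)
```

In this toy case the brightened copy shares no histogram bins with its own original, while an unrelated bright scene lands in the same bins, so the histogram distance ranks B closer to C than to A.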

A. Picture of our late dog Lucy
B. Brightened version of A
C. Cappadocian landscape

First Part: Could someone come up with an image similarity measure based on "global" pixel statistics (color histogram, texture histogram, structure histogram, etc.) that would give a classification similar to that given by a human? (There are several trivial ways to come up with a measure that would work correctly for these three particular pictures. For example, A and B have the same size but C has a different size. To avoid trivial solutions I will also provide another triplet to test the answer.)

Note: By "global" statistics I mean those taken over the whole image or large pre-defined regions (this term also includes transforms such as wavelets).
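To make the note concrete, here is a small sketch of statistics taken over large pre-defined regions. The fixed 2x2 grid, the choice of mean and standard deviation as the statistics, and the name `block_features` are assumptions chosen for illustration; the point is only that the regions are fixed in advance and do not depend on what the image shows.

```python
from statistics import mean, pstdev


def block_features(image, rows=2, cols=2):
    """Mean and standard deviation of grey levels in each cell of a fixed grid.

    `image` is a list of equal-length rows of grey values.
    """
    height, width = len(image), len(image[0])
    features = []
    for r in range(rows):
        for c in range(cols):
            block = [
                image[y][x]
                for y in range(r * height // rows, (r + 1) * height // rows)
                for x in range(c * width // cols, (c + 1) * width // cols)
            ]
            features.extend([mean(block), pstdev(block)])
    return features


# Hypothetical 4x4 grey-level image.
image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [50, 60, 90, 90],
    [70, 80, 90, 90],
]
features = block_features(image)
print(features)
```

Two images would then be compared by a distance between their feature vectors. Setting `rows=1, cols=1` reduces this to statistics over the whole image.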

Second Part: Obviously a segmentation by luminance or color and a comparison of the resulting region adjacency graphs is likely to give a result (at least in this example) in agreement with human intuition. But could such a method also match image A to images D and E below? (It turns out that methods based on local features can meet the first part of the challenge but still fail to meet the second. See details.)
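A rough sketch of the segmentation idea follows. It rests on several assumptions: thresholding at the image's own mean grey level stands in for "segmentation by luminance or color", regions are 4-connected, and the toy image and all function names are hypothetical. Comparing the two graphs by plain equality works here only because both images are labeled in the same scan order; a general comparison would need some form of graph matching.

```python
from collections import deque


def segment(image):
    """Label 4-connected regions of pixels on the same side of the image mean."""
    height, width = len(image), len(image[0])
    threshold = sum(map(sum, image)) / (height * width)
    labels = [[None] * width for _ in range(height)]
    next_label = 0
    for y0 in range(height):
        for x0 in range(width):
            if labels[y0][x0] is not None:
                continue
            side = image[y0][x0] > threshold
            labels[y0][x0] = next_label
            queue = deque([(y0, x0)])
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and labels[ny][nx] is None
                            and (image[ny][nx] > threshold) == side):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels


def adjacency_graph(labels):
    """Set of unordered label pairs for regions that share a pixel border."""
    height, width = len(labels), len(labels[0])
    edges = set()
    for y in range(height):
        for x in range(width):
            for ny, nx in ((y + 1, x), (y, x + 1)):
                if ny < height and nx < width and labels[ny][nx] != labels[y][x]:
                    edges.add(frozenset((labels[y][x], labels[ny][nx])))
    return edges


# Hypothetical scene: a bright ring on a dark background with a dark center.
scene = [
    [20, 20, 20, 20, 20],
    [20, 90, 90, 90, 20],
    [20, 90, 20, 90, 20],
    [20, 90, 90, 90, 20],
    [20, 20, 20, 20, 20],
]
brightened = [[min(p + 100, 255) for p in row] for row in scene]

graph_a = adjacency_graph(segment(scene))
graph_b = adjacency_graph(segment(brightened))
print(graph_a == graph_b)
```

Because the threshold moves with the image's own mean, a uniform brightening that does not clip leaves the regions and their adjacencies unchanged. Nothing in such a graph, however, ties the scene to a different view of the same subject, which is what the second part asks for.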

D. Lucy on the beach
E. Another shot of Lucy on the beach

An approach that could do the task: A real object recognition program (one that could identify pictures A, B, D, and E as containing a "dog") would easily meet the challenge. Of course, writing such a robust program is itself a major challenge.

Picture Downloading

You can download each picture in JPEG format by right-clicking on it and then selecting "Save As...".

Or you can download the pictures in PPM format by clicking on the following links:
FigureA FigureB FigureC FigureD FigureE


Original posting: January 2008 - Last update: June 2, 2008

 