Archived posting to the Leica Users Group, 2006/03/31

Subject: [Leica] Re:Photo comparison software
From: lrzeitlin at optonline.net (lrzeitlin@optonline.net)
Date: Fri Mar 31 19:33:05 2006
References: <200604010225.k312P6wo018704@server1.waverley.reid.org>

<<Date: Fri, 31 Mar 2006 15:53:09 -0800
From: Brian Reid <reid@mejac.palo-alto.ca.us>
Subject: Re: [Leica] Re: Photo comparison software
To: Leica Users Group <lug@leica-users.org>
Message-ID: <10989B25DFF39D348E4B0EE5@scarborough.isc.org>
Content-Type: text/plain; charset=us-ascii; format=flowed
This is a variation of a classic computer science problem. It's hard, and 
there's no software outside international government spy agencies that can 
do it.
The only way to make it tractable is to define a classification scheme for 
the pictures and sort them into similarity groups. The scheme doesn't 
matter; you can do it by color, by whether or not it contains a chimney, by 
whether or not it contains a person, or how much of the paint is peeling.
Once you've broken down the "several thousand" pictures into clusters 
(groups whose contents are similar according to your primary criterion) then 
pick one of those clusters and repeat the process. If the cluster consists 
of all photographs that have a chimney at the left side or all photographs 
that show shark teeth, find sub-categories to allow you to further divide 
the clusters into sub-clusters.
Keep doing this until you get groups that have under about 50 pictures in 
them. Then compare by hand; the sub-sub-clusters will be small enough that 
you won't have any trouble finding similar pictures.
I've done this 3 or 4 times in my life; this process works.>>
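
The divide-and-repeat procedure quoted above can be sketched roughly as 
follows. This is only an illustration: the function name subdivide, the 
criteria list, and the max_size parameter are all made up for the example, 
and each criterion stands in for a human judgment (has a chimney, contains a 
person, and so on) that would have to be recorded by hand first.

```python
def subdivide(photos, criteria, max_size=50):
    """Split photos by successive criteria until each group is small.

    photos   -- list of photo records (any objects)
    criteria -- list of functions, each mapping a photo to a category label
                (hypothetical; in practice these are tags assigned by eye)
    max_size -- stop splitting once a group has this many photos or fewer
    """
    # Small enough to compare by hand, or no criteria left to split on.
    if len(photos) <= max_size or not criteria:
        return [photos]

    key, remaining = criteria[0], criteria[1:]

    # Sort the photos into clusters according to the current criterion.
    clusters = {}
    for photo in photos:
        clusters.setdefault(key(photo), []).append(photo)

    # Repeat the process inside each cluster with the next criterion.
    groups = []
    for members in clusters.values():
        groups.extend(subdivide(members, remaining, max_size))
    return groups
```

Each returned group is then small enough to compare by eye. If the criteria 
run out before a group is small enough, the sketch returns that group as it 
is, which is the signal to invent another sub-category.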

Brian,

I'm sure you are right but I'd hoped there was an easier way. There are over 
6,000 individual photographs and it will take me months to classify them and 
then sort through the individual groups. I've conceptualized an easier way 
but my programming skills aren't good enough to implement it. Here is the 
conceptual scheme:

1. Copy and standardize the pictures as screen sized gray scale images. This 
can be done fairly easily by batch processing in GraphicConverter.
2. Divide each image into about 2500 cells, say a 50 x 50 matrix. Compute 
the average brightness of each cell. This might be easier using only a 
two-level (1-bit, black and white) scale. The brightness calculation could 
then be made by simply counting the number of black pixels in each cell.
3. Perform a product moment correlation between each image and all the other 
images using the digitized cell scores. I know that this means over 18 
million correlations, but what the hell. It's a computer and it can work all 
weekend without complaining.
4. Identify pairs of photos where the correlation is higher than some 
arbitrary cutoff, say greater than 0.9. At that level the two sets of cell 
scores share about 81% of their variance (0.9 squared), so the pictures are 
probably the same or at least quite similar.
5. Visually compare the originals of the high correlation images to see if 
they are indeed identical.
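
A minimal sketch of steps 2 through 4 in Python, using only the standard 
library. It assumes each picture has already been standardized (step 1) and 
loaded as a 2-D list of gray values; the loading itself is not shown. The 
function names cell_means, pearson, and similar_pairs are invented for this 
illustration, and the 0.9 cutoff is the arbitrary one suggested above.

```python
from itertools import combinations
from math import sqrt


def cell_means(image, grid=50):
    """Step 2: reduce a 2-D list of gray values to grid*grid cell averages.

    Assumes the image is at least `grid` pixels in each dimension.
    """
    height, width = len(image), len(image[0])
    scores = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            cell = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            scores.append(sum(cell) / len(cell))
    return scores


def pearson(a, b):
    """Step 3: product-moment correlation between two score lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a perfectly flat image has no defined correlation
    return cov / sqrt(var_a * var_b)


def similar_pairs(images, cutoff=0.9, grid=50):
    """Step 4: return (name_a, name_b, r) for every pair with r > cutoff.

    images -- dict mapping a file name to its 2-D list of gray values
    """
    scores = {name: cell_means(img, grid) for name, img in images.items()}
    hits = []
    for (name_a, s_a), (name_b, s_b) in combinations(scores.items(), 2):
        r = pearson(s_a, s_b)
        if r > cutoff:
            hits.append((name_a, name_b, r))
    # Highest correlations first, ready for the visual check in step 5.
    return sorted(hits, key=lambda hit: -hit[2])
```

For 6,000 images the pair loop runs 6000 x 5999 / 2 = 17,997,000 times, in 
line with the roughly 18 million correlations estimated above. Pure Python 
will be slow at that scale; an array library would be the obvious thing to 
swap in, but the logic stays the same.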

If anyone has any ideas on how to improve the process, please let me know. 
I'll try and get some of my expert programming colleagues to see if they can 
make it work. Any idea will be appreciated.

I'm sure that many of the LUG members have a related problem in trying to 
organize their digital shoe box of image files. I know that I have a couple 
of thousand unclassified photos on my Mac right now that I've always 
promised myself I would sort through eventually.

Larry Z



Replies: Reply from pdzwig at summaventures.com (Peter Dzwig) ([Leica] Re:Photo comparison software)