Image Cluster copies and renames images based on Exif data and camera file numbers. It also clusters those images into directories using a configurable sliding time window (36 hours by default), which makes it easy to group images by event without manual intervention.
The inspiration for this program came from recently getting a new Canon SD500 camera to replace the Canon S30 that I'd had for years. The upside: the Canon SD500 rocks! The downside: I now have 2 cameras burning through the same sequence numbers, so my previous solution of just putting all the files into directories by the first 2 digits of the sequence number was no longer going to work.
Image Cluster solves this problem, plus another grouping problem that I'd been thinking about, by extracting the CreateDate and FileNumber Exif tags from the images and using them as the basis of a new filename (typically YYYY:mm:dd_HH:MM:SS_FileNumber.jpg). This ensures that 2 images taken in the same second have an even smaller chance of colliding, as their camera sequence numbers would also have to be the same.
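The naming scheme above can be sketched in a few lines of Python. This is only an illustration, not the program's actual code: the function name `build_filename` is made up, the Exif values are assumed to have been read already (the creation time as a `datetime`, the file number as an opaque string), and it assumes the separator is applied to both the date and the time fields.

```python
from datetime import datetime


def build_filename(create_date, file_number, sep=":"):
    """Build a name of the form YYYY:mm:dd_HH:MM:SS_FileNumber.jpg.

    Hypothetical helper: `create_date` stands in for the CreateDate tag,
    `file_number` for the FileNumber tag, `sep` for the separator character.
    """
    # Assumption: the same separator is used inside the date and the time.
    date_part = create_date.strftime(f"%Y{sep}%m{sep}%d")
    time_part = create_date.strftime(f"%H{sep}%M{sep}%S")
    return f"{date_part}_{time_part}_{file_number}.jpg"


if __name__ == "__main__":
    taken = datetime(2005, 10, 9, 14, 30, 5)
    print(build_filename(taken, "1234"))
    print(build_filename(taken, "1234", sep="-"))
```

Two images only collide under this scheme if both the timestamp and the file number match, which is the property the paragraph above relies on.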
But that is just the first step. I have noticed that I am an occasional photographer, so I take pictures in bursts, often for a weekend of hanging out with folks, though sometimes for a vacation as well. This got me thinking. What I really needed was a tool that also creates directories, clustering images whose CreateDate values fall within some tolerance of each other. For me, the optimum value seems to be 36 hours, though this is configurable via the command line.
This took me an afternoon to pull together. I'm sure it could be smarter, but it is useful enough to post for others to use.
Set the target directory for images. Defaults to /tmp/photos, which is probably not what you want.
Dry run. Tells you what the program would have done, without doing it.
Prints the help message.
Separator character. It defaults to : (e.g. 2005:10:09...), but is user configurable because my friend Clemens wants to use - (e.g. 2005-10-09) instead.
Set the tolerance for image clustering. This is the maximum time allowed between consecutive pictures in a cluster; a larger gap causes a new cluster to be created. The name of the cluster will be YYYY:MM:DD of the first image in the cluster, even if it spans multiple days. Because the tolerance applies to the gap between neighboring images rather than to the cluster as a whole, it is possible that all images you have ever taken could end up in 1 cluster, if you took a picture every day of your life. Hence, this feature isn't useful to everyone. If you are that kind of person, set the tolerance to 16 hours or something, and you'll tend to get 1 day sized buckets.
Prints verbose output.
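To make the tolerance setting more concrete, here is a rough Python sketch of gap-based clustering as described above. It is not the program's own implementation: `cluster_by_gap` is a hypothetical name, the input is assumed to be a list of `datetime` values already extracted from CreateDate, and a gap exactly equal to the tolerance is assumed to stay in the same cluster.

```python
from datetime import datetime, timedelta


def cluster_by_gap(timestamps, tolerance=timedelta(hours=36), sep=":"):
    """Group timestamps into clusters split wherever the gap exceeds tolerance.

    Returns a list of (cluster_name, [timestamps]) pairs, where the name is
    the date of the first image in the cluster (YYYY:MM:DD by default).
    """
    clusters = []
    previous = None
    for ts in sorted(timestamps):
        # Start a new cluster on the first image, or when the gap since the
        # previous image is larger than the tolerance.
        if previous is None or ts - previous > tolerance:
            clusters.append((ts.strftime(f"%Y{sep}%m{sep}%d"), []))
        clusters[-1][1].append(ts)
        previous = ts
    return clusters


if __name__ == "__main__":
    shots = [
        datetime(2005, 10, 7, 18, 0),
        datetime(2005, 10, 8, 10, 0),
        datetime(2005, 10, 9, 20, 0),   # 34 hours after the previous shot
        datetime(2005, 10, 12, 9, 0),   # 61 hours later: new cluster
    ]
    for name, members in cluster_by_gap(shots):
        print(name, len(members))
```

Because only the gap between neighboring images is checked, a long run of daily photos chains into a single cluster no matter how many days it spans, which is why a smaller tolerance such as 16 hours tends to give day-sized buckets.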