Catalogs and multiple versions of same photo
As of this writing, one of the most neglected functions in photo cataloging software is the ability to handle multiple versions of the same image. Ignored for too long, this gap can become a serious frustration. Fortunately, it is widely expected that this functionality will be incorporated into the next releases of most decent photo catalog applications.
So, what is the problem?
When one uses a keyword tagging system (which is the heart of most photo catalog programs), a basic premise is that each photo in the catalog can be assigned its own unique set of tags. Furthermore, most catalog programs treat each photo file on the hard drive as a separate entity in the catalog. When one edits an image, a fundamental rule of digital photography is to preserve the original and create a copy of it for editing.
In the days of film, you wouldn't draw on the negative, would you? Okay, so sometimes we did, but this was generally a last resort. One mistake and the original is gone. The same applies for digital photos. As it is so easy to modify a digital photo, it follows that it is also exceptionally easy to make a mistake in editing that ruins the "digital original" (if one didn't duplicate the original first).
Given that we always duplicate the original before editing, we end up with two or more copies of the same basic photo. As these edits are generally done outside of the cataloging program (except in applications like Photoshop Album 2.0 or Photoshop Elements 3), the catalog program has no way of telling that the newly-generated file in a folder is actually based on an image that is already in the database. Result? The catalog program assumes that the edited version is a completely new photo, and keywords assigned to the original will not be assigned to the edited version.
Comparison of catalog software & versioning
See the comparison chart between catalog programs. It compares a number of important features, and includes a section on how the versioning approach is handled in each application that supports it.
Summary of the multiple version problem
As the keywords that were assigned to the original photo are not copied automatically to the edited version(s), one ends up with an improperly-tagged database. Some photos will have all tags assigned, while others will be missing them. In a large image database, an untagged photo might as well not exist. Without the keys to retrieve the photo, the only option left is to manually wade through the file hierarchy and hope that you spot it.
Even if you manually copy the tags from the original to the newly-imported edited version, you run the serious risk of having an out-of-date set of tags. Someday you might add or remove a tag from the edited version, but forget to apply the same change to the original. Similarly, one might modify the original version's tags but not the edited version. In both cases, the tagging will be out-of-date on one of the files.
When it comes to version control, there are three problems to solve:
- Identifying multiple versions and associating them with the original.
- Keeping tags current between versions.
- Displaying multiple versions in the browser.
Problem 1: Identifying multiple versions
The first problem with versioning is: how does the catalog program know that multiple images on the drive are related? If the software is unaware that several files on the drive are associated with each other, then there is also no way that tags can be maintained between these multiple versions (problem #2 below).
Some catalog programs today attempt to solve this issue, and various strategies are in use. The following is a list of the usual approaches performed by the catalog program, with the best methods at the top:
- Invokes the creation of new versions, performing the duplicate of the original, if necessary.
The catalog program is aware of the creation of additional versions because it is used to invoke the editor itself. It may also be responsible for duplicating the original photo first, and then sending the copy to an external editor. The catalog program can then hide the unedited version and replace it with the edited copy, for example. Other tools can simply invoke a script that will accomplish the same thing (i.e., select a thumbnail, select the script command, and the duplicate is automatically created in the filesystem, along with another entity in the database). One potential limitation of this approach is that it is very easy to circumvent by copying and pasting into new documents within the external editor. Such a scenario would defeat the automated tracking that invocation provides. Fortunately, most catalog programs will also provide one of the other methods described below as a backup, so that these files will not remain catalog orphans.
- Monitors the file system for changes in folders and imports these into database.
The user enables a mode in the catalog program that will start monitoring the file system for changes. If it detects that a file (under control by the catalog) has changed, it will automatically re-read the image and update the metadata along with the thumbnail. If a new file is detected in a controlled folder, then that image is added to the database. See the related feature under Problem #2: Matching of New Files.
- Performs the actual edits and therefore creates the additional file versions.
If the catalog program itself performs the edits, then one can expect that it will maintain some degree of association to the original (either by name or by a link within the database). The reason I feel that this is not necessarily a good approach is that a program that does a good job at cataloging shouldn't also try to be a good editor. The editing functionality within most catalog programs is limited at best, and it is often best to leave this job to an external editor, such as Photoshop. The only exception to this I see is Adobe's Photoshop Elements 3. As Adobe produces an excellent editor, and has now effectively married it to the catalog program, it is in the unique position to offer complete integration. I don't expect any other company to offer an equivalent two-in-one package. However, looking at the issue from the common-user perspective, the basic editing functionality within these catalog programs may be sufficient to cover 90% of their needs (e.g., exposure, cropping, rotating, sharpening), thereby removing the need for a further tool step.
- No internal support for the creation of versions or new-file detection.
This is the most common scenario for catalog programs, and will likely remain so until versioning becomes an important selling point that differentiates products. The onus is on the user to duplicate the original source, edit the file with an external application, and then locate the new file manually within the catalog program. Of course, it is fairly easy to forget the re-import step, and be left with many files that are essentially orphaned by the database (there is no connection to them, and they are effectively lost).
Orphaned images: photos "lost" until rescan
A very big problem with relying on a catalog program to handle all of your future photo searching (versus searching through folders manually) is that if the files aren't known to the database program, they are effectively lost. As the collection grows, it becomes less likely that one will manually locate these orphaned images, and they will hide in the folders unnoticed. Fortunately, a number of catalog programs offer some way of re-scanning the controlled folders for files that may have been added or modified. Occasional re-scanning may be necessary if your catalog program falls into the last category described above.
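The re-scan described above can be sketched roughly as follows. This is only an illustration in Python, not how any particular catalog program works: the function name `find_orphans`, the list of image extensions, and the idea that the catalog can hand over a plain list of known file paths are all assumptions made for the example.

```python
import os

# Assumed set of extensions that count as "photos" for this sketch.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".tif", ".tiff", ".png", ".psd"}


def find_orphans(controlled_folder, cataloged_paths):
    """Return image files under controlled_folder that the catalog does not know.

    cataloged_paths is a hypothetical export of every file path already
    stored in the catalog database.
    """
    # Normalize paths so the same file is not missed because of
    # relative paths or (on Windows) differences in letter case.
    known = {os.path.normcase(os.path.abspath(p)) for p in cataloged_paths}
    orphans = []
    for dirpath, _dirnames, filenames in os.walk(controlled_folder):
        for name in filenames:
            if os.path.splitext(name)[1].lower() not in IMAGE_EXTENSIONS:
                continue  # skip non-image files
            full = os.path.normcase(os.path.abspath(os.path.join(dirpath, name)))
            if full not in known:
                orphans.append(full)
    return sorted(orphans)
```

Running such a scan occasionally and importing whatever it reports is the manual equivalent of the folder-monitoring mode described earlier.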
Problem 2: Keeping tags current between multiple versions
Depending on the existing support for identifying newly created versions of an image, the catalog program will also have varying levels of support for the maintenance of the tags between these versions. As described at the top of this page, the maintenance of tags between versions is a critical feature for many users. Having some degree of support from the catalog program is essential. The following lists some of the various approaches offered by catalog programs today:
- Native support for versions.
The catalog program maintains an integrated link (i.e., not through user scripting) between all versions of the same photo. As it is able to quickly find related images (through internal indices, for example), tagging is generally associated with a single asset (e.g., a photo), and not with individual variants of an asset. Adding a tag to a photo in the database should be reflected in all versions of the same photo, automatically. In such an approach, the program should provide the ability to define both common tags and individual tags (see the description below). Programs that fall into this category will probably use an efficient user interface mechanism to display the associations (see Image Stacks below, under Problem #3).
- New files are detected and automatically matched with originals.
This is not a common approach, but if implemented, it could save one a fair amount of work by automating a step in the workflow. A catalog program should be able to recognize the addition of new files and have a means of locating the original. This could be through naming conventions, metadata comparisons or user input. As the program now has a link between the versions, it can then keep the tags together. Unfortunately, this doesn't appear to be done by many applications currently available. Programs monitor folders for changes, but they typically aren't smart enough to locate the original. In all likelihood, a vendor willing to go to the extent of providing this feature would more likely do it properly and support image stacks natively (see above).
- Scripts are used to transfer tags between original and edited versions.
If the scripting environment is extensive enough, it is a fairly easy task to write a script that will keep tags between the originals and edited versions up-to-date.
- User has to manually maintain tags between versions.
This is by far the most common method, and is the only option if Problem #1 reveals that the software does not support the invocation / detection strategies listed there. Most catalog programs available today rely on the user to do the version support and maintenance, and unfortunately it is a tedious and error-prone task. Beware if your program fits into this category, as you might find it to be a lot more work than you could expect.
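To make the second approach in this list more concrete, here is a minimal sketch of matching a newly detected file to its original by naming convention. The convention itself is an assumption made for the example (an edited file keeps the original's base name and appends a suffix after "_" or "-", such as IMG_0001_edit.jpg), and `find_original` is a hypothetical helper, not a function from any catalog program.

```python
import os

# Assumed separators between the original's base name and the edit suffix.
SEPARATORS = ("_", "-")


def find_original(new_file, cataloged_files):
    """Return the cataloged file that new_file was most likely derived from.

    Picks the cataloged file with the longest base name that is a prefix
    of the new file's base name and is followed by a separator.
    Returns None when nothing matches.
    """
    new_stem = os.path.splitext(os.path.basename(new_file))[0].lower()
    best, best_len = None, 0
    for candidate in cataloged_files:
        stem = os.path.splitext(os.path.basename(candidate))[0].lower()
        # Only consider stems longer than the current best match
        # and strictly shorter than the new file's stem.
        if len(stem) <= best_len or len(stem) >= len(new_stem):
            continue
        if new_stem.startswith(stem) and new_stem[len(stem)] in SEPARATORS:
            best, best_len = candidate, len(stem)
    return best
```

A name-based match like this is only a guess, so a real program would probably confirm it with a metadata comparison or by asking the user, as described above.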
For those applications that support multiple versions natively, it is important that one have the ability to define common and individual tags. By common tags I mean that applying a tag such as People:Fred to an asset (which includes all versions spawned from the same photo) will copy this tag to all versions of the same photo. By individual tags I mean that each version based on the original can have its own independent tags that are not copied to the other versions. An example of this might be a state tag such as Ratings:Excellent, or State:Sharpened. As an example, my Manage Versions script supports the notion of common and individual tags through defining a list of tags that are not transferred when maintaining tags between versions.
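The distinction between common and individual tags can be sketched in a few lines of Python. This is not the Manage Versions script itself, only an illustration of the idea under stated assumptions: tags are plain strings, the tags of each version are held in a set, and the tags that must not be transferred are identified by the hypothetical prefix list `INDIVIDUAL_PREFIXES`.

```python
# Assumed list of tag categories that stay with a single version.
INDIVIDUAL_PREFIXES = ("Ratings:", "State:")


def is_individual(tag):
    """True if the tag belongs to one version only and must not be copied."""
    return tag.startswith(INDIVIDUAL_PREFIXES)


def sync_common_tags(versions):
    """Give every version the union of all common tags.

    versions maps a file name to the set of tags on that file.
    Individual tags stay where they are; a new dict is returned.
    """
    common = set()
    for tags in versions.values():
        common |= {t for t in tags if not is_individual(t)}
    return {
        name: common | {t for t in tags if is_individual(t)}
        for name, tags in versions.items()
    }


versions = {
    "IMG_0001.jpg": {"People:Fred", "Ratings:Excellent"},
    "IMG_0001_edit.jpg": {"State:Sharpened"},
}
synced = sync_common_tags(versions)
# The edited version now also carries People:Fred, while
# Ratings:Excellent and State:Sharpened stay on their own files.
```

Note that a simple union like this only propagates added tags. Propagating the removal of a tag would require knowing the previous state of each version, which this sketch does not track.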
Problem 3: How to display multiple versions
Assuming that the catalog program inherently supports multiple versions, the last item to deal with is how the information is conveyed to the user. The following lists the usual approaches:
- Image stacks / versions
Only a single thumbnail is displayed in the catalog, representing all versions of the same base photo. An indicator in the thumbnail view will show that multiple photos exist under the main thumbnail. Opening this will reveal all photos that have been associated with this image stack. A good use for this is keeping all of the edited versions along with the original, in which case you will probably keep the best edited version at the top of the stack.
- Individual images, no visual association
Unless the program offers native version support, this is the most likely scenario the user will face. All individual versions are displayed in the thumbnail window. Some indication might be given so that multiple versions of the same photo can be easily identified (color coding or sort order). Obviously, this is much less desirable than a program that supports image stacks (or equivalent) in the user interface.
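As a rough illustration of the image stack idea from the list above, the following Python sketch models a stack as an ordered list of versions whose first entry is the one shown in the browser. The class `ImageStack` and the function `browser_view` are hypothetical names invented for this example and do not correspond to any particular product.

```python
from dataclasses import dataclass


@dataclass
class ImageStack:
    """All versions of one base photo; versions[0] is the top of the stack."""
    versions: list

    def promote(self, name):
        """Move the given version to the top of the stack."""
        self.versions.remove(name)
        self.versions.insert(0, name)

    @property
    def top(self):
        return self.versions[0]


def browser_view(stacks):
    """One thumbnail per stack: (file shown, number of versions in the stack)."""
    return [(stack.top, len(stack.versions)) for stack in stacks]


stack = ImageStack(["IMG_0001.jpg", "IMG_0001_edit.jpg"])
stack.promote("IMG_0001_edit.jpg")  # keep the best edit on top
view = browser_view([stack, ImageStack(["IMG_0002.jpg"])])
```

The version count returned for each thumbnail plays the role of the indicator that tells the user more photos are hidden under it.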