Phase 1: Scanning Raw Images
The first step in digitizing a photo is, of course, to scan it. There is a bit of preparation to be done beforehand. The scanning software needs to know what kind of scanning you plan to do. For this project I’d be scanning both color and black-and-white photographs. Since scanning can be done at a wide range of resolutions, I had to establish a quality benchmark that I could live with. There is a tradeoff to consider: ideally you want the pictures at the highest possible resolution, but in practice disk storage and network bandwidth become very real constraints. Sometimes the quality of the source prints, or even the scanner itself, is the limiting factor.
Originally, my quality benchmark was to make the scans come out with the same resolution and quality as the pictures my digital camera produces. My first digital camera, a Sony Mavica, was capable of producing a 3.3 megapixel image. At its finest quality, it saves a color photograph at 2048 pixels wide x 1536 pixels high, which results in a JPEG image file of about 1.3MB. In the beginning, that was the target.

Karen Wells, my mother. 1968, age 23.
But when my camera came out in 2002, 3.3 megapixel was the very top end of image resolution for the prosumer digital camera world. These days, it’s more like 8-10 megapixel or more. In 2004 I ran the numbers on 8 megapixel images, and I decided my computer was inadequate for dealing with thousands of images of that size. So I selected a 5 megapixel benchmark as a compromise. 5 megapixel implies an image that is 2592 x 1944, resulting in a 1.5MB JPEG file. That’s not bad. But then in 2006 I bought a Canon Rebel XTi, my first digital SLR, which was capable of producing 10 megapixel images, 3888 x 2592, ~3.5MB. That was the final benchmark. I wanted my scanner to produce images as close to this as possible.
Note: I recognize that this only approaches or perhaps meets the image resolution 35mm is capable of, effectively speaking. So why am I willing to compromise? Because it’s pragmatic to do so. Most of these pictures are family-related, and so the resolution isn’t as important as it would be for purely artistic photography. Also, many of the photos aren’t even 35mm, or even color for that matter. Better to accept the limits of the technology now than to wait the years it would take for multiple terabyte disk drives, next generation DVD discs, gigabit wireless Ethernet, and 24″ 300 dpi monitors to reach non-insane price points.
As it turns out, I never owned a scanner that could reach that last benchmark anyway. Trial and error with flatbed scanners has taught me that for best results, a typical 4″ x 6″ 35mm print should be scanned at 600 DPI (dots per inch) or so. This will reach the 5 megapixel benchmark, which was more than adequate for the quality of most of these snapshots. Adjust this number higher for small pictures (like a wallet print) or lower for large pictures (like an 8″ x 10″). You can scan at much higher resolutions if you want, provided you have lots of disk space and memory. Lastly, the descreening option should be set for ‘photo’ or ‘magazine.’
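For anyone who wants to check the arithmetic behind those DPI numbers, here is a minimal Python sketch. It is not part of any scanning software; the function names are made up for illustration, and the 3.5″ x 2.5″ wallet print size is an assumption about what a typical wallet print measures.

```python
import math

# The 5 megapixel benchmark from this article, in total pixels.
FIVE_MP = 2592 * 1944


def scan_pixels(width_in, height_in, dpi):
    """Pixel dimensions of a scan of a print measured in inches."""
    return round(width_in * dpi), round(height_in * dpi)


def megapixels(width_px, height_px):
    """Total pixel count expressed in millions of pixels."""
    return width_px * height_px / 1_000_000


def dpi_for_target(width_in, height_in, target_px):
    """Smallest whole DPI whose scan has at least target_px pixels.

    Total pixels = (width_in * dpi) * (height_in * dpi), so solve for dpi.
    """
    return math.ceil(math.sqrt(target_px / (width_in * height_in)))


if __name__ == "__main__":
    w, h = scan_pixels(6, 4, 600)
    print(f"6 x 4 in at 600 DPI: {w} x {h} px, {megapixels(w, h):.2f} MP")

    # Print sizes in inches; the wallet size is an assumed typical value.
    prints = {"wallet": (3.5, 2.5), "4 x 6": (6, 4), "8 x 10": (10, 8)}
    for name, (pw, ph) in prints.items():
        print(f"{name}: about {dpi_for_target(pw, ph, FIVE_MP)} DPI for 5 MP")
```

A 4″ x 6″ print at 600 DPI works out to 3600 x 2400 pixels, or 8.64 megapixels, so 600 DPI clears the 5 megapixel benchmark with room to spare but still falls short of the 10 megapixel one. The loop also shows why small prints need a higher DPI setting and large prints a lower one to land on the same pixel count.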

Dave Wells, my brother. From the late 60s.
When using a flatbed scanner, which is the only practical choice in this setting, the scanning process is very cumbersome. It goes like this:
- First, place the photo on the glass window of the scanner.
- Next, do a preview scan. This is a lot quicker than a full scan, and helps you be sure that the pictures aren’t crooked.
- Then select the region of the preview that you actually want to scan, so the scanner captures only the area where you placed the picture, not the entire scanner bed.
- Finally, initiate the scan. The scanning software saves the resulting image to a file on your hard drive.
As a practice I try to scan multiple photos into a single file. Naturally, if you can scan three pictures in one pass, you’ve tripled your productivity here. But this is practical only if the pictures are part of the same group, like an event, because the filename of the scanned image can only convey so much information. (I’ll get into this more in the post-processing phase.) VistaScan, the tool I first used back in 2002, saved the scanned images as uncompressed TIFF files with names like ‘VSImage_1’, ‘VSImage_2’, etc. Other scanning software uses different schemes.
So you’ve scanned the photos, and you’re all done, right? Wrong. You still have to deal with these problems:
- File size. These raw TIFF images may range from 20-80MB each. The target file size is around 4MB or less per image.
- Resolution. The raw images may be several times the target resolution of 2592 x 1944.
- The image may need to be rotated 1-2 degrees or so to compensate for how the photo was placed in the scanner.
- Scanner cruft (unwanted portions of the scanned image) will need to be cropped off the edges.
- The contrast may be sub-optimal, resulting in a dim image.
- The color saturation may be too low, resulting in muddy colors.
- The white balance of the image may be off, due to problems in the original print.
- The images may contain problems such as windshield glare, red eye, or scratches that are all correctable via Photoshop.
- The raw scanned image may actually contain several separate photographs, whereas each finished image should contain one (and only one) photo.
- The filenames of the raw images are completely meaningless, and don’t describe the picture.
Which brings us to the next phase: post-processing, where all of these problems are solved.
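To put rough numbers on the first two problems in that list, here is a small Python sketch. It assumes the raw scan is stored as uncompressed 24-bit RGB (8 bits for each of three channels) and ignores file headers, so the figures are estimates. The function names are hypothetical, and the downsizing rule (shrink to fit inside 2592 x 1944 while keeping the aspect ratio) is one reasonable reading of "target resolution", not a description of what any particular tool does.

```python
def raw_image_bytes(width_px, height_px, channels=3, bits_per_channel=8):
    """Approximate pixel-data size of an uncompressed image, headers ignored."""
    return width_px * height_px * channels * bits_per_channel // 8


def fit_within(width_px, height_px, max_w=2592, max_h=1944):
    """Shrink (never enlarge) dimensions to fit a box, keeping aspect ratio."""
    scale = min(max_w / width_px, max_h / height_px, 1.0)
    return round(width_px * scale), round(height_px * scale)


if __name__ == "__main__":
    # A single 4 x 6 inch print scanned at 600 DPI.
    w, h = 3600, 2400
    size = raw_image_bytes(w, h)
    print(f"{w} x {h} uncompressed: about {size / 1_000_000:.1f} MB")
    print("resized to fit 2592 x 1944:", fit_within(w, h))
```

Under those assumptions, a single 3600 x 2400 scan carries about 25.9MB of pixel data, which sits at the low end of the 20-80MB range above, and it shrinks to 2592 x 1728 when fitted inside the 2592 x 1944 target.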