A "Lab" investigation program which processes a bmp
image file and identifies connected pieces of the image separated by "white
Background & Techniques
This started out in response to a request from Peter in the Netherlands
looking for an efficient way to process a black/white image to identify its
connected pieces. ( I call the pieces "chunks" since it turns
out the point contained may have color as a 3rd dimension.)
The application is to quickly identify scanned documents which were printed
with dot matrix printers so they can be routed to a more advanced OCR
recognition program. I haven't yet had feedback from Peter on how
it worked on his files, but it seems fast on the ones I've tried so far with
a thousands of "chunks". (As sample of one scanned hand-written page
2300x2900 pixels is included in the downloadable zip files. the
program finds 4300 chunks when scanned as a grayscale and 71,000 when
scanned as a black/white in .8 seconds.) Developing the program led
into a number of areas that I hadn't previously explored so, regardless of
the usefulness of the program, it has been an interesting two week diversion.
Here are the major areas explored and what I learned:
- Bitmap Chunks: The heart of the original problem is
solved in the Countchunks procedure. The image is processed
from top to bottom using the TCanvas Scanline function. Each scan
line is converted to a string of 0's and 1's with 0 for white and 1 for
black. These are in turn converted to temporary chunk numbers; each string of contiguous 1' is assigned the next
higher number in an integer array, Line2. This new line is then used to update the cumulative
chunk line (Line1). As lines are merged, a separate
Chunks array of TChunk records is updated. Chunks
has an entry for each chunk and the Line1/Line2 values index into the array.
Each entry in Chunks contains X=number of points in the chunk,
Y=starting row for this chunk, and, optionally, a string list,
named Points, containing
the coordinates of each point in the chunk. During the lines
update, if corresponding column positions in each line have chunk numbers
assigned, the lower number replaces the higher and the lower chunk
number entry in Chunks is updated by adding the number of points from
the higher chunk, and retaining lower starting row of the 2 entries.
If point detail is being kept, the higher chunk point values are
appended to the lower chunk's string list. The process
may require multiple passes. For example, when the cumulative chunk is tracking two
vertical elements as separate chunks and the new line contains a
horizontal line (chunk) which connects the two vertical elements and all
three end up with the lowest number involved.
Points in the Chunks: Optionally, a Tstringlist
containing the points in each chunk can be built. This was
originally added to as a debugging tool so that the chunks could be
displayed. If points have been retained, clicked on the displayed
chunk number will show the list of points and display it's image.
Pixelformats: TBitmaps have the necessary logic to
convert images based on a specified PixelFormat property; specifying
pixel sizes of 1, 4, 8, 15, 16, 24, or 32 bit. Scanline
processing must decode based on the pixel size specified. So, for
example, scanlines for 1 bit black/white images will contain 8 pixels
per scanline byte and 24 bit pixel format scanlines use 3 bytes for each pixel. 8 bit pixel formats use a
special Windows trick; value are index pointers into a Color Lookup Table so that 256
full depth RGB
colors may be displayed . This program will scan using either 1
bit or 8 bit pixel format.
Processing Color/Grayscale Images
For code simplicity, I have used 8 bit pixel formats
for the "color" option and made all
bitmaps device independent (HandleType = bmDIB). With this
combination, colors in the image are loaded into a color lookup table
and the bitmap entries contain index values into the table. I
implemented two palette display options; the "Colors used" button
just displays the colors actually used in the current picture which were
retained as the chunks were generated. A 2nd button displays the
full 256 entry current color lookup table which appears to match the current system palette.
I don't understand much about the process except that lots of thing go
on behind the scenes. Here are a couple of links worth checking
out if you want to learn more palettes and color tables: Microsoft page
"Using DIBs with palettes" and Earl Glynn's excellent "Color Lab" pages at
lots of sample Delphi code.
Identifying "white": RGB vs HSB
- If we want to separate a
color bitmap into pieces, it may be that colors near white should count
as white. The program uses a "BlackCutoff field to identify
the highest value that should be counted as black.
(Remember that 0 represents black and 255 represents white. ) For
color or grayscale images, I calculate a single value for a color in one
of two ways: the
average of the individual RGB (Red, Green, Blue) sub-pixel values, and a "brightness" value
calculated from when an RGB color value is converted to a HSB (Hue,
Saturation, Brightness) color system value. The conversion
function was adapted from code found at
delphi.about.com, an excellent set of pages which seems to have
toned down the distracting "Flash" ad content since my last visit. There
are other "convert to grayscale" algorithms which apply unequal weights
to R, G, and B values but which I did not investigate.
ClearType: While testing, I happened to capture a screen image of a memo which was
displayed on a machine using Microsoft's ClearType technology.
ClearType renders black/white text using colored edge pixels for smoothing.
It was developed after lots of study of the psychology of human vision. If you run XP or Vista on an LCD monitor, chances are that you are reading
this with ClearType. Dig out your magnifying glass to check.
The Magnifier: The colors actually used by ClearType
can be seen by examining a screen display with a magnifying glass. I
decided to add a magnifier function to preview images "just for fun".
Performance: Several sample
bitmaps are included in the downloads below. The largest, a full
page 300dpi scan of a page of my scribbles, contains 2297x2993 pixels.
When processed as 256 bit grayscale it required 0.75 seconds to find
4392 chunks without point detail being kept and 0.8 seconds with point
detail. Processed as a 1-bit black/white image dithers the edges
of lines and finds 71,962 chunks. Time for this analysis was 0.9
seconds without point detail and 5.3 seconds with points.
Manipulating those 72,000 string lists takes a while! Above times
measured on a 1.6Ghz Core 2 Intel CPU on my Inspiron laptop.
Much more could be written about the program internals, but I believe
I'll wait for your feedback questions before spending more time working
Addendum December 14, 2007: A viewer recently discovered a
major memory leak in Bitmap Chunks which was corrected today with Version
3.1. The string lists used to track which pixels belong to which
"chunks" were not always freed before the next list was created.
A couple of other small cosmetic errors were also corrected.
Let's talk a little about the definition of "memory leak". The
viewer used an open source memory management program "FastMM4" (FastMM
Version 4) which discover the problem originally. It seems that it
reports unreleased memory at program termination time, however it cannot
distinguish between memory that has been increasing as the program runs
multiple cases, and memory that was not released for the last (or
only) case processed. For this reason, the viewer assumed
that the memory leak still existed even after it had been fixed. In my mind, the first condition
(increasing memory usage with run time) reflects a bug and the second
condition (last case memory not released) does not. After the program closes there are two more tasks that
can free allocated memory. Delphi memory management has a crack at it
and if Delphi misses it, Windows itself is guaranteed to free all
memory belonging to the program when it destroys the address space in which
the program was running. So the bottom line is that memory leaks
are only of concern while the program is running, not at termination time.
As much as I hate to admit it, the best definition I found on the web is the
two part definition from developer.apple,com:
"Memory leak: A bug in a program that prevents it from freeing discarded
memory and causes it to use increasing amounts of memory."
I compiled today's posting with FastMM4 and added an "OnClose"
exit to free that "last case" memory just to verify that the real leak had
been fixed. I removed FastMM4, but left the exit in. However I do not plan to routinely free
memory that will be released anyway in a few more microseconds.
Running/Exploring the Program
Suggestions for Further Explorations
||Should not be too difficult to accept JPEG images
grayscale conversions for setting "black cutoff" values might
improve chunk identification.
Original Date: August 26, 2007
February 18, 2016