HowItWorks
Go back one page.

How Wavelet/Scalar Quantization Works:

The following block diagram gives an overview of the 3 main steps involved in WSQ encoding/decoding:

WSQ block diagram.

In the DWT step, the digital image is split into 64 spatial frequency bands by a 2-dimensional discrete wavelet transform, which is a cascaded multirate digital filter bank. The precision of the floating point DWT output is then truncated (``quantized'') by the scalar quantization step; this is the irreversible, ``lossy'' part of the process. Finally, the quantized DWT output is Huffman-coded (a form of ``entropy coding'') to minimize the number of bits that need to be transmitted.

To reconstruct the image after compression, the WSQ decoder undoes the Huffman coding, maps the quantized DWT coefficients back to close approximations of their original floating point values, and runs these quantized DWT coefficients through an inverse DWT.


Wavelets and Scaling Functions:

``So what do the wavelets look like?''

Here are the analysis wavelet (highpass filter) and scaling function (lowpass filter) used in the first-generation FBI WSQ coder. (Try to widen your web viewer enough to get the pictures side-by-side for comparison.)

Analysis wavelet (left/top); analysis scaling function (right/bottom) for first-generation FBI WSQ coder.

These look like analog filters (which they are), but they're obtained as ``continuum limits'' of cascaded digital filter banks, which is what we actually implement in practice. This is a lot like the relationship between the Fourier series expansion of a periodic function and the discrete Fourier transform of a sampled version, except that the wavelets aren't periodic. Instead of decomposing a signal using sines and cosines of different frequencies, we use dilates and shifts of the wavelet and the scaling function (implemented in the form of digital filter banks).

We use a different wavelet-scaling function pair to put the signal back together again; here are the wavelet and scaling function used for synthesis:

Synthesis wavelet (left/top); synthesis scaling function (right/bottom) for first-generation FBI WSQ coder.

``I hear that there are oodles of different wavelets out there, with more being discovered all the time. Are these the only wavelets the FBI will allow people to use on fingerprints?''

For the time being. Choosing wavelets for image coding applications is still a somewhat inexact science, depending on a lot of trial and error. There are a few ``standard'' wavelet families (including the above one) that seem to work well for image coding, although that is not a task for which they were specifically designed. In the future we hope to be able to design wavelets (or wavelet-like filter banks) that are optimized for a specific application, like fingerprints. Until then we'll probably stick with proven performers like the above family.

Just for fun, here's another family that works pretty well for image coding, although it was designed purely as a mathematical example. Note that these guys, unlike the above family, all have the same length supports, and that the wavelets are antisymmetric instead of symmetric.

Analysis wavelet (left/top) and scaling function (right/bottom).

Synthesis wavelet (left/top) and scaling function (right/bottom).


WSQ In Action:

Now let's look at what happens to a fingerprint when we run it through a DWT. Here's a source image:

and here's a visualization of its discrete wavelet transform decomposition:

Fingerprint DWT decomposition.

The original image (589824 pixels) has been decomposed into 64 different spatial frequency subbands containing a total of 589824 DWT coefficients. For visualization purposes we've put these 64 subbands together in a single large array that's exactly the same size as the original image, although we will be able to code it accurately using less than 1 bpp on average!

There are several interesting features worth pointing out in this picture. First, each of the 64 subbands has been normalized independently for visualization as a gray-scale image; this is why the background intensities appear to vary randomly. Independent normalization is necessary because the variances of these subbands differ by much more than this visualization would lead one to believe. For instance, in typical fingerprints like this one, the variance of the lowest frequency subband (upper left) surpasses the variance of the highest frequency band (lower right) by six orders of magnitude. This allows the encoder to deploy its resources more efficiently by allocating more bits to the bands with greater variances. When the above image is coded at a target of 0.75 bpp, for instance, the lowest frequency band (which contains only 1/1024th as many samples as the original image) receives a target allocation of 8.7 bits per DWT coefficient, while the highest frequency subband receives 0 bits.

Next, note the orientation-selectivity of the DWT subbands. The subbands along the top, which have been highpass or bandpass filtered along rows and lowpass filtered along columns, pick up the vertical line near the right edge of the picture (look for the ``energy'' along the right edge of these subbands). The subbands along the left side, however, which have been highpass or bandpass filtered along columns and lowpass filtered along rows, pick up the horizontal lines at the top and bottom edges of the picture.

An interesting side-effect of this orientation selectivity is that the band in the upper-right corner, which has been lowpass filtered twice along columns and highpass filtered twice along rows, has conveniently isolated all of the scanner noise in this particular image.

Finally, the interesting moire' patterns visible in some of the subbands represent aliasing due to the non-ideal bandpass nature of the digital filters used in the multirate wavelet filter bank. It's rather remarkable that this aliasing is completely cancelled by the inverse DWT in the absence of quantization noise or floating-point roundoff error (``perfect reconstruction in perfect arithmetic'').

Here's the reconstructed image (compressed file size 32702 bytes; compression ratio 18.0):


Next:


Last updated 25 June 2002.