This is documentation for an obsolete product.
User's Guide
Image Transforms
8.8 Example: Block Transform Coding
The transform coding method compresses image data by representing the original signal with a small number of transform coefficients. It exploits the fact that for typical images a large amount of signal energy is concentrated in a small number of coefficients.
The goal of transform coding is to minimize the number of retained transform coefficients while keeping distortion at an acceptable level.
Transform coding is an integral part of one of the most widely known standards for lossy image compression, the JPEG (Joint Photographic Experts Group) standard.
What is the best transform to use for purposes of image compression? The discrete Karhunen-Loeve (KL) transform, also known as the Hotelling transform, has the property that it maximizes the number of coefficients that are small enough to be discarded. Equivalently, the KL transform minimizes the mean squared error between the original and reconstructed images for a given number of coefficients. However, since no fast algorithm exists for the computation of the KL transform, the DCT is typically used.
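Since the package's input cells are not reproduced on this page, a plain-Python sketch of the separable 2-D DCT may help fix ideas. The function name `dct2` and the orthonormal normalization are illustrative choices, not part of the package.

```python
import math

def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block (separable: rows, then columns)."""
    n = len(block)

    def alpha(k):
        # Orthonormal scaling: sqrt(1/N) for the DC term, sqrt(2/N) otherwise.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    def dct1(v):
        return [alpha(k) * sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i in range(n))
                for k in range(n)]

    rows = [dct1(list(r)) for r in block]        # transform each row
    cols = [dct1(list(c)) for c in zip(*rows)]   # then each column
    return [list(r) for r in zip(*cols)]         # transpose back

# A constant block concentrates all its energy in the DC coefficient:
flat = [[128.0] * 8 for _ in range(8)]
coeffs = dct2(flat)
print(round(coeffs[0][0], 2))   # DC term = 8 * 128 = 1024.0; all others vanish
```

This illustrates the energy-compaction property the text describes: a smooth (here, constant) block needs only one coefficient.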
This loads the package.
In[1]:=
This loads one of the example images.
In[2]:=
Block transform coding divides an image into blocks of equal size and processes each block independently. Block processing allows the coder to adapt to local image statistics, exploit the correlation present among neighboring image pixels, and reduce computational and storage requirements. JPEG uses 8 × 8 blocks. Larger blocks do not offer significantly better compression, since the correlation between pixels tends to decrease as the block size grows. A fixed block size also allows the design of optimized software and hardware. Typically the blocks do not overlap, which significantly reduces computational complexity; however, overlapping block processing has proven useful in reducing distortion at very high compression factors.
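The block division step can be sketched in plain Python. This is illustrative only: `blocks` is not a package function, and a real JPEG encoder would subtract 128 (the middle gray level) from each pixel before transforming, as noted below.

```python
def blocks(image, n=8):
    """Split a 2-D list (dimensions divisible by n) into a grid of n x n blocks."""
    h, w = len(image), len(image[0])
    return [[[row[c:c + n] for row in image[r:r + n]]
             for c in range(0, w, n)]
            for r in range(0, h, n)]

# Toy 16 x 16 "image"; JPEG would first level-shift by subtracting 128.
img = [[r * 16 + c for c in range(16)] for r in range(16)]
b = blocks(img)
print(len(b), len(b[0]))   # a 2 x 2 grid of 8 x 8 blocks
print(b[1][1][0][0])       # top-left pixel of the lower-right block: img[8][8] = 136
```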
Here we compute the block DCT transform of an example image. As required by JPEG, the image is first normalized with respect to the middle gray level.
In[3]:=
Here a multiblock segment of the original image and the corresponding DCT transform blocks are shown.
In[4]:=
Out[4]=
This selects an 8×8 block of DCT coefficients for use in the next few calculations.
In[5]:=
Out[5]//MatrixForm=
An inspection of the block cosine transform reveals that a large number of the cosine coefficients have relatively small values. For purposes of compression, it is of interest to discard these small values. This may be implemented in a number of alternative ways. The retained coefficients may be selected based on their position in the DCT block, typically the low-frequency zone of the transform domain. Alternatively, they may be selected based on their value relative to a fixed threshold. The zone method is simpler, but inefficient, since large-valued coefficients outside of the prescribed zone may be discarded, increasing the approximation error. For this reason the threshold method, with its ability to capture all large-valued coefficients and adapt to the local image statistics, is preferred.
Here is a typical triangular zone mask that retains 15 of a total of 64 DCT coefficients.
In[6]:=
Out[6]//MatrixForm=
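A triangular zone mask of this kind can be written down directly. The condition `i + j <= 4` is an assumed concrete shape for the triangle; it is one natural choice that retains exactly 15 of the 64 positions, matching the count in the text.

```python
# Triangular zone mask: retain coefficient (i, j) when i + j <= 4
# (an assumed triangle shape; it keeps exactly 15 of 64 positions).
mask = [[1 if i + j <= 4 else 0 for j in range(8)] for i in range(8)]

def apply_mask(coeffs, mask):
    """Zero out every coefficient outside the retained zone."""
    return [[c * m for c, m in zip(cr, mr)] for cr, mr in zip(coeffs, mask)]

print(sum(map(sum, mask)))   # 15 coefficients retained
```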
Here are the DCT coefficients of the block after application of the mask.
In[7]:=
Out[7]//MatrixForm=
The 15 remaining coefficients define an approximation of the original signal. The distortion or approximation error may be measured in a number of equivalent ways. A commonly used metric is the so-called signal-to-noise ratio (SNR). In the present case, the noise is the approximation error or difference between the original and reconstructed signals. SNR may be defined as

SNR = 10 log10 ( Σ_{i,j} b[i,j]² / Σ_{i,j} (b[i,j] − b_Z[i,j])² )

where b[i,j] and b_Z[i,j] represent the original and zone-masked DCT coefficients, respectively. The SNR is typically given in units of decibels (dB). The denominator expression defines the error signal power, while the numerator is the signal power. The SNR in decibels can be calculated using the ImageProcessing function ImageEnergy.
In[8]:=
Out[8]=
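The SNR computation can also be sketched directly in Python. This is a plain reimplementation of the ratio of signal power to error power, not the package's ImageEnergy function; the toy 2 × 2 arrays are illustrative.

```python
import math

def snr_db(original, approx):
    """SNR in dB: 10 log10(signal power / error power), summed elementwise."""
    sig = sum(x * x for row in original for x in row)
    err = sum((x - y) ** 2
              for ro, ra in zip(original, approx)
              for x, y in zip(ro, ra))
    return 10.0 * math.log10(sig / err)

# Toy example: a 2 x 2 "signal" and a slightly perturbed approximation.
orig = [[100.0, 50.0], [25.0, 10.0]]
approx = [[99.0, 51.0], [25.0, 10.0]]
print(round(snr_db(orig, approx), 1))   # about 38.2 dB
```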
We now turn to the threshold method. We pick a threshold, say T=30, and discard all coefficients with magnitudes smaller than this threshold. The threshold was selected to yield the same number of coefficients as the zone mask.
In[9]:=
Out[9]//MatrixForm=
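Threshold masking is a one-line filter: keep a coefficient only if its magnitude reaches the threshold. The 4 × 4 coefficient values below are invented for illustration, not taken from the example image.

```python
T = 30   # threshold from the text

# Toy coefficient block (values illustrative only).
coeffs = [[-474, -68, 0, 12],
          [5, -62, 102, -3],
          [0, -71, 73, 1],
          [29, 41, -58, 45]]

# Keep only coefficients whose magnitude reaches the threshold.
kept = [[c if abs(c) >= T else 0 for c in row] for row in coeffs]
print(sum(1 for row in kept for c in row if c != 0))   # 9 survive; 29 < 30 is dropped
```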
The SNR improves by approximately 4 dB when threshold masking is applied instead of zone masking.
In[10]:=
Out[10]=
The threshold masking method improves image fidelity, but complicates the reconstruction process. Note that the positions of the coefficients need to be known in order to correctly reconstruct an approximation of the original image. The JPEG standard adopted a clever scheme to order the coefficients in a 1D list, so that a coefficient's position in this list uniquely identifies its place in the 2D coefficient array. This is accomplished by ordering all the coefficients according to the following zig-zag scanning pattern:
(The zig-zag scan starts at the DC coefficient in the upper-left corner of the block and traverses successive antidiagonals, alternating direction, ending at the highest-frequency coefficient in the lower-right corner.)
Additionally, all zeros beyond the last nonzero value are dropped and replaced by a special end-of-buffer (EOB) symbol. Here is the list of coefficients within the block listed in zig-zag order: {-474,-68,0,0,-62,102,0,-71,73,0,0,0,41,-58,45,70,0,64,-42,0,0,0,0,0,0,0,-33,-57,42,EOB}
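The zig-zag ordering can be generated compactly by sorting positions by antidiagonal, alternating the direction within each diagonal. A sketch, with an EOB step matching the text's description:

```python
def zigzag_indices(n=8):
    """(i, j) pairs of an n x n block in JPEG zig-zag scan order."""
    # Sort by antidiagonal i + j; on odd diagonals i increases (down-left),
    # on even diagonals j increases (up-right).
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[1] if (p[0] + p[1]) % 2 == 0 else p[0]))

def zigzag_with_eob(block):
    """Scan a block in zig-zag order, dropping trailing zeros, appending EOB."""
    seq = [block[i][j] for i, j in zigzag_indices(len(block))]
    while seq and seq[-1] == 0:
        seq.pop()
    return seq + ["EOB"]

print(zigzag_indices()[:6])        # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
print(zigzag_with_eob([[5, 0], [3, 0]]))   # [5, 0, 3, 'EOB']
```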
The retained coefficients need to be quantized and coded to complete the compression process. In reality, the threshold and quantization steps are combined by dividing the coefficient values by scaling factors stored in a quantization table Q. The standard luminance quantization table has the following scaling factors.
In[11]:=
Here is the result of scaling the coefficient block with the quantization mask Q.
In[12]:=
Out[12]//MatrixForm=
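The combined threshold-and-quantization step divides each coefficient by its table entry and rounds. The table below is the standard JPEG luminance quantization table from ISO/IEC 10918-1, Annex K; note that Python's built-in `round` uses round-half-to-even, which may differ from a particular codec's rounding convention on exact halves.

```python
# Standard JPEG luminance quantization table (ISO/IEC 10918-1, Annex K).
Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantize(coeffs, Q):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[round(c / q) for c, q in zip(cr, qr)]
            for cr, qr in zip(coeffs, Q)]

# Toy block: only the two lowest-frequency coefficients are nonzero.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0], coeffs[0][1] = -474.0, -68.0
qc = quantize(coeffs, Q)
print(qc[0][0], qc[0][1])   # -474/16 -> -30, -68/11 -> -6
```

Small coefficients quantize to zero, which is how this step subsumes thresholding.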
The scaling factors may be changed according to a quality factor q. For small values of q, the compressed image is practically indistinguishable from the original; as the factor increases, the distortion in the compressed image increases. Here we compute the quantized block DCT coefficients of the example image.
In[13]:=
The ratio of the number of retained coefficients to the total number of DCT coefficients (i.e., N²) may serve as an approximate measure of compression due to the transform coding operations.
In[14]:=
Out[14]=
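This ratio is just a nonzero count over the total coefficient count. A sketch with two invented 4 × 4 quantized blocks:

```python
# Two toy quantized blocks (values illustrative): count nonzero coefficients.
qblocks = [
    [[-30, 5, 0, 0], [0, 2, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    [[12, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
]
retained = sum(1 for blk in qblocks for row in blk for c in row if c != 0)
total = sum(len(blk) * len(blk[0]) for blk in qblocks)
print(retained, total, retained / total)   # 4 of 32 retained -> ratio 0.125
```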
JPEG decoding reverses the processing steps described in previous sections. We proceed with computing an approximation of our example image using the standard quantization table and evaluate the approximation error.
In[15]:=
In[16]:=
Out[16]=
Here are the error signal's dynamic range and SNR (in dB).
In[17]:=
Out[17]=
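Because the orthonormal DCT is invertible, decoding simply dequantizes and applies the inverse transform; only the quantization step loses information. A self-contained sketch of the forward/inverse pair (names illustrative) showing that, without quantization, the round trip is exact up to floating-point error:

```python
import math

def _dct1(v, inverse=False):
    """Orthonormal 1-D DCT-II, or its inverse (DCT-III) when inverse=True."""
    n = len(v)
    a = lambda k: math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    if not inverse:
        return [a(k) * sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                           for i in range(n)) for k in range(n)]
    return [sum(a(k) * v[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for k in range(n)) for i in range(n)]

def transform2d(block, inverse=False):
    """Separable 2-D transform: apply the 1-D transform to rows, then columns."""
    rows = [_dct1(list(r), inverse) for r in block]
    cols = [_dct1(list(c), inverse) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

# Forward + inverse reconstructs the block exactly (no quantization applied).
block = [[float((i * 8 + j) % 17) for j in range(8)] for i in range(8)]
rec = transform2d(transform2d(block), inverse=True)
err = max(abs(a - b) for ra, rb in zip(block, rec) for a, b in zip(ra, rb))
print(err < 1e-9)   # True
```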
This ends the example.