CSE 490g: Introduction to Data Compression
Winter 2006

Lecture 11: Lossy Image Compression, Transform Coding, JPEG

Lossy Image Compression Methods
• DCT compression
– JPEG
• Wavelet compression
– SPIHT
– UWIC (University of Washington Image Coder)
– EBCOT (JPEG 2000)
• Scalar quantization (SQ)
• Vector quantization (VQ)
JPEG Standard
• JPEG - Joint Photographic Experts Group
– The current image compression standard. Uses the discrete cosine transform, scalar quantization, and Huffman coding.
• JPEG 2000 uses wavelet compression.

[Figure: the Barbara test image at a 32:1 compression ratio (.25 bits/pixel from 8 bits): original, JPEG, VQ, and wavelet-SPIHT versions.]
Images and the Eye
• Images are meant to be viewed by the human eye (usually).
• The eye is very good at "interpolation"; that is, the eye can tolerate some distortion, so lossy compression is not necessarily bad.
• The eye has more acuity for luminance (gray scale) than for chrominance (color).
– Gray scale is more important than color.
– Compression is usually done in the YUV color coordinates: Y for luminance and U, V for color.
– U and V should be compressed more than Y.
– This is why we will concentrate on compressing gray scale (8 bits per pixel) images.

Distortion
• The encoder compresses the original image x into a compressed bit stream y; the decoder produces the decompressed image x̂:

original x → Encoder → compressed y → Decoder → decompressed x̂

• Lossy compression: x ≠ x̂.
• The common measure of distortion is mean squared error (MSE). Assume x has n real components (pixels):

MSE = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{x}_i)^2

PSNR
• Peak Signal to Noise Ratio (PSNR) is the standard way to measure fidelity:

PSNR = 10 \log_{10}\left(\frac{m^2}{MSE}\right)

where m is the maximum value a pixel can take. For gray scale images (8 bits per pixel), m = 255.
• PSNR is measured in decibels (dB).
– .5 to 1 dB is said to be a perceptible difference.
– Decent images start at about 30 dB.

Rate-Fidelity Curve
[Figure: rate-fidelity curve for SPIHT-coded Barbara: PSNR (dB, axis 20 to 38) plotted against bit rate (.02 to .92 bits per pixel).]
• Properties:
– Increasing.
– Slope decreasing.
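To make the fidelity measures concrete, here is a minimal sketch of MSE and PSNR in Python with NumPy (the function names are ours, not part of any standard library):

```python
import numpy as np

def mse(x, x_hat):
    """Mean squared error between an image x and its approximation x_hat."""
    x = np.asarray(x, dtype=np.float64)
    x_hat = np.asarray(x_hat, dtype=np.float64)
    return np.mean((x - x_hat) ** 2)

def psnr(x, x_hat, m=255.0):
    """Peak signal-to-noise ratio in dB; m is the maximum pixel value."""
    return 10.0 * np.log10(m ** 2 / mse(x, x_hat))
```

For 8-bit gray scale images m = 255, and by the rule of thumb above a reconstruction below roughly 30 dB starts to look poor.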
PSNR Reflects Fidelity
[Figures: VQ-coded Barbara at three rates: .63 bpp (12.8:1, PSNR 25.8 dB), .31 bpp (25.6:1, PSNR 24.2 dB), and .16 bpp (51.2:1, PSNR 23.2 dB). PSNR falls as the rate drops.]

PSNR is not Everything
[Figure: two VQ-coded images, both at PSNR = 25.8 dB; equal PSNR does not mean equal perceived quality.]

Idea of Transform Coding
• Transform the input pixels x_0, x_1, ..., x_{N-1} into coefficients c_0, c_1, ..., c_{N-1} (real values).
– The coefficients have the property that most of them are near zero.
– Most of the "energy" is compacted into a few coefficients.
• Quantize the coefficients.
– This is where the loss occurs, since the coefficients are only approximated.
– Important coefficients are kept at higher precision.
• Entropy encode the quantization symbols.

Decoding
• Entropy decode the quantized symbols.
• Compute approximate coefficients c'_0, c'_1, ..., c'_{N-1} from the quantized symbols.
• Inverse transform c'_0, c'_1, ..., c'_{N-1} to x'_0, x'_1, ..., x'_{N-1}, which is a good approximation of the original x_0, x_1, ..., x_{N-1}.
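As an illustration only (not JPEG itself), the encode and decode steps can be sketched in a few lines of Python for a 1D signal; entropy coding is omitted since it is lossless:

```python
import numpy as np

def transform_encode(x, A, q):
    """Encode: transform x with orthonormal matrix A, then quantize with step q."""
    c = A @ x                                  # coefficients, mostly near zero
    return np.floor(c / q + 0.5).astype(int)   # quantization symbols (the lossy step)

def transform_decode(s, A, q):
    """Decode: approximate the coefficients, then inverse transform."""
    c_approx = s * q                           # approximate coefficients c'
    return A.T @ c_approx                      # A orthonormal, so A^-1 = A^T
```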
Block Diagram of Transform Coding

Encoder: input x → transform → coefficients c → quantization / bit allocation → symbols s → entropy coding → bit stream b
Decoder: bit stream b → entropy decoding → symbols s → decode symbols → coefficients c' → inverse transform → output x'

Mathematical Properties of Transforms
• Linear transformation, defined by a real N×N matrix A = (a_{ij}):

\begin{pmatrix} c_0 \\ \vdots \\ c_{N-1} \end{pmatrix} = \begin{pmatrix} a_{00} & \cdots & a_{0,N-1} \\ \vdots & & \vdots \\ a_{N-1,0} & \cdots & a_{N-1,N-1} \end{pmatrix} \begin{pmatrix} x_0 \\ \vdots \\ x_{N-1} \end{pmatrix}, \quad \text{i.e. } c = Ax

Why Coefficients
• The inverse transform x = A^T c expresses the data as a linear combination of basis vectors (the columns of A^T, that is, the rows of A), weighted by the coefficients:

\begin{pmatrix} x_0 \\ \vdots \\ x_{N-1} \end{pmatrix} = c_0 \begin{pmatrix} a_{00} \\ \vdots \\ a_{0,N-1} \end{pmatrix} + \cdots + c_{N-1} \begin{pmatrix} a_{N-1,0} \\ \vdots \\ a_{N-1,N-1} \end{pmatrix}

Orthonormality
• A transform A is orthonormal if A^{-1} = A^T (the transpose).
• Example:

A = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad A = A^T = A^{-1}

Why Orthonormality
• The energy of the data equals the energy of the coefficients:

\sum_{i=0}^{N-1} c_i^2 = c^T c = (Ax)^T (Ax) = (x^T A^T)(Ax) = x^T (A^T A) x = x^T x = \sum_{i=0}^{N-1} x_i^2

Compaction Example

\frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} b \\ b \end{pmatrix} = \begin{pmatrix} \sqrt{2}\, b \\ 0 \end{pmatrix}

• All the energy is compacted into the first coefficient.

Squared Error is Preserved with Orthonormal Transformations
• In lossy coding we only send an approximation c'_i of c_i because it takes fewer bits to transmit the approximation. Let c_i = c'_i + ε_i. Then:

\sum_{i=0}^{N-1} (c_i - c'_i)^2 = (c - c')^T (c - c')
= (Ax - Ax')^T (Ax - Ax')
= (A(x - x'))^T (A(x - x'))
= ((x - x')^T A^T)(A(x - x'))
= (x - x')^T (A^T A)(x - x')
= (x - x')^T (x - x')
= \sum_{i=0}^{N-1} (x_i - x'_i)^2

which is the squared error in the original.
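Both identities are easy to check numerically. A sketch using the 2x2 orthonormal matrix from the compaction example (any orthonormal A behaves the same way):

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)  # orthonormal: A @ A.T = I

x = np.array([3.0, 5.0])
c = A @ x                                  # coefficients
print(np.sum(c**2), np.sum(x**2))          # equal energies (34.0, up to rounding)

c_prime = np.round(c)                      # approximate the coefficients
x_prime = A.T @ c_prime                    # reconstruct
print(np.sum((c - c_prime)**2),            # squared error in the coefficients...
      np.sum((x - x_prime)**2))            # ...equals squared error in the data
```

Note the compaction as well: for x = (b, b) the coefficients are (√2·b, 0), so the second coefficient costs nothing to send.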
Discrete Cosine Transform

d_{ij} = \begin{cases} \sqrt{\frac{1}{N}} & i = 0 \\ \sqrt{\frac{2}{N}} \cos\frac{(2j+1)i\pi}{2N} & i > 0 \end{cases}

Basis Vectors
[Figure: plots of the four DCT basis vectors, rows 0 through 3, for N = 4.]

For N = 4:

D = \begin{pmatrix} .5 & .5 & .5 & .5 \\ .653281 & .270598 & -.270598 & -.653281 \\ .5 & -.5 & -.5 & .5 \\ .270598 & -.653281 & .653281 & -.270598 \end{pmatrix}

Decomposition in Terms of Basis Vectors

\begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{pmatrix} = c_0 \begin{pmatrix} .5 \\ .5 \\ .5 \\ .5 \end{pmatrix} + c_1 \begin{pmatrix} .653281 \\ .270598 \\ -.270598 \\ -.653281 \end{pmatrix} + c_2 \begin{pmatrix} .5 \\ -.5 \\ -.5 \\ .5 \end{pmatrix} + c_3 \begin{pmatrix} .270598 \\ -.653281 \\ .653281 \\ -.270598 \end{pmatrix}

• c_0 is the DC coefficient; c_1, c_2, c_3 are the AC coefficients.

Block Transform
• The image is divided into blocks; each 8x8 block is individually coded.
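The matrix D comes straight from the definition of d_ij; a small sketch that regenerates it and confirms orthonormality (NumPy assumed):

```python
import numpy as np

def dct_matrix(N):
    """N x N orthonormal DCT matrix; entry (i, j) is d_ij from the definition."""
    D = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            if i == 0:
                D[i, j] = np.sqrt(1.0 / N)
            else:
                D[i, j] = np.sqrt(2.0 / N) * np.cos((2 * j + 1) * i * np.pi / (2 * N))
    return D

D = dct_matrix(4)
print(np.round(D, 6))                    # matches the 4x4 matrix above
print(np.allclose(D @ D.T, np.eye(4)))   # True: D is orthonormal
```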
2-Dimensional Block Transform
• Transform a block of pixels X by transforming the rows and then the columns.
• Transform the rows:

r_{ij} = \sum_{k=0}^{N-1} a_{jk} x_{ik}

• Transform the columns:

c_{ij} = \sum_{m=0}^{N-1} a_{im} r_{mj} = \sum_{m=0}^{N-1} \sum_{k=0}^{N-1} a_{im} a_{jk} x_{mk}

• Summary: C = A X A^T.

8x8 DCT Basis
[Figure: the 64 basis images of the 8x8 two-dimensional DCT.]
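A sketch of the 2D block transform as two matrix products, with the dct_matrix helper from the previous sketch repeated in vectorized form so the block runs on its own:

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT matrix, as in the earlier sketch (vectorized form)."""
    j = np.arange(N)
    D = np.sqrt(2.0 / N) * np.cos((2 * j + 1) * np.arange(N)[:, None] * np.pi / (2 * N))
    D[0, :] = np.sqrt(1.0 / N)
    return D

def block_dct2(X, A):
    """2D transform C = A X A^T: transform the rows, then the columns."""
    return A @ X @ A.T

def block_idct2(C, A):
    """Inverse transform X = A^T C A (A is orthonormal)."""
    return A.T @ C @ A

A = dct_matrix(8)
X = np.full((8, 8), 100.0)        # a constant 8x8 block
C = block_dct2(X, A)
print(round(float(C[0, 0]), 3))   # 800.0: all the energy lands in the DC coefficient
```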
Importance of Coefficients
• The DC coefficient is the most important.
• The AC coefficients become less important as they are farther from the DC coefficient.
• Example bit allocation (bits per coefficient for an 8x8 block):

8 7 5 3 2 1 0 0
7 5 3 2 1 0 0 0
5 3 2 1 0 0 0 0
3 2 1 0 0 0 0 0
2 1 0 0 0 0 0 0
1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0

Compression: 55 bits for 64 pixels = .86 bpp.

Quantization
• For an n×n block we construct an n×n matrix Q such that Q_{ij} indicates how coarsely to quantize coefficient c_{ij}; a larger Q_{ij} indicates fewer quantization levels.
• Encode c_{ij} with the label

s_{ij} = \left\lfloor \frac{c_{ij}}{Q_{ij}} + 0.5 \right\rfloor

• Decode s_{ij} to

c'_{ij} = s_{ij} Q_{ij}
Example Quantization
• c = 54.2, Q = 24: s = ⌊54.2/24 + 0.5⌋ = 2, c' = 2 · 24 = 48
• c = 54.2, Q = 12: s = ⌊54.2/12 + 0.5⌋ = 5, c' = 5 · 12 = 60
• c = 54.2, Q = 6: s = ⌊54.2/6 + 0.5⌋ = 9, c' = 9 · 6 = 54
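A sketch of the quantize/decode rule that reproduces the three cases above (function names are ours):

```python
import numpy as np

def quantize(c, q):
    """Label s = floor(c/q + 0.5): round c/q to the nearest integer."""
    return int(np.floor(c / q + 0.5))

def dequantize(s, q):
    """Reconstructed coefficient c' = s * q."""
    return s * q

for q in (24, 12, 6):
    s = quantize(54.2, q)
    print(q, s, dequantize(s, q))   # (24, 2, 48), (12, 5, 60), (6, 9, 54)
```

A smaller step size Q gives a reconstruction closer to c, at the price of more labels to code.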
Example Quantization Table
16 11 10 16 24  40  51  61
12 12 14 19 26  58  60  55
14 13 16 24 40  57  69  56
14 17 22 29 51  87  80  62
18 22 37 56 68  109 103 77
24 35 55 64 81  104 113 92
49 64 78 87 103 121 120 101
72 92 95 98 112 100 103 99

• To increase the bit rate, halve the table; to decrease the bit rate, double the table.
Zig-Zag Coding
• The DC label is coded separately.
• The AC labels are usually coded in zig-zag order, using a special entropy code that takes advantage of the ordering of the bit allocation (quantization). (A sketch of the zig-zag scan order appears after the example block below.)

JPEG (1987)
• Let P = [p_{ij}], 0 ≤ i, j < N, be an image with 0 ≤ p_{ij} < 256.
• Center the pixels around zero: x_{ij} = p_{ij} − 128.
• Code 8x8 blocks of P using the DCT.
• Choose a quantization table.
– The table depends on the desired quality and is built into JPEG.
• Quantize the coefficients according to the quantization table.
– The quantization symbols can be positive or negative.
• Transmit the labels (in a coded way) for each block.
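Gathering the pieces, here is a condensed sketch of the per-block encoding path just described (centering, 2D DCT, table quantization); the entropy coding of the labels is omitted, and dct_matrix is the helper from the earlier sketches:

```python
import numpy as np

def dct_matrix(N=8):
    """Orthonormal DCT matrix, as in the earlier sketches."""
    j = np.arange(N)
    D = np.sqrt(2.0 / N) * np.cos((2 * j + 1) * np.arange(N)[:, None] * np.pi / (2 * N))
    D[0, :] = np.sqrt(1.0 / N)
    return D

def encode_block(P_block, Q, A=dct_matrix()):
    """Labels for one 8x8 pixel block: center around zero, transform, quantize."""
    X = P_block.astype(np.float64) - 128.0      # x_ij = p_ij - 128
    C = A @ X @ A.T                             # 2D DCT
    return np.floor(C / Q + 0.5).astype(int)    # labels; can be positive or negative

def decode_block(S, Q, A=dct_matrix()):
    """Approximate pixel block: dequantize, inverse transform, un-center, clamp."""
    X = A.T @ (S * Q) @ A + 128.0
    return np.clip(np.round(X), 0, 255).astype(np.uint8)
```

Here Q is an 8x8 quantization table such as the example table above; halving or doubling it trades rate against fidelity as noted.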
Block Transmission
• DC coefficient
– DC coefficients don't change much from block to neighboring block, so their labels change even less.
– Predictive coding of differences is used to code the DC label.
• AC coefficients
– Code them in zig-zag order.

Example Block of Labels

 5 2 0 0 0 0 0 0
-8 0 0 0 0 0 0 0
 3 1 1 0 0 0 0 0
 1 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0

Coding order of AC labels: 2 −8 3 0 0 0 0 1 1 0 0 1 0 0 .....
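A sketch of generating the zig-zag scan order by sorting block positions along anti-diagonals (one of several equivalent constructions); applied to the block above, it reproduces the AC label order exactly:

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zig-zag scan order."""
    d = lambda p: p[0] + p[1]          # anti-diagonal index
    # Within a diagonal the direction alternates: down for odd d, up for even d.
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (d(p), p[0] if d(p) % 2 else -p[0]))

block = [[5, 2, 0, 0, 0, 0, 0, 0],
         [-8, 0, 0, 0, 0, 0, 0, 0],
         [3, 1, 1, 0, 0, 0, 0, 0],
         [1, 0, 0, 0, 0, 0, 0, 0]] + [[0] * 8 for _ in range(4)]

labels = [block[i][j] for i, j in zigzag_order()]
print(labels[1:15])   # AC labels: [2, -8, 3, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0]
```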
Coding Labels
• Categories of labels:
– 1: {0}
– 2: {−1, 1}
– 3: {−3, −2, 2, 3}
– 4: {−7, −6, −5, −4, 4, 5, 6, 7}
• A label is indicated by two numbers C, B: C is its category and B is a (C−1)-bit number giving its position within the category.
– Examples: label 0 → C,B = 1,−; label 2 → 3,2; label −4 → 4,3.

Coding AC Label Sequence
• A symbol has three parts (Z, C, B):
– Z is the number of zeros preceding a label, 0 ≤ Z ≤ 15.
– C is the category of the label.
– B is a (C−1)-bit number for the actual label.
• The End of Block symbol (EOB) means the rest of the block is zeros. EOB = (0,0,−).
• Example: 2 −8 3 0 0 0 0 1 1 0 0 1 0 0 ..... becomes
(0,3,2)(0,5,7)(0,3,3)(4,2,1)(0,2,1)(2,2,1)(0,0,−)

Coding AC Label Sequence
• Z, C have a prefix code; B is a (C−1)-bit number.
• Partial prefix code table (rows Z, columns C):

Z\C  0            1        2           3
0    1010 (EOB)   00       01          100
1                 1100     11011       11110001
2                 1110     11111001    1111110111
3                 111010   111110111   111111110101

• Coding the example:
(0,3,2) (0,5,7) (0,3,3) (4,2,1) (0,2,1) (2,2,1) (0,0,−)
100 10  11010 0111  100 11  1111111000 1  01 1  11111001 1  1010
46 bits representing 64 pixels = .72 bpp
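To close the loop, a sketch of run-length coding an AC label sequence into (Z, C, B) symbols; the category formula and the within-category index B are inferred from the examples above (in real JPEG, a run of more than 15 zeros also needs an escape symbol, omitted here):

```python
def category(v):
    """Category C of a label: 1 for 0, otherwise the bit length of |v| plus 1."""
    return 1 if v == 0 else abs(v).bit_length() + 1

def symbols(ac_labels):
    """Run-length code AC labels into (Z, C, B) symbols, ending with EOB."""
    # Trailing zeros are covered by the EOB symbol, so find the last nonzero.
    last = max((k for k, v in enumerate(ac_labels) if v != 0), default=-1)
    out, zeros = [], 0
    for v in ac_labels[:last + 1]:
        if v == 0:
            zeros += 1                                 # zeros preceding the next label
        else:
            C = category(v)
            B = v if v > 0 else v + 2 ** (C - 1) - 1   # index within the category
            out.append((zeros, C, B))
            zeros = 0
    out.append((0, 0, None))                           # EOB
    return out

ac = [2, -8, 3, 0, 0, 0, 0, 1, 1, 0, 0, 1] + [0] * 51
print(symbols(ac))
# [(0,3,2), (0,5,7), (0,3,3), (4,2,1), (0,2,1), (2,2,1), (0,0,None)]
```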
Notes on Transform Coding
• Video Coding
– MPEG uses the DCT.
– H.263 and H.264 use the DCT.
• Audio Coding
– MP3 (MPEG-1 Layer 3) uses the DCT.
• Alternative Transforms
– Lapped transforms remove some of the blocking artifacts.
– Wavelet transforms do not need to use blocks at all.