Color Image Compression Based on DWT

Republic of Iraq
Ministry of Higher Education and Scientific Research
University of Baghdad
College of Science

Color Image Compression Based on DWT

A thesis submitted to the College of Science, University of Baghdad, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Astronomy Science.

By
Faisel Ghazi Mohammed

Supervised by
Dr. Loay A. Jorj
and
Dr. Bushra K. Al-Abodi

Baghdad
December 2005 AD
Dhu'l-Qa'da 1425 AH

In the name of Allah, the Most Gracious, the Most Merciful:
"And if whatever trees upon the earth were pens and the sea was ink, replenished thereafter by seven more seas, the words of Allah would not be exhausted. Indeed, Allah is Exalted in Might and Wise."
(Luqman: 27)

To my mother, father, and family, for their support and kindness.

ACKNOWLEDGMENT
Thanks to God for giving me the health and desire to complete this work, and for everything. Special gratitude and deep thanks to my supervisors, Dr. Loay A. Jorj and Dr. Bushra K. Al-Abodi, for their encouragement, guidance, and support during my research work. My great thanks and respect go to all the members of the Department of Computer Science, College of Science, University of Baghdad, for their warm support and encouragement. Most of all, I would like to thank my parents, brothers, sisters, and my wife for encouraging me to continue toward my educational goals. Last but not least, I want to thank everyone who contributed to this work whom I did not mention by name.

Abstract
In this research work, some low-complexity and efficient coding approaches are proposed to compress color images. The suggested compression schemes consist of several coding modules. The first coding module performs a color transform (i.e., to YUV, YIQ, or YCbCr); the suitability of each was investigated experimentally. The second coding step is the application of the wavelet transform (tree decomposition); three known wavelet filters (Haar and the biorthogonal wavelet filters Tab3/5 and Tab7/9) have been tested as the transform basis. The wavelet transform is exploited to resample the two chrominance subbands, and is used to transform the luminance subband into its approximation coefficients and wavelet (detail) coefficients. The approximation coefficients of the two chrominance subbands and the luminance subband have been coded using a proposed lossless scheme. The third coding step is concerned with encoding the wavelet coefficients of the luminance subband. This is done by using a hierarchical scalar quantization followed by a modified bit-slicing method to select the significant wavelet coefficients, which are finally encoded using a suitable lossless coder; Huffman and LZW are used as the lossless encoders. Three methods were proposed to solve the wavelet coefficients selection problem. The first is based on the idea of eliminating one (or more) of the least significant bit slices. In the second method, the vector quantization mapping technique is used: the code-book is built for a given block size, the bit-slices are partitioned into non-overlapped blocks, and then, for each block, the code-book is searched to find the best-matched template; finally, the index of the best-matched template is sent instead of all the block elements.
The third coding method (which is the best among the previously mentioned methods) is based on the fact that the most repetitive blocks of the bit-planes (especially the most significant bit-planes) are blocks whose contents are all zeros, "clusters of zero bits". Such blocks should be coded as a single zero bit, while all other blocks should be coded using variable-length codewords (whose size will be more than one bit). The obtained results showed that the performance of the lossy schemes is image dependent; the wavelet filters vary in their performance, since no specific biorthogonal wavelet filter performs better than the others on all images. Standard real images were used as test material to investigate the performance of the suggested compression scheme; the results indicate that the efficiency of the proposed scheme is encouraging when compared with the state-of-the-art JPEG and JPEG2000.



TABLE OF CONTENTS

Page
List of Abbreviations ………………………………………………………… 4

Chapter One (General Introduction)
(1.1) Introduction ……………………………………………………………… 7
(1.2) Related Works …………………………………………………………… 11
(1.3) Research Work Objective ……………………………………………… 14
(1.4) Thesis Layout …………………………………………………………… 15

Chapter Two (Theory Fundamentals)
(2.1) Introduction ……………………………………………………………… 17
(2.2) Lossless Compression …………………………………………………… 19
(2.3) Lossy Compression ……………………………………………………… 19
(2.4) Color Principles ………………………………………………………… 21
(2.5) Color Space ……………………………………………………………… 22
(2.5.1) RGB Color Space ………………………………………………… 23
(2.5.2) YUV Color Space ………………………………………………… 24
(2.5.3) YIQ Color Space ………………………………………………… 24
(2.5.4) YCbCr Color Space ……………………………………………… 25
(2.6) Color Image Coding ……………………………………………………… 26
(2.7) Image Zooming (Resampling) …………………………………………… 26
(2.8) Differential Coding ……………………………………………………… 27
(2.9) Transform Coding ………………………………………………………… 28
(2.10) Wavelet Transform Theory ……………………………………………… 30
(2.10.1) Wavelet Filter Properties and Decomposition ………………… 30
(2.10.2) Haar Wavelet Filter …………………………………………… 33
(2.10.3) Biorthogonal Wavelet Filters …………………………………… 34
(2.10.4) Lifting Scheme ………………………………………………… 35
(2.10.5) Haar and Lifting ………………………………………………… 37
(2.10.6) Symmetric Extension …………………………………………… 38
(2.10.7) Biorthogonal Tab3/5 Filter ……………………………………… 39
(2.10.8) Biorthogonal Tab7/9 Filter ……………………………………… 40
(2.10.9) Wavelet Coefficient Coding Schemes …………………………… 34
(2.11) Quantization ……………………………………………………………… 43
(2.11.1) Scalar Quantization ……………………………………………… 43
(2.11.2) Vector Quantization ……………………………………………… 45
(2.12) Lossless Entropy Coding ………………………………………………… 45
(2.12.1) Shift Code (S-code) ……………………………………………… 46
(2.12.2) Huffman Code …………………………………………………… 47
(2.12.3) Lempel-Ziv-Welch Code ………………………………………… 47
(2.13) Rate Distortion Tradeoff ………………………………………………… 48
(2.14) Model Complexity Tradeoff ……………………………………………… 49
(2.15) Compression Efficiency Parameters ……………………………………… 49
(2.15.1) Compression Ratio ……………………………………………… 50
(2.15.2) Fidelity Criteria ………………………………………………… 50
(2.15.2.1) Mean ……………………………………………………… 50
(2.15.2.2) Mean Square Error ………………………………………… 51
(2.15.2.3) Signal to Noise Ratio ……………………………………… 51
(2.15.2.4) Peak Signal to Noise Ratio ………………………………… 51

Chapter Three (Proposed Image Compression System)
(3.1) Introduction ……………………………………………………………… 54
(3.2) Suggested Compression Scheme ………………………………………… 55
(3.3) System Evaluation Methodology ………………………………………… 57
(3.4) Color Transformation Results …………………………………………… 59
(3.5) Chrominance Bands Down-Sampling …………………………………… 62
(3.6) Wavelet Filters Reconstruction Performance …………………………… 68
(3.7) Proposed Lossless Coding Procedure …………………………………… 71
(3.8) Proposed Lossy Coding Procedure ……………………………………… 73
(3.8.1) Scalar Quantization ……………………………………………… 73
(3.8.2) Wavelet Coefficients Coding Methods …………………………… 76
(3.8.3) Bit-Slicing (Bit Planes Partitioning) ……………………………… 76
(3.8.3.1) Bit-Slice Truncation (BS-T) ………………………………… 78
(3.8.3.2) Bit-Slice Vector Quantization (BS-VQ) …………………… 81
(3.8.3.3) Bit-Slice Constant Block Size Truncation ………………… 82
(3.9) The Results of Using LZW Lossless Coder ……………………………… 85
(3.10) CR Controlling Parameters Discussion ………………………………… 86
(3.11) Comparisons with JPEG and JPEG2000 ………………………………… 87

Chapter Four (System Performance Evaluation)
(4.1) Introduction ……………………………………………………………… 92
(4.2) Comparisons with JPEG and JPEG2000 ………………………………… 92
(4.3) Progressive Transmission ………………………………………………… 102

Chapter Five (Conclusions and Future Suggestions)
(5.1) Discussions ……………………………………………………………… 104
(5.2) Conclusions ……………………………………………………………… 104
(5.3) Suggestions for Further Work …………………………………………… 106

References ……………………………………………………………………… 107

List of Abbreviations

Shortcut    Full Description
ADPCM       Adaptive Difference Pulse Code Modulation
ASWDR       Adaptively Scanned Wavelet Difference Reduction
BR          Bit Rate
CDF         Cohen, Daubechies, and Feauveau (names of scientists)
CIE         The International Commission on Illumination (Commission Internationale de l'Éclairage)
CR          Compression Ratio
dB          Decibel (a logarithmic measurement of image/sound levels)
DCT         Discrete Cosine Transform
DPCM        Difference Pulse Code Modulation
DWT         Discrete Wavelet Transform
EBWIC       Efficient Bit Allocation Wavelet Image Coder
EZW         Embedded Zerotree Wavelet
FFT         Fast Fourier Transform
FIR         Finite Impulse Response
HVS         Human Visual System
L*a*b*      Device-independent color space created by the CIE (the symbol * denotes nonlinearity)
LSB         Least Significant Bit
MAR         Multiresolution Approximation
MRR         Multiresolution Representation
MSB         Most Significant Bit
MTCB        Multilevel Block Truncation Coding
nebs        Number of Eliminated Bit Slices
np          Number of Passes
NTSC        National Television Systems Committee (video system adopted in the USA)
PAL         Phase Alternate Line (video system adopted in the UK)
RCT         Reverse Color Transform
R-D         Rate Distortion
RGB         Red, Green, Blue
SMAWZ       Significant Map based Adaptive Wavelet Zero-tree
SPEBC       Set-Partitioning Embedded Block Coding
SPECK       Set Partitioned Embedded Block Coder
SPIHT       Set Partitioning In Hierarchical Trees
SQ          Scalar Quantization
SVD         Singular Value Decomposition
TC          Transform Coding
VQ          Vector Quantization
WBCT        Wavelet-Based Contourlet Transform
WDR         Wavelet Difference Reduction
WPEB        Wavelet Packet-based Embedded Block
YCbCr       Y is the luminance/intensity signal, containing the main picture information; Cb and Cr are the blue and red chrominance components, respectively
YIQ         Y as above; I (in-phase) and Q (quadrature) are the chrominance components
YUV         Y as above; U and V are the chrominance components

Chapter One
General Introduction

1.1 Introduction
Uncompressed multimedia (graphics, audio, and video) data requires considerable storage capacity and transmission bandwidth. Despite rapid progress in mass-storage density, processor speeds, and digital communication system performance, the demand for data-storage capacity and data-transmission bandwidth continues to outstrip the capabilities of available technologies. The recent growth of data-intensive multimedia-based web applications has not only sustained the need for more efficient ways to encode signals and images, but has made compression of such signals central to storage and communication technology.
The two fundamental components of compression are redundancy reduction and irrelevancy reduction. Redundancy reduction aims to remove duplication from the signal source (image/video). Irrelevancy reduction omits parts of the signal that will not be noticed by the signal receiver, namely the Human Visual System (HVS). Generally, three types of redundancy can be identified [Umbaugh, 1999]:
1. Spatial redundancy, or correlation between neighboring pixels.
2. Spectral redundancy, or correlation between different color planes or spectral bands.
3. Temporal redundancy, or correlation between adjacent frames in a sequence of images (in video applications).
Image compression techniques are experiencing many improvements under the discipline of transform coding. Many research groups have developed different image-coding schemes and tried to modify these schemes to further reduce the bit rate [Wongsawat,2004]. Compressing an image is significantly different from compressing raw binary data. Though general-purpose compression programs can be used to compress images, the results are less than optimal, because images have certain statistical, spatial, and spectral properties that can be exploited by higher-order entropy encoders specifically designed for them. Also, some of the fine details in the image can be sacrificed for the sake of saving a little more bandwidth or storage space. This means that lossy compression techniques can be used in this area.
Lossless compression involves compressing data such that, when decompressed, it is an exact replica of the original. This is the case for binary data (such as executables and documents), which must be exactly reproduced when decompressed. On the other hand, images (and sounds too) need not be reproduced "exactly"; an approximation of the original image is enough for most purposes, as long as the error between the original and the compressed image is tolerable. Figure (1.1) shows the functionality of image and video data compression in visual transmission and storage [Shi, 2000].

[Block diagram: Input → Image and Video Compression → Transmission or Storage → Data Reconstruction or Data Retrieval → Output]

Fig. (1.1) Image and video compression for visual transmission and storage [Shi, 2000].
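As a toy illustration of the spatial redundancy idea above (not taken from the thesis), the sketch below estimates the correlation between horizontally adjacent pixels: a smooth synthetic image shows correlation near 1, while pure noise, which offers no spatial redundancy to exploit, shows correlation near 0. The test images here are invented for the example.

```python
import numpy as np

def adjacent_correlation(img):
    """Pearson correlation between horizontally adjacent pixels."""
    left = img[:, :-1].ravel().astype(float)
    right = img[:, 1:].ravel().astype(float)
    return float(np.corrcoef(left, right)[0, 1])

# A smooth synthetic image: neighboring pixels are strongly correlated.
x = np.linspace(0.0, 1.0, 256)
smooth = np.outer(np.sin(2 * np.pi * x), np.cos(2 * np.pi * x)) * 127 + 128

# Pure noise: no spatial redundancy for a coder to exploit.
rng = np.random.default_rng(0)
noise = rng.integers(0, 256, size=(256, 256))

print(adjacent_correlation(smooth))   # close to 1.0
print(adjacent_correlation(noise))    # close to 0.0
```

A high value of this statistic is exactly what predictive and transform coders rely on when they decorrelate the image before entropy coding.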

An efficient image compression scheme must not only reduce the necessary storage and bandwidth requirements, but also allow sub-image extraction for editing, processing, and targeting for particular devices and applications. The JPEG2000 image compression system has a rate-distortion advantage over the original JPEG; more importantly, it allows extraction of different resolutions, pixel fidelities, regions of interest, components, and more, all from a single compressed bitstream [Gormish,2000]. This allows an application to manipulate or transmit only the essential information for any target device from any JPEG2000 compressed source image [Marcellin,2000]. The information in Table (1.1) shows the qualitative transition from simple text to full-motion video data in terms of the disk space, transmission bandwidth, and transmission time needed to store and transmit such uncompressed data.

Table (1.1) Multimedia data types and the uncompressed storage space, transmission bandwidth, and transmission time required. The prefix kilo- denotes a factor of 1000 rather than 1024 [Saha,2001].

Multimedia Data          | Size/Duration                     | Bits/Pixel or Bits/Sample | Uncompressed Size (B for bytes) | Transmission Bandwidth (b for bits) | Transmission Time (using a 28.8 Kb/sec modem)
A page of text           | 11'' x 8.5''                      | Varying resolution        | 4-8 KB                          | 23-44 Kb/page                       | 1.1 - 3.3 sec
Telephone quality speech | 11 sec (8 K samples/sec)          | 8 bps                     | 81 KB                           | 44 Kb/sec                           | 33.3 sec
Grayscale image          | 213 x 512                         | 8 bpp                     | 343 KB                          | 3.1 Mb/image                        | 1 min 13 sec
Color image              | 213 x 512                         | 34 bpp                    | 684 KB                          | 4.36 Mb/image                       | 2 min 39 sec
Medical image            | 3148 x 1680                       | 13 bpp                    | 2.14 MB                         | 41.2 Mb/image                       | 32 min 54 sec
SHD image                | 3148 x 2048                       | 34 bpp                    | 13.28 MB                        | 111 Mb/image                        | 28 min 15 sec
Full-motion video        | 441 x 480, 1 min (21 frames/sec)  | 34 bpp                    | 1.44 GB                         | 331 Mb/sec                          | 2 days 8 hrs

The examples above clearly illustrate the need for large storage space, high transmission bandwidth, and long transmission time for uncompressed image, audio, and video data. At the present state of technology, the only solution is to compress multimedia data before its storage and transmission, and decompress it at the receiver for playback. For example, in video transmission, with a compression ratio of 32:1, the space, bandwidth, and transmission-time requirements can be reduced by a factor of 32, with acceptable quality.
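The factor-of-32 claim above is simple arithmetic; the hypothetical sketch below works it out for an invented 512 x 512, 24-bit color image sent over a 28.8 Kb/sec modem (the image dimensions are illustrative, not taken from Table (1.1)).

```python
# Hypothetical example (not from the thesis): a 512 x 512, 24 bpp color image
# sent over a 28.8 Kb/sec modem, with and without 32:1 compression.
width, height, bpp = 512, 512, 24
modem_rate = 28_800                     # bits per second
cr = 32                                 # compression ratio

raw_bits = width * height * bpp         # 6,291,456 bits
compressed_bits = raw_bits / cr

print(raw_bits / 8 / 1024)              # 768.0 KiB uncompressed
print(raw_bits / modem_rate)            # about 218 seconds to send uncompressed
print(compressed_bits / modem_rate)     # about 6.8 seconds at 32:1
```

The transmission time shrinks by exactly the compression ratio, which is why even modest ratios were decisive at modem-era bandwidths.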

1.2 Related Works

1. Colored Image Compression:
(a) [Shen,1997] proposed a wavelet-based color image coding algorithm, where data-rate scalability was achieved by using an embedded coding scheme. It was useful for video applications.
(b) A useful comparative study among 11 color transforms for color image coding was introduced by Hao [2000]. The results show that the DCT is the best among these color transforms.
(c) Al-Abudi [2002] proposed an efficient colored image compression system based on transform coding combined with adaptive block truncation coding methods. Exhaustive experimental results showed this method was efficient for real applications.
(d) A modified hybrid DCT-SVD image-coding system was proposed by [Wongsawat,2004] to encode both monochromatic and color images. The simulation results showed good image quality at low bit rates compared with the conventional hybrid DCT-SVD image-coding algorithm.

2. Wavelet-based Image Compression:
(a) A useful comparison among wavelet, fractal, and DCT methods was made by [Kramer,1997]. The results show that "pure" compression algorithms are better than hybrid compression schemes, and that fixed quantization does not improve the wavelet encoder.
(b) A rich survey of image coding methods, with performance testing, was introduced by [Egger,1999]. The results show that predictive methods are suited to lossless and low-compression applications, while transform-based coding schemes achieve higher compression ratios for lossy compression but suffer from blocking artifacts at high compression ratios; multiresolution approaches are suited for lossy as well as lossless compression.
(c) [Munteanu,1998] presented two wavelet-based (non-linear and integer-to-integer wavelet transform) compression techniques able to operate in lossless mode, using the lifting scheme. Experimental results show that these techniques provide better rate-distortion performance than the embedded wavelet zerotree (EZW) combined with the integer wavelet transform, with the advantage of supporting progressive refinement of the decompressed images.
(d) Cumming and Ting [2002] investigated the Wavelet Packet-based Embedded Block (WPEB) scheme for SAR data compression. Examples were given using RADARSAT data, which show that the compression performance was better than conventional wavelet methods and that visual image interpretation is acceptable at 1 bit per pixel (bpp).
(e) Pearlman et al. [2003] proposed an embedded, block-based, image wavelet transform coding algorithm of low complexity. Extensive comparisons with SPIHT and JPEG2000 show that the proposed algorithm was highly competitive in terms of compression efficiency.

3. Biorthogonal Wavelet Filters:
(a) Odegard and Burrus [1997] introduced a family of smooth, symmetric biorthogonal wavelet bases. Image compression examples applying these filters with the EZW compression algorithm showed that these basis functions perform better than the classical CDF 7/9 wavelet basis.
(b) A useful determination of the key mathematical features (order of approximation and regularity) of the LeGall 5/3 (Tab3/5) and Daubechies 9/7 (Tab7/9) filters was introduced by [Unser,2003]. Tab7/9 stands out because it is very close to orthonormal, but this turns out to be slightly detrimental to its asymptotic performance when compared to other wavelets with four vanishing moments.

4. Quantization:
(a) Przelaskowski et al. [1998] investigated several uniform scalar quantizers and found that they are typically near optimal for high bit-rate applications, but inefficient for low bit-rate applications.
(b) Gupta and Mutha [2003] introduced some modifications to wavelet transform and quantization methods. The experimental results show that the proposed algorithm was superior to existing image compression standards like JPEG.

5. Wavelet Coefficients Selection Methods:
(a) An image compression algorithm called embedded block coding with optimized truncation (EBCOT) was proposed by [Taubman,2000]. The algorithm has little complexity, and the experimental results showed it is suitable for applications involving remote browsing of large compressed images.
(b) Rajpoot et al. [2003] presented a general zerotree structure based on applying an adaptive wavelet transform to encode the transform coefficients and to progressively encode the image. The results showed that this adaptive wavelet zerotree image coder has relatively low computational complexity, performance comparable with state-of-the-art image coders, and the capability of progressively encoding images.
(c) An embedded block-based image wavelet transform coding algorithm of low complexity, called 3D-SPECK, which provides progressive transmission, was proposed by Tang [2004]. The results show that this algorithm was better than JPEG2000 in compression efficiency.
(d) Sudhakar et al. [2005] presented very rich information about wavelet coefficient encoding algorithms. The presentation shows that the performance of each scheme depends on the coded data, because each scheme has its own characteristics.

6. Progressive Transmission of Encoded Data:
(a) Some solutions to the problem of transmitting images across noisy channels were introduced by Cosman et al. [2000]. The results show that the hybrid coder performs well across all channel conditions and degrades gracefully under the most severe conditions.
(b) Boulgouris et al. [2003] proposed a scheme that can decode portions of the bitstream even after the occurrence of uncorrectable errors. This scheme was compared with other robust image coders, and it was shown to be suitable for transmission of images over memoryless channels.

7. Standard JPEG2000:
(a) Useful demonstrations of the JPEG2000 standardization process were provided by [Grosbois,2001], [Skodras,2001], [Marcellin,2000], [Gormish,2000], and [Boliek,2000].
(b) A proposal for some extensions to the JPEG2000 standard, which allow the efficient coding of floating-point data, is found in [Usevitch,2003]; see also [Rountree,2003] for more information about JPEG2000.

1.3 Research Work Objective
The main objective of the current research work is to establish a simple and efficient image compression scheme. The established scheme should have low complexity, high performance (low bit rate, high image quality), and progressive capability. In this thesis the concern is mainly focused on the traditional relation between the compression ratio (achieved by the wavelet transform), the reconstructed image quality, and the effect of the control parameters on this relation. Different standard images have been used as test images, especially those that have highly textural regions, because the core of this research work is dedicated to applying an adaptive coding scheme to the very busy regions in the image.

1.4 Hypothesis
This thesis investigates the validity of the following hypotheses:
1. The energy of the Difference Pulse Code Modulation (DPCM) output of the image is worth considering as a criterion to pre-assess the compression system performance.
2. The raw image data must be arranged in a suitable way before sending the image to the lossless coder.
3. The wavelet transform can efficiently decompose the image signal. This ability enables the lossless coder to work more efficiently after the wavelet transform.
4. A hierarchical quantizer is powerful when applied to the wavelet coefficients.
5. For highly textured or smooth images, an encoder based on selection of significant wavelet coefficients is better than direct wavelet coefficient coding.
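Hypothesis 1 can be sketched as follows. This is a minimal illustration of using DPCM residual energy to pre-assess compressibility, with invented test images; it is not the thesis's actual procedure.

```python
import numpy as np

def dpcm_residual_energy(img):
    """Mean energy of a first-order horizontal DPCM residual
    (previous-pixel predictor): e[i, j] = x[i, j] - x[i, j-1]."""
    residual = np.diff(img.astype(float), axis=1)
    return float(np.mean(residual ** 2))

# Invented test images: a slowly varying ramp vs. uniform noise.
ramp = np.tile(np.linspace(0, 255, 128), (128, 1))
rng = np.random.default_rng(1)
busy = rng.integers(0, 256, size=(128, 128))

print(dpcm_residual_energy(ramp))   # small: smooth image, likely to compress well
print(dpcm_residual_energy(busy))   # large: busy image, likely to compress poorly
```

A low residual energy means the previous-pixel predictor already explains most of the signal, which is a cheap proxy for how well any correlation-exploiting coder will do.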

1.5 Thesis Layout
The remainder of this thesis is organized as follows:
 Chapter 2 (Theory Fundamentals) illustrates the theoretical disciplines related to color models/transforms, image lossless/lossy coding, wavelet methods, and entropy coding (Huffman and LZW). Also, a survey of some image coding methods is given; the focus is on image compression methods that have had a major impact on the development of wavelet image compression.
 Chapter 3 (Proposed Image Compression System) presents the steps implemented to perform the suggested image coding methods; their performance and efficiency are investigated. This chapter also illustrates how the considered coding methods are exploited in the proposed image coding system.
 Chapter 4 (System Performance Evaluation) focuses on colored image compression based on the wavelet transform, and describes the original work of this thesis and its purpose. To evaluate the compression results of the tests conducted on some standard real images, comparisons are presented between these results and the corresponding results obtained by using some image compression standards (JPEG and JPEG2000). Also, the wavelet coefficients selection problem is investigated, to understand why and how the criteria of wavelet coefficient selection affect the reconstructed image quality and compression ratios.
 Chapter 5 (Conclusions and Future Work) discusses the presented test results, with the essential analysis of these results. Some useful suggestions and solutions to handle the observed weak aspects are proposed, along with suggestions for future work and improvements.


Chapter Two
Theory Fundamentals

2.1 Introduction
The objective of image compression is to reduce both the spatial and spectral redundancy of the image data, in order to store or transmit the data in an efficient form. Some applications demand both lossy and lossless compression, or even progressive compression from lossy to lossless. Lossy compression methods usually introduce compression artifacts near sharp edges; they are suitable for natural images such as photos or medical imagery [Chen, 2004a]. The current research work is devoted to investigating lossy color image compression with a low-complexity and efficient system. Its performance will be evaluated by comparing its results with the coding results of some commonly used image compression standards (i.e., JPEG and JPEG2000).
JPEG2000 is one of the modern standards for still image compression. It provides a wide range of functionalities for still image applications, like the Internet, color facsimile, printing, scanning, digital photography, remote sensing, mobile applications, medical imagery, digital libraries, and E-commerce. Comparative results have shown that JPEG2000 is indeed superior to some other established still image compression standards [Skodras,2001]. One of the principal advantages of JPEG2000 over previous compression methods is the ability to perform both lossless and lossy compression with a single algorithm. Lossless compression is obtained by using a reversible integer Tab3/5 wavelet filter, followed by the entropy coder. No explicit quantization is used, but lossy compression can be obtained from a losslessly encoded file by truncating the bitstream at the appropriate point. This unique capability is important for database management, where the archive is lossless but a client may require only a lower quality image [Rountree, 2003].
If an image is compressed, it clearly needs to be uncompressed (decoded) before it can be viewed. Lossless compression frequently involves some form of entropy encoding based on information-theoretic techniques, while lossy compression uses source encoding techniques that may involve transform coding, differential coding, or vector quantization, as shown in Figure (2.1). Transform coding schemes achieve higher compression ratios for lossy compression, but suffer from some artifacts at high compression ratios. Multiresolution approaches are suited for lossy as well as lossless compression. At high lossy compression ratios, the typical artifact visible in the reconstructed images is the ringing effect [Egger, 1999].

[Tree diagram: Coding Techniques divides into Source Coding and Entropy Coding. Source Coding branches into Transform Coding (FFT, DCT, DWT), Differential Coding (DPCM, ADPCM), and Vector Quantization. Entropy Coding branches into Repetitive Sequence Suppression (Zero Length Suppression, Run Length Encoding) and Statistical Encoding (Pattern Substitution, Shannon-Fano, Huffman Coding).]

Fig. (2.1) Classification of the Coding Techniques [Marshall,2001].
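The reversible integer Tab3/5 (LeGall 5/3) filter mentioned in the JPEG2000 discussion above can be sketched with the lifting scheme. The sketch below is a simplified illustration: it is one-dimensional, assumes an even-length signal, and uses replicated-boundary extension rather than the standard's exact mirror rule, but it demonstrates the key property that integer lifting is exactly invertible, so no information is lost.

```python
import numpy as np

def fwd53(x):
    """One level of the integer LeGall 5/3 lifting transform
    (1-D, even-length input), with replicated boundary extension."""
    x = np.asarray(x, dtype=np.int64)
    even, odd = x[0::2].copy(), x[1::2].copy()
    # Predict step: detail = odd - floor((left even + right even) / 2)
    right = np.append(even[1:], even[-1])
    d = odd - (even + right) // 2
    # Update step: approx = even + floor((left d + right d + 2) / 4)
    left = np.insert(d[:-1], 0, d[0])
    a = even + (left + d + 2) // 4
    return a, d

def inv53(a, d):
    """Exact inverse: undo the update step, then the predict step."""
    left = np.insert(d[:-1], 0, d[0])
    even = a - (left + d + 2) // 4
    right = np.append(even[1:], even[-1])
    odd = d + (even + right) // 2
    x = np.empty(even.size + odd.size, dtype=np.int64)
    x[0::2], x[1::2] = even, odd
    return x

rng = np.random.default_rng(2)
x = rng.integers(0, 256, size=64)
a, d = fwd53(x)
print(np.array_equal(inv53(a, d), x))   # True: perfect reconstruction in integer arithmetic
```

Because each lifting step only adds an integer function of the other half of the samples, the inverse simply subtracts the same quantity; this is why lossy JPEG2000 coding can be obtained by truncating a losslessly coded bitstream rather than by a separate algorithm.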


2.2 Lossless Compression
Lossless compression is a descriptive term for encoding and decoding processes in which the output of the decoding process is identical to the input of the encoding process. Distortion-free restoration can be assured; lossless processes require reversible systems [Boliek, 2000]. A lossless color image compression method has to employ a reversible color transform (RCT), or use no color transform at all (i.e., work directly on RGB) [Chen, 2004a]. Lossless compression schemes can be crudely classified as follows [Clunie,2000]:
1. Predictive schemes with statistical modeling, in which differences between pixels and their neighbors are computed and their context modeled prior to coding.
2. Transform-based coding, in which images are transformed into the frequency or wavelet domain prior to modeling and coding.
3. Dictionary-based schemes (like Huffman and LZW), in which the most probable symbols are replaced with shorter codewords.
In lossless methods, no information is lost; every bit of the original image can be recovered. If an image has no more than 256 different color triples in it, then a color table can exactly recreate it. Lossless compression techniques permit the perfect reconstruction of the original image, but the achievable compression ratios are only about (2:1) to (4:1), depending on the image characteristics [Munteanu,1998]. Some lossless methods were investigated and utilized in the current research work.
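A toy demonstration (not from the thesis) of classification scheme 1 above: differencing neighboring pixels concentrates the values near zero, so the residuals have a lower zeroth-order entropy than the raw pixels and are cheaper for an entropy coder to represent, while remaining perfectly invertible. The test image is invented for the example.

```python
import numpy as np

def entropy_bits(values):
    """Zeroth-order entropy (bits/symbol) of an integer array."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

# Invented smooth test image with pixel values in 0..255.
x = np.linspace(0.0, 1.0, 256)
img = np.rint(np.outer(np.sin(np.pi * x), np.sin(np.pi * x)) * 255).astype(np.int64)

residual = np.diff(img, axis=1)   # previous-pixel prediction residual

print(entropy_bits(img.ravel()))        # entropy of the raw pixels
print(entropy_bits(residual.ravel()))   # lower: residuals cluster near zero
```

The entropy gap is exactly the saving a Huffman or arithmetic coder can capture after prediction, with no information loss since the differencing is reversible.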

2.3 Lossy Compression
A lossy data compression method may cause some differences between the retrieved data (after decompression) and the original data, but the two should be "close enough" to be useful in some way. This type of compression is frequently used on the Internet and in streaming media and digital telephony applications. Almost all practically used algorithms adopt a quantization stage, which makes them lossy, in order to achieve the desired compression ratio [Makkapati,2003]. In general, lossy compression schemes can be categorized into two basic branches: predictive (differential) coding and transform coding (TC). Both coding schemes utilize inter-pixel correlation, and both are efficient. In comparison with transform coding, predictive coding (like DPCM) is simpler in computation; it needs less storage and has less processing delay, but it is more sensitive to image variations. Transform coding, on the other hand, provides higher adaptation to image statistics variation; it is capable of removing more inter-pixel correlation, thus providing higher coding efficiency. Traditionally, predictive coding is preferred if the bit rate is within the range of two to three bits per pixel, while transform coding is preferred when the bit rate is below two bits per pixel. However, the situation has changed: nowadays transform coding is the core of most families of image and video coding schemes [Shi, 2000]. Modern coding schemes are based on combining these two branches; the combined scheme is called the hybrid (transform/waveform) coding method, or simply hybrid coding. This method was introduced in order to combine the merits of the two approaches. By waveform coding we mean coding techniques that code the waveform of a signal instead of the transformed signal; DPCM is a waveform coding technique. Hybrid coding combines TC and DPCM coding (i.e., transform coding can be first applied row-wise, followed by DPCM coding column-wise, or vice versa). In this way, the two techniques complement each other: the hybrid coding technique simultaneously has TC's small sensitivity to variable image statistics and DPCM's simplicity in implementation [Shi, 2000]. Lossy methods are most often used for compressing sound or images.
In such cases, the retrieved file can be quite different from the original at the bit level, while this difference should be indistinguishable to the human ear or eye for most practical purposes. Many methods focus on the idiosyncrasies of the human anatomy, taking into

account, for example, that the human eye can see only certain frequencies of light. The psychovisual model describes how an image can be highly compressed without degrading its perceived quality. Flaws caused by lossy compression that are noticeable to the human eye or ear are known as compression artifacts.

2.4 Color Principles
The human eye cannot see all radiant energy waves (radio waves, for example). The small range of frequencies that can be sensed is called the visible spectrum. Light waves are measured in wavelengths or frequencies, from which the color is determined. Wavelengths of visible light range from 380 nm to 760 nm; see Figures (2.2 and 2.3) [Dunn,1999]. Most colors can be recreated using combinations of pure red, green, and blue. The graph shown in Figure (2.4) presents the spectral response functions of the red, blue, and green cones in the eye and the luminous efficiency function of the eye at those colors.

Fig. (2.2) Electromagnetic spectrum.


Fig. (2.3) The relative amounts of the RGB primaries to recreate the colors of the visible spectrum [Dunn,1999].

Fig. (2.4) Spectral response [Dunn,1999]

2.5 Color Space
A color space is a mathematical representation of a set of colors, and it specifies how color information is represented. The problem of color representation affects almost every field in computer vision, and many ways have been suggested for modeling and representing colors. The four most popular color models used in this research are RGB (used in computer graphics) and YUV, YIQ and YCbCr (used in video systems). Most color space models in use today are oriented toward either hardware or applications in which color manipulation is a goal. Table (2.1) summarizes the most popular color space models and some of their applications [Kim, 2005].


Table (2.1) Color Space Models and their Applications.

  Category                                 Color space models      Applications
  Hardware-oriented, non-uniform spaces    RGB, YIQ, YUV, YCbCr    storage, processing, analysis,
                                                                   coding, color TV broadcasting
  Hardware-oriented, uniform spaces        L*a*b*, L*u*v*          color difference analysis,
                                                                   color management systems
  Application-oriented                     HSI, HSV, LHS           color image manipulations,
                                                                   computer graphics

2.5.1 RGB Color Space
RGB is perhaps the simplest color space to understand; it is represented as a cube, see Figure (2.5). Besides the RGB system, which is the one most commonly used in computer graphics, there are many other color spaces. Many of them are derived by applying linear functions of R, G, and B. For example, a color space based upon three coordinates V1, V2, and V3 can be written as [Bourke, 2000]:

  | V1 |   | ar1  ag1  ab1 |   | R |
  | V2 | = | ar2  ag2  ab2 | × | G | ,..........................................(2.1)
  | V3 |   | ar3  ag3  ab3 |   | B |

where the ar, ag, and ab entries are constants for a particular color space. Similarly, for any such system there are linear functions to go back to RGB space. The coefficients dr, dg, and db can be derived by solving the above equations for (R,G,B) to get:

  | R |   | dr1  dg1  db1 |   | V1 |
  | G | = | dr2  dg2  db2 | × | V2 | ,..........................................(2.2)
  | B |   | dr3  dg3  db3 |   | V3 |


However, the RGB color space is not considered an appropriate representation for color, due to the strong correlation between its three coordinates [Omer,2003]. To generate color in the RGB format, all three color components should be of equal bandwidth, which requires more storage space and bandwidth. Also, processing an image in the RGB space is more complex, since any change in the color of any pixel requires all three RGB values to be read, calculations performed, and the results stored. Therefore, if the color information is stored in an intensity-and-color format, some of the processing steps can be made faster [Payette, 2002].

Fig. (2.5) RGB Color Cube.

2.5.2 YUV Color Space
YUV is a way of representing color in terms of a Y component (the brightness, or luminance) and U, V components (which define the chrominance). Since the human eye is most sensitive to brightness and less sensitive to spatial variations of color and saturation, the color resolution and the spatial resolution of U and V can be reduced with little visual impact on the viewer, providing quick compression simply by removing a lot of redundant information. The equations to go from RGB to YUV are [Hatizler,2003]:

  | Y |   |  0.299   0.587   0.114 |   | R |
  | U | = | -0.147  -0.289   0.436 | × | G | ,...............................………...(2.3)
  | V |   |  0.615  -0.515  -0.100 |   | B |

2.5.3 YIQ Color Space
YIQ (the NTSC variant of YUV) is a television transmission color space (analogue NTSC and PAL) adopted by the (American) National Television System Committee for color TV broadcasting. Y is the luminance component (it comes from the CIE standard), while I and Q are the chrominance components ("I" stands for "In-phase" and "Q" for "Quadrature", after the modulation method used to transmit the color information). The YIQ color solid is made by a linear transformation of the RGB cube. Its purpose is to exploit certain characteristics of the human eye in order to maximize the utilization of a fixed bandwidth. The NTSC space corresponds to the color encoding used for color broadcasting in the USA, whereas the PAL space corresponds to the color encoding used in Europe. NTSC and PAL have different screen resolutions and frequencies, and are otherwise incompatible; however, in terms of how color values are calculated, both are identical to the YIQ space. The basic equations for conversion between RGB and YIQ are [Shi, 2004]:

  | Y |   | 0.299   0.587   0.114 |   | R |
  | I | = | 0.596  -0.274  -0.322 | × | G | ,……...............……………(2.4)
  | Q |   | 0.211  -0.523   0.312 |   | B |

2.5.4 YCbCr Color Space
YCbCr is a perceptual color space that is widely used in the image and video processing community. The conversion of RGB to YCbCr is necessary for further color image and video processing. Experimental results indicate that the luminance-chrominance (YCbCr) space outperforms several well-known color models (e.g., RGB, HSV, and L*a*b*) for all test images, taking into consideration that the human eye is less sensitive to high frequencies in chrominance [Kim,2005].


Since the eye is less sensitive to Cb and Cr, engineers do not need to transmit them at the same rate as Y, so less storage and bandwidth are needed, reducing design cost. YCbCr is adopted by JPEG and JPEG2000 [Christopoulos,2000]. The basic equations to convert from RGB to YCbCr are described in matrix form as follows [Shi, 2004]:

  | Y  |   |  0.257   0.504   0.098 |   | R |
  | Cb | = | -0.148  -0.291   0.439 | × | G | ,...................………………(2.5)
  | Cr |   |  0.439  -0.368  -0.071 |   | B |
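As a small illustration (not code from the thesis), the matrix of Eq. (2.5) can be applied directly per pixel. The offsets of +16 for Y and +128 for Cb/Cr that normally accompany these scaled BT.601 coefficients are an assumption added here, since the equation above shows only the matrix part:

```python
def rgb_to_ycbcr(r, g, b):
    """RGB -> YCbCr using the scaled BT.601 matrix of Eq. (2.5).
    The usual 8-bit offsets (+16 for Y, +128 for Cb/Cr) are assumed;
    the equation in the text shows only the matrix multiplication."""
    y  =  0.257 * r + 0.504 * g + 0.098 * b + 16
    cb = -0.148 * r - 0.291 * g + 0.439 * b + 128
    cr =  0.439 * r - 0.368 * g - 0.071 * b + 128
    return y, cb, cr
```

For a white pixel (255, 255, 255) this yields Y ≈ 235 and Cb = Cr = 128, the neutral chroma value, which is why the chrominance planes of low-saturation images are so cheap to code.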

2.6 Color Image Coding
Color images are everywhere in science, technology, medicine and industry. Color images are acquired and reproduced using tristimulus (triplet) values whose spectral composition is carefully chosen according to the principles of color science. Color space transforms are critical for color feature extraction and data redundancy reduction [Chen, 2004b]. The color planes (like RGB) are usually highly correlated, so a transformation to a less correlated space is mandatory for efficient compression [Pearlman,2003]. Most visual coding schemes first compress the luminance image component, and then extend coding to the color components. They use color spaces with luminance and chrominance channels, where the latter may easily be down-sampled, since they carry much less information than the luminance component. Moreover, in the classical coding paradigm, decorrelation between channels is generally seen as an advantage. However, when a similar strategy is applied to scalable color image coding, it often causes colors to appear only after a certain time, or with important distortion [Ventura,2003]. A simple application of set-partitioning embedded block coding (SPEBC) to a color image is to code each color plane separately, as a conventional color image coder does. The generated bit-stream of each plane would

be serially concatenated. However, this simple method would require bit allocation among color components, losing precise rate control, and would fail to meet the requirement of full embeddedness of the image coder, since the decoder needs to wait until the full bit-stream arrives before it can reconstruct and display [Pearlman,2003]. In this thesis, simple coding schemes with low complexity were proposed to encode color images.

2.7 Image Zooming (Resampling)
Image zooming may be viewed as over-sampling, while shrinking may be viewed as under-sampling. The key difference between these two operations lies in the sampling and quantization operations applied to the original continuous image [Gonzalez,2002]. In this work, a shrinking process based on the wavelet transform was performed on the two chrominance components (i.e., I,Q or Cb,Cr) to get a high compression ratio while preserving the quality.

2.8 Differential Coding
The simplest methods for lossless image coding are based on the differential pulse code modulation (DPCM) encoding method, where no information loss is allowed during compression. For all types of images, direct coding using an entropy coder does not achieve any considerable degree of compression; therefore, all coding techniques that appeared in the literature employ coding methods based on information decorrelation. One possible way is to use differential models to achieve this goal. The idea behind the DPCM method is to encode the value of the difference between the previously encoded pixel and the current pixel. Due to the correlation present in natural images, the resulting values to be encoded have a lower dynamic range than the original values [Egger,1998]. In a DPCM scheme one tries to predict a value Px,y (the original pixel) based on the coded past. The difference between the current pixel Px,y and the previous pixel Px,y-1 is given by:

Dx,y = Px,y - Px,y-1 ,..........................................................................(2.6)
If successive pixels are close to each other, only the first pixel needs to be encoded with a large number of bits, while the other pixels are coded by their difference values. To achieve high performance, the following ideas have been adopted in this thesis:
1. The direction of calculating the differences is zig-zag (inverted from line to line), as shown in Figure (2.6).
2. The orientation of differencing may be horizontal, vertical, or diagonal; the choice of orientation should depend on the direction of lowest gradient energy (i.e., the direction of the most energetic edges). The energy of the three principal edge directions (horizontal, vertical, and diagonal) is calculated, and the direction with the lowest energy is chosen as the best differencing direction. This procedure is followed in our research work to obtain a lower dynamic range of the pixel difference values.

Fig. (2.6) Direction of horizontal differencing calculation.
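A minimal sketch of the zig-zag DPCM of Eq. (2.6) and Figure (2.6) (illustrative only; the function names are not from the thesis). The scan direction is inverted on every second row, so consecutive differences always come from spatially adjacent pixels:

```python
def dpcm_encode(image):
    """Zig-zag horizontal DPCM: difference each pixel against its
    predecessor along a serpentine scan (Fig. 2.6).
    `image` is a list of equal-length rows of integers."""
    diffs = []
    prev = image[0][0]                            # seed pixel
    for y, row in enumerate(image):
        scan = row if y % 2 == 0 else row[::-1]   # invert odd rows
        out = []
        for p in scan:
            out.append(p - prev)
            prev = p
        diffs.append(out)
    diffs[0][0] = image[0][0]                     # store the seed itself
    return diffs

def dpcm_decode(diffs):
    """Invert the zig-zag DPCM: accumulate the differences along the
    same serpentine scan order used by the encoder."""
    rows = []
    prev = diffs[0][0]                            # seed pixel
    for y, drow in enumerate(diffs):
        out = []
        for i, d in enumerate(drow):
            if y == 0 and i == 0:
                out.append(prev)                  # seed copied verbatim
            else:
                prev = prev + d
                out.append(prev)
        rows.append(out if y % 2 == 0 else out[::-1])  # un-invert odd rows
    return rows
```

On smooth image regions the difference lists consist mostly of small values near zero, which is exactly the lowered dynamic range the lossless entropy coder exploits.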


The lossless mode of the joint photographic expert group (JPEG) compression standard is a predictive scheme. Seven different prediction methods have been defined in the JPEG standard. All seven are based on a prediction of the next input from up to three previously encoded local neighbors. It was noticed that none of these prediction methods clearly outperforms the others in all applications [Egger,1998].
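The seven selection values of the JPEG lossless mode fit in a few lines; this is a sketch (not thesis code), with Python's integer division standing in for the standard's integer arithmetic:

```python
def jpeg_predict(a, b, c, mode):
    """The seven JPEG lossless predictors: a = left neighbour,
    b = above, c = above-left. Boundary fallbacks (first row/column)
    defined by the standard are omitted in this sketch."""
    return {1: a,
            2: b,
            3: c,
            4: a + b - c,                # plane prediction
            5: a + (b - c) // 2,
            6: b + (a - c) // 2,
            7: (a + b) // 2}[mode]
```

For a gentle gradient (a=10, b=12, c=11) predictors 4 and 7 both guess 11, so the residual to be entropy coded stays small whichever mode is selected.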

2.9 Transform Coding
Transform coding is based on the idea of representing the pixels to be compressed in an alternate domain (e.g., space-scale instead of cartesian space for an image). Then, a rate-distortion (R-D) optimization is performed in the transform domain. By design, this transform should have two properties for the class of pixels to be compressed. First, the transform should concentrate the majority of the pixel energy into a small number of coefficients (thus having minimal distortion at low rates). Second, the transform should result in de-correlated coefficients; this allows a simple scalar quantizer to have performance near that of a more complicated vector quantizer [Durant,1998].

2.10 Wavelet Transform Theory
The main application of wavelet theory lies in the design of filters for sub-band coding. Sub-band coding is a coding strategy that tries to isolate different characteristics of a signal in a way that collects the signal energy into few components; this is referred to as energy compaction. Energy compaction is desirable because it is easier to efficiently encode these components than the signal itself [Ramakrishnan,2002]. In the wavelet literature, the name scaling function denotes the function used to find the low-low subband, while the name wavelet function denotes the function used to find the low-high, high-low and high-high subbands (the wavelet detail coefficients). Wavelet decompositions are efficient for a large number of common image properties such as gentle background gradients (scale localization) and edges (spatial localization) [Ortega,1998]. For many images, energy is concentrated in the lower scale sub-images. Also, there is normally redundancy between sub-images (both within a scale and between scales) which can be exploited by algorithms such as those using Shapiro's zerotree [Shapiro,1993]. In general, the wavelet transform provides another solution for reducing the blocking artifact. It is obtained by successively iterating a 2-channel filter bank on its low-pass output. As a result, it generates an octave-band filter bank with longer low frequency basis functions and shorter high frequency basis functions. Therefore, when the wavelet transform is applied to the whole image, the blocking artifact can be eliminated satisfactorily [Liang,2003]. Also, the discrete orthogonal variants with finitely supported basis functions have linear computational complexity O(N),

while that of DCT or other Fourier-related

transforms is O(N log N). The reason for the optimal complexity of the former is the applicability of multiresolution methods: the wavelet transform can be implemented in terms of a hierarchical application of finite impulse response (FIR) filter banks together with down-sampling of the low frequency parts [Kutil,2004].

2.10.1 Wavelet Filter Properties and Decomposition
Several properties are desired in any wavelet basis for images. Some of the most important properties are [Antonini,1992]:
1. A smooth reconstruction filter (since smooth areas dominate most images).
2. Short filters (for efficient computation).
3. Linear phase filters (which lead to simpler cascade structures).

The most commonly used implementation of the discrete wavelet transform (DWT) is based on MALLAT's pyramid algorithm [Savaton,2000]. It consists of recursive application of a low-pass/high-pass one-dimensional filter bank successively along the horizontal and vertical directions of the image (see Figure 2.7). The low-pass filter provides the smooth approximation coefficients, while the high-pass filter extracts the detail coefficients at a given resolution. Each filtering step is followed by a sub-sampling operation, corresponding to the decrease of resolution from one transformation level to the next. After applying the 2-D filter bank at a given level n, the detail coefficients are output, while the whole filter bank is applied again to the approximation image until the desired maximum resolution is reached. Figure (2.8) shows the usual representation of a three-level wavelet decomposition of an image. The DWT gives us three parts of a multiresolution representation (MRR) and one part of a multiresolution approximation (MRA) [Merchant,2003]. It is similar to a hierarchical subband system, where the sub-bands are logarithmically spaced in frequency. The sub-bands labeled LH1, HL1, and HH1 of the MRR represent the finest scale wavelet coefficients. To obtain the next coarser scale of wavelet coefficients, the subband LL1 (that is, the MRA) is further decomposed and critically subsampled.

[Figure: the approximation LLn-1 is filtered by a 1-D DWT along the X axis and then a 1-D DWT along the Y axis, producing the sub-bands LLn, LHn, HLn and HHn.]

Fig. (2.7) 2D DWT Using MALLAT's Filter Bank [Savaton,2000].

Fig. (2.8) DWT decomposition of an image.

The sub-bands are labeled using the following conventions:
1. LLn is the approximation image at resolution (decomposition level) n, resulting from low-pass filtering in the vertical and horizontal directions.
2. LHn represents the horizontal details at resolution n, and results from horizontal low-pass filtering and vertical high-pass filtering.
3. HLn represents the vertical details at resolution n, and results from vertical low-pass filtering and horizontal high-pass filtering.
4. HHn represents the diagonal details at resolution n, and results from high-pass filtering in both directions.
Wavelet compression is not good for all kinds of data: transient pixel characteristics lend themselves to good wavelet compression, while smooth and/or periodic pixels are better compressed by other methods. Some recently published research aims to combine the discrete cosine transform with a wavelet transform to produce a hybrid compression method that would be good for representing a mixture of smooth signals and transients (as exhibited by most popular music, and by images that contain high-contrast areas) [Ortega,1998].


First a wavelet transform is applied. This produces as many coefficients as there are pixels in the image (i.e., up to this stage it is only a transform, with no compression). The coefficients can then be compressed more easily because the information is statistically concentrated in just a few of them; this principle is called transform coding. After that, the coefficients are quantized, and the quantized values are entropy encoded and/or run-length encoded; examples of wavelet-based compression are JPEG 2000 and SPIHT. To perform the forward DWT, a one-dimensional subband is decomposed into a set of low-pass samples and a set of high-pass samples. The low-pass samples represent a smaller, low-resolution version of the original. The high-pass samples represent a smaller residual version of the original, which is needed for a perfect reconstruction of the original set from the low-pass set. In orthogonal wavelet analysis, the number of convolutions at each scale is proportional to the width of the wavelet basis at that scale. This produces a wavelet spectrum that contains discrete "blocks" of wavelet power and is useful for signal processing, as it gives the most compact representation of the signal. Conversely, a nonorthogonal (biorthogonal) analysis is highly redundant at large scales, where the wavelet spectrum at adjacent times is highly correlated. The nonorthogonal transform is useful for time series analysis, where smooth, continuous variations in wavelet amplitude are expected [Torrence,1997]. In the well-known wavelet multiresolution decompositions, the coefficients of the filters are fixed; they are not calculated to optimally fit the image data. Moreover, most lossless image compression algorithms used today do not use multiresolution decomposition; they are based instead on image context-based predictors [Weinberger,2000]. In the present research work some popular wavelet filters were adopted.
The theoretical idea behind these filters is shown in the next sections.


2.10.2 Harr Wavelet Filter
Consider two neighboring samples a and b of a sequence of numbers that have some correlation. The simple wavelet transform which replaces a and b by their average s and difference d is done according to the following equations [Sweldens,1996]:

s = (a + b)/2 ,……………………………………………….……….(2.7)
d = b - a ,………………………………………………………...(2.8)

The idea is that if a and b are highly correlated, the expected absolute value of their difference d will be small and can be represented with fewer bits. In the case a = b, the difference is simply zero. No information is lost, because given s and d one can always recover a and b as follows [Sweldens,1996]:

a = s - d/2 ,…………………………………………………………(2.9)
b = s + d/2 ,………………….……………….…………………….(2.10)

The above two equations are the key behind the so-called Harr wavelet transform. The whole Harr transform can be thought of as applying an N×N matrix (N = 2^n) to the signal. The cost of computing the transform is only proportional to N, whereas a general linear transformation of an N-vector requires O(N^2) operations. The hierarchical structure of the wavelet transform allows switching to and from the wavelet representation in O(N) time [Sweldens,1996].
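Eqs. (2.7)-(2.10) translate directly into code; the following is a one-level sketch (not thesis code) for an even-length sequence:

```python
def haar_forward(x):
    """One level of the Harr transform, Eqs. (2.7)-(2.8): pairwise
    averages s and differences d of an even-length sequence x."""
    s = [(a + b) / 2 for a, b in zip(x[0::2], x[1::2])]
    d = [b - a for a, b in zip(x[0::2], x[1::2])]
    return s, d

def haar_inverse(s, d):
    """Recover the signal via Eqs. (2.9)-(2.10): a = s - d/2, b = s + d/2."""
    x = []
    for si, di in zip(s, d):
        x.extend([si - di / 2, si + di / 2])
    return x
```

Iterating haar_forward on the s list yields the multi-level pyramid described in section 2.10.1, with the d lists as the detail sub-bands.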

2.10.3 Biorthogonal Wavelet Filters
The LeGall 5/3 (also called Tab3/5, because the low-pass filter length is 5 and the high-pass filter length is 3) and the Daubechies 9/7 (also called Tab7/9, because the filter lengths are 9 and 7 for the low-pass and high-pass filters, respectively) have risen to


special prominence because they were selected as the kernel transforms in the JPEG2000 standard [Unser,2003]. In various papers, the biorthogonal Tab3/5 and Tab7/9 are also called CDF(2,2) and CDF(4,4) respectively, because they were presented by Cohen, Daubechies, and Feauveau [Shlager,2004]. In conclusion, the biorthogonal wavelet systems strike a good balance between regularity and reduced support [Dogaru,2001]. Biorthogonal wavelet decompositions are efficient for lossy and near-lossless image compression, hence they are used in the JPEG2000 standard [Bekkouche,2002]. Most of the filters used in wavelet transforms have floating point coefficients. Since the input images have integer entries, the filtered output no longer consists of integers, and losses will result from rounding. For lossless coding it is necessary to make a reversible mapping between the input integer image and its integer wavelet representation. Odegard and Burrus showed that an integer version of every wavelet transform employing finite filters can be built with a finite number of lifting steps [Odegard,1997]. In order to handle filtering at image boundaries, symmetric extension is used (illustrated in section 2.10.6). Symmetric extension adds mirror pixels outside the boundaries so that large errors are not introduced there. The default irreversible transform is implemented by means of the biorthogonal Tab7/9 filter, and the reversible one by means of the Tab3/5 filter. The coefficients for the Tab7/9 filters, which are used for the dyadic decomposition, are given in Table (2.2), and a graph of the corresponding wavelet is shown in Figure (2.9) [Christopoulos,2000].


Fig. (2.9) CDF(4,4) or Biorthogonal 7/9 Wavelet System, (a) Scaling Function, (b) Wavelet Function [Odegard,1997].

Table (2.2) Impulse response of the low-pass and high-pass synthesis filters for the Tab7/9 wavelet transform [Boliek, 2000].

  i                    Lp(i)                      Hp(i)
  0                     1.115087052456994          0.6029490182363579
  ±1                    0.5912717631142470        -0.2668641184428723
  ±2                   -0.05754352622849957       -0.07822326652898785
  ±3                   -0.09127176311424948        0.01686411844287495
  ±4                    0                          0.02674875741080976
  all other values      0                          0

2.10.4 Lifting Scheme The basic idea behind lifting is that it provides a simple relationship between all multiresolution analyses that share the same low pass filter or high pass filter. The low pass filter gives the coefficients of the refinement relation which entirely determines the scaling function [Daubechies,1996]. The lifting scheme was introduced by Sweldens [Sweldens,1995] in order to construct wavelet decompositions by a simple, reversible and fast process. Its main application is found in lossless image compression.


Daubechies and Sweldens showed that any biorthogonal wavelet decomposition with FIR (finite impulse response) filters can be represented by a lifting scheme, and hence all the well-known wavelets used in lossy image coders can be closely approximated by integer-to-integer wavelets [Calderbank,2001]. The main feature of lifting is that it provides an entirely spatial interpretation of the wavelet transform, as opposed to the more traditional Fourier-based constructions [Claypoole,1998]. Consider the 1-D DWT decomposition: it consists of a set of scaling coefficients C0[n] (i.e., the LL subband image) at scale j=0, and a set of wavelet coefficients Dj[n] (i.e., the LH, HL, HH subband images) at scales j=1,2,....,J. The forward DWT has an efficient implementation in terms of a recursive multiscale filter bank built around a low-pass filter h and a high-pass filter g. The inverse DWT employs an inverse filter bank with low-pass filter ĥ and high-pass filter ĝ [Claypoole,1998]. In the remaining part of this section a short demonstration of the lifting procedure is given; more details are found in [Claypoole,1998]. Lifting consists of the iteration of the following three basic operations [Claypoole,1998]:
1. Split: divide the original data into two disjoint subsets. For example, the original data set X[n] is divided into two sets: Xe[n]=X[2n], the even indexed points, and Xo[n]=X[2n+1], the odd indexed points.
2. Predict: generate the wavelet coefficients D[n] as the error in predicting Xo[n] from Xe[n] using the prediction operator P:
D[n]=Xo[n]-P(Xe[n]) ,……………………………………..…(2.11)
3. Update: combine Xe[n] and D[n] to obtain scaling coefficients C[n] that represent a coarse approximation to the original signal X[n]. This is accomplished by applying the update operator U to the wavelet coefficients and adding the result to Xe[n]:
C[n]=Xe[n]+U(D[n]) ,……………………………….…..…..(2.12)
Figure (2.10) shows the diagram of the lifting stage.


[Figure: the input X[n] is split into even samples Xe[n] and odd samples Xo[n]; the predict operator P forms D[n] by subtracting P(Xe[n]) from Xo[n], and the update operator U forms C[n] by adding U(D[n]) to Xe[n].]

Fig. (2.10) Lifting Stages [Sweldens,1996].

These three steps form the lifting stage. Iterating the lifting stage on the output C[n] creates the complete set of DWT scaling and wavelet coefficients Cj[n] and Dj[n]. The lifting steps are easily inverted, even if P and U are non-linear or space-varying. Rearranging (2.11) and (2.12), we have:
Xe[n]=C[n]-U(D[n]) ,…………………………………….……(2.13)
Xo[n]=D[n]+P(Xe[n]) ,……………..…………………..………(2.14)
The predictor and update designs are shown in more detail in [Claypoole,1998].
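The split/predict/update iteration of Eqs. (2.11)-(2.14) can be sketched generically (illustrative code, not from the thesis); P and U are passed in as functions, and the Harr choice P(xe) = xe, U(d) = d/2 from the next section is given as an example:

```python
def lifting_forward(x, predict, update):
    """One lifting stage (split / predict / update), Eqs. (2.11)-(2.12),
    for an even-length sequence x."""
    xe, xo = x[0::2], x[1::2]                        # split
    d = [o - predict(e) for e, o in zip(xe, xo)]     # predict: Eq. (2.11)
    c = [e + update(di) for e, di in zip(xe, d)]     # update:  Eq. (2.12)
    return c, d

def lifting_inverse(c, d, predict, update):
    """Undo the stage with Eqs. (2.13)-(2.14); works for any P and U."""
    xe = [ci - update(di) for ci, di in zip(c, d)]
    xo = [di + predict(e) for di, e in zip(d, xe)]
    x = []
    for e, o in zip(xe, xo):
        x.extend([e, o])
    return x

# Harr as a lifting scheme: P(xe) = xe, U(d) = d/2
haar_p = lambda e: e
haar_u = lambda d: d / 2
```

Because the inverse simply re-runs the same P and U with the signs flipped, reversibility holds even when the operators include rounding, which is exactly what integer-to-integer wavelets exploit.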

2.10.5 Harr and Lifting
This section demonstrates how to perform the Harr transform by lifting, as an introduction to the lifting ideas. The novelty lies in the way the difference and average of two numbers a and b are computed. Suppose that we need to compute the whole transform in-place (see Figure 2.12), without using auxiliary memory locations, by overwriting the locations that hold a and b with the values of s and d respectively. This cannot immediately be done with the

formulas given in the previous section (2.10.2); it would lead to the wrong result, since computing s first and overwriting a leads to a wrong d. Therefore an implementation in two steps is suggested. First, only compute the difference [Sweldens,1996]:

d = b - a ,……………..…………………………….……….(2.15)

and store it in the location of b. Then, the value of a and the newly computed difference d are used to find the average:

s = a + d/2 ,…………….…………………………….……….(2.16)

This gives the same result because a + d/2 = a + (b - a)/2 = (a + b)/2. The advantage of splitting into two steps is the ability to overwrite b with d and a with s, so that no auxiliary storage is needed. The above equations can thus be written as follows [Sweldens,1996]:

b = b - a ,…………..…………………………………………..(2.17)
a = a + b/2 ,…………………………………………………….(2.18)

where b now holds the difference and a the average, and the computations are done in-place. Moreover, the inverse can be found immediately, without formally solving a 2×2 system: simply run the above equations backwards (i.e., reverse the order and flip the signs). Assuming a holds the average and b the difference [Sweldens,1996]:

a = a - b/2 ,..……..…….……………………….………………....(2.19)
b = b + a ,……..……..……………………….…….………....(2.20)

These equations recover the values a and b in their original memory locations. This particular way of writing a transform is called the lifting scheme.
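The in-place computation of Eqs. (2.17)-(2.20) can be sketched as follows (illustrative, not thesis code); the array is overwritten so that even slots end up holding averages and odd slots differences:

```python
def haar_lift_forward(x):
    """In-place Harr lifting, Eqs. (2.17)-(2.18): odd slots end up
    holding differences, even slots averages (no scratch array)."""
    for i in range(0, len(x) - 1, 2):
        x[i + 1] = x[i + 1] - x[i]        # b = b - a   (difference)
        x[i]     = x[i] + x[i + 1] / 2    # a = a + b/2 (average)
    return x

def haar_lift_inverse(x):
    """Run the steps backwards with flipped signs, Eqs. (2.19)-(2.20)."""
    for i in range(0, len(x) - 1, 2):
        x[i]     = x[i] - x[i + 1] / 2    # a = a - b/2
        x[i + 1] = x[i + 1] + x[i]        # b = b + a
    return x
```

Note that the inverse is literally the forward pass with the order reversed and the signs flipped, which is the property the text highlights.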


2.10.6 Symmetric Extension
In order to handle filtering at image boundaries, symmetric extension can be utilized, where a row/column pixel vector P of the image is extended beyond its left and right boundaries into the extended pixel vector Pext; the 1-D wavelet filtering procedure is then applied on the extended pixels Pext to produce the desired filtered pixels D [Boliek, 2000]. Figure (2.11) shows the periodic symmetric extension, where the first sample of the pixel vector is P0 and the last sample is PW:

P2 P1 P0 P1 P2 …… PW-2 PW-1 PW PW-1 PW-2

Fig. (2.11) Periodic Symmetric Extension [Boliek, 2000].
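A minimal sketch of the extension of Fig. (2.11) (hypothetical helper, not from the thesis). The boundary samples P0 and PW are not duplicated; the reflection is about those samples themselves:

```python
def symmetric_extend(p, left, right):
    """Whole-sample symmetric (mirror) extension of Fig. (2.11):
    reflect `left` samples before p[0] and `right` samples after p[-1],
    excluding the boundary samples from the reflection."""
    head = p[1:left + 1][::-1]      # P1..Pleft mirrored to the left of P0
    tail = p[-right - 1:-1][::-1]   # PW-1..PW-right mirrored past PW
    return head + p + tail
```

For p = [P0, P1, P2, P3] with two samples on each side this yields P2 P1 P0 P1 P2 P3 P2 P1, matching the pattern in Fig. (2.11).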

The values of the extension parameters (xleft, xright) for the Tab3/5 and Tab7/9 transforms are given in Table (2.4).

Table (2.4) Symmetric extension parameters [Boliek, 2000].

  (a) To the left                          (b) To the right
  x      xleft Tab3/5   xleft Tab7/9       x      xright Tab3/5   xright Tab7/9
  even        2              4             odd          2               4
  odd         1              3             even         1               3

2.10.7 Biorthogonal Tab3/5 Filter
The reversible transformation described in this section is the lifting-based implementation of the Tab3/5 filter. This filter implies a sequence of simple filtering operations in which, alternately, odd pixel values are updated with a weighted sum of even pixel values rounded to an integer, and even pixel values are updated with a weighted sum of odd pixel values rounded to an integer [Boliek, 2000]. The odd coefficients of the output pixel vector C are computed first, for all values of x such that -1 ≤ 2x+1 ≤ W+1 [Boliek, 2000]:

C(2x+1) = Cext(2x+1) - ⌊(Cext(2x) + Cext(2x+2)) / 2⌋ ,…………….(2.21)

where W is the original image width. Then the even coefficients of the output signal C are computed from the even values of the extended pixel vector Cext and the odd coefficients of C, for all values of x such that 0 ≤ 2x ≤ W [Boliek, 2000]:

C(2x) = Cext(2x) + ⌊(Cext(2x-1) + C(2x+1) + 2) / 4⌋ ,…….………….(2.22)

The values of C(k), for 0 ≤ k ≤ W, form the output of the 1-D filtering procedure. This procedure is applied in both directions (vertical and horizontal) to get the final transformed image.
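The 1-D procedure of eqs (2.21)-(2.22) can be sketched as follows (a simplified sketch that processes interior samples only, omitting the symmetric extension at the boundaries; function names are illustrative):

```python
def dwt_53_forward(c):
    c = list(c)
    n = len(c)
    # Eq (2.21): odd samples become high-pass coefficients
    for i in range(1, n - 1, 2):
        c[i] -= (c[i - 1] + c[i + 1]) // 2
    # Eq (2.22): even samples become low-pass coefficients
    for i in range(2, n - 1, 2):
        c[i] += (c[i - 1] + c[i + 1] + 2) // 4
    return c

def dwt_53_inverse(c):
    # Undo the lifting steps in reverse order with flipped signs
    c = list(c)
    n = len(c)
    for i in range(2, n - 1, 2):
        c[i] -= (c[i - 1] + c[i + 1] + 2) // 4
    for i in range(1, n - 1, 2):
        c[i] += (c[i - 1] + c[i + 1]) // 2
    return c
```

Since all operations use integer arithmetic, the transform is exactly reversible, which is what makes the Tab3/5 filter suitable for lossless coding.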

2.10.8 Biorthogonal Tab7/9 Filter
The scheme of the transformation (irreversible) described in this section is the lifting-based scheme of the Tab7/9 wavelet transform [Antonini,1992]. Figure (2.12) illustrates graphically the lifting hierarchy of the Tab7/9 transform. It is worth noticing that the whole transform can be performed in-place, leading to a significant memory saving. Black dots correspond to the even-index input samples (replaced by the low-pass coefficients at the end of the in-place computation), while white dots stand for the odd-index input samples (replaced by the output high-pass wavelet coefficients). The computations are performed as follows [Boliek, 2000]:

(a) First lifting step, performed for all values of x such that -3 ≤ 2x+1 ≤ W+3:

C(2x+1) = Cext(2x+1) + α × (Cext(2x) + Cext(2x+2)) ,….……..…....(2.23)

(b) First dual lifting step, performed for all values of x such that -2 ≤ 2x ≤ W+2:

C(2x) = Cext(2x) + β × (C(2x-1) + C(2x+1)) ,…………..…….….(2.24)

(c) Second lifting step, performed for all values of x such that -1 ≤ 2x+1 ≤ W+1:

C(2x+1) = C(2x+1) + γ × (C(2x) + C(2x+2)) ,…….……..…..…(2.25)

(d) Second dual lifting step, performed for all values of x such that 0 ≤ 2x ≤ W:

C(2x) = C(2x) + δ × (C(2x-1) + C(2x+1)) ,…………………....…(2.26)

(e) Scaling step:
1. Performed for all values of x such that 0 ≤ 2x+1 ≤ W:

C(2x+1) = -K × C(2x+1) ,………………….….…...……..……(2.27)

2. Performed for all values of x such that 0 ≤ 2x ≤ W:

C(2x) = (1/K) × C(2x) ,……………….……..……….….....(2.28)

where the values of the lifting parameters (α, β, γ, δ) are [Boliek, 2000]:

α = -1.586134342, β = -0.052980118, γ = 0.882911075, δ = 0.443506852,

and the scaling factor K is equal to K = 1.230174105.
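The lifting steps (a)-(e) can be sketched as follows (a simplified sketch on interior samples only, omitting the symmetric extension; here the odd (high-pass) samples are scaled by -K and the even (low-pass) samples by 1/K; function names are illustrative):

```python
ALPHA, BETA = -1.586134342, -0.052980118
GAMMA, DELTA = 0.882911075, 0.443506852
K = 1.230174105

def dwt_97_forward(c):
    c = list(c)
    n = len(c)
    for i in range(1, n - 1, 2):   # (a) first lifting step
        c[i] += ALPHA * (c[i - 1] + c[i + 1])
    for i in range(2, n - 1, 2):   # (b) first dual lifting step
        c[i] += BETA * (c[i - 1] + c[i + 1])
    for i in range(1, n - 1, 2):   # (c) second lifting step
        c[i] += GAMMA * (c[i - 1] + c[i + 1])
    for i in range(2, n - 1, 2):   # (d) second dual lifting step
        c[i] += DELTA * (c[i - 1] + c[i + 1])
    for i in range(1, n, 2):       # (e) scaling: high-pass samples
        c[i] *= -K
    for i in range(0, n, 2):       # (e) scaling: low-pass samples
        c[i] /= K
    return c
```

For a constant input, the fully interior high-pass coefficients come out numerically zero, as expected of a wavelet whose analysis high-pass filter coefficients sum to zero.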


Fig. (2.12) 1-D DWT Dependency Graph of the Lifting-based Tab7/9 Biorthogonal Wavelet Transform [Savaton,2000]. (The figure shows the original vector at the top transformed through the lifting steps (a)-(e) into the transformed vector at the bottom.)

2.10.9 Wavelet Coefficients Coding Schemes
Image coding using hierarchical structures built on scalar quantization of transformed images has proved to be an effective and computationally simple technique. Shapiro was the first to introduce such a technique with his EZW algorithm [Shapiro,1993]. Different variants of this technique have appeared in the literature, providing improvements over the initial work. Said & Pearlman improved the EZW algorithm by extending this coding scheme, and succeeded in presenting a different implementation based on a set-partitioning sorting algorithm called SPIHT, which provides even better performance than the improved version of EZW [Said,1996]. All of the popular scalar quantization schemes employ some kind of significance testing of sets or groups of pixels, in which a set is tested to determine whether the maximum magnitude in it is above a certain threshold. The results of these significance tests determine the path taken by the coder to code the source samples. These significance-testing schemes are based on some complex principles, which allow them to exhibit excellent performance.


Among these principles are: the partial ordering of magnitude coefficients with a set-partitioning sorting algorithm, bit-plane (bit-slicing) transmission in decreasing bit-plane order, and the exploitation of self-similarity across the different scales of an image wavelet transform. All previous works represent the wavelet coefficients of a transformed image using trees, because of the sub-sampling that is performed in the transform. A low-subband coefficient can be thought of as having four descendants in the next higher subband; each of the four descendants, in turn, has four descendants in the next higher subband, as shown in Figure (2.13).

Fig. (2.13) Quadtree Structure: Each Root Has Four Leaves [Sudhakar,2005].

The most direct approach is to simply transmit the values of the coefficients in decreasing order, but this is not very efficient, since many bits are spent on the coefficient values. A better approach is to use a threshold and signal to the decoder only the coefficients whose values are larger or smaller than the threshold (see [Pearlman, 2003] for more details). In the current research work, simple coding techniques were adopted to encode the wavelet coefficients after the quantization stage, as will be explained in more detail in the next chapter.


2.11 Quantization
The quantization process is an irreversible process whose objective is to represent the elements of a large set in terms of a smaller set. The benefit obtained from this conversion is a reduction in the number of bits required: all possible input values are mapped onto representatives (approximates) that need fewer bits to describe. According to the mapping methodology adopted to perform the quantization of an image (or its mapping coefficients), quantization processes can be classified into [Jorj, 1997]:
1. Scalar Quantization (SQ): each input symbol is treated individually to produce the output.
2. Vector Quantization (VQ): the input symbols are assembled together in groups, called vectors, and processed together to give the output. Assembling the data and treating them as a single unit increases the optimality of the vector quantizer, but at the cost of increased computational complexity.

2.11.1 Scalar Quantization
The typical way to handle floating-point data is to perform an initial quantization. This approach allows standard techniques to be used on the quantized data, but may be unacceptable to researchers due to the irreversible data loss [Usevitch,2003]. The quantizer is a function whose set of output values is discrete and usually finite. Obviously, this is a process of approximation, and a good quantizer is one which represents the original signal with minimum loss or distortion [Sudhakar,2005]. The factor that most significantly affects the compression ratio and reconstruction error of wavelet coefficients coding is the wavelet coefficients quantization step. The most widely used quantizers are uniform quantizers. The quantization operation can be described as:

qx,y = fx,y / s ,………………………..………….……..…………(2.29)


where qx,y is the quantized value, fx,y is the original value, and s is the quantization step. For a uniform quantizer the quantization step s is the same for all wavelet coefficients; in other words, for a uniform quantization process the quantization step size is constant. The difference between fx,y and s·qx,y is called the quantization error. The performance of the quantizer can be compared using the sum of quantization errors (δ), which is given as [Gupta,2003]:

δ = Σ_{k=1}^{L} ∫_{f_k}^{f_{k+1}} (f - q)² P_f(f) df ,……………………………………..….(2.30)

where L is the number of quantization intervals, q is the quantizer output for the k-th interval [f_k, f_{k+1}], and P_f(f) is the probability density function of f.
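A minimal sketch of the uniform quantization of eq (2.29) and of measuring the resulting squared error on sample data (rounding to the nearest level is an assumption of this sketch, and the error sum is the discrete analogue of the integral in eq (2.30); function names are illustrative):

```python
def quantize(f, s):
    # Eq (2.29): map f to an integer index with uniform step s
    # (rounding to the nearest level is an assumption of this sketch)
    return round(f / s)

def dequantize(q, s):
    # Reconstruct the representative value s * q
    return q * s

def sum_squared_error(values, s):
    # Sum of (f - s*q)^2 over the samples, cf. eq (2.30)
    return sum((f - dequantize(quantize(f, s), s)) ** 2 for f in values)
```

A larger step s gives coarser representatives and a larger error sum; this is the trade-off that the quantizer step-size selection methods of this section tune.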

If the quantization error is greater than the step size, it is called overload error; if it is less than the step size, it is called granular error. The total error is the sum of the granular error and the overload error, and it depends on the quantization step size. Although the above-mentioned quantizer is not adaptive, much can be gained from adjusting the quantization levels according to the local behavior of an image. In theory, slowly changing regions can be finely quantized, while rapidly changing areas are quantized more coarsely. This approach simultaneously reduces both granular noise and slope overload, while requiring only a minimal increase in code rate. There are many effective ways to determine the quantizer step size. In the current research work, three simple methods were adopted:
1. Fixed quantization method: the quantizer step size s is constant for all sub-bands.
2. Hierarchical quantization method: the quantizer step size is decreased for each level by using the hierarchical relation:

s_i = s × a^(i-1) ,………..….………….……………..………..…...(2.31)

where s (the quantizer step) is used to quantize the wavelet coefficients, and a is the attenuation parameter (such that a