Exploring Complex Wavelets and their Application to Optical Character Recognition
Makarand Tapaswi
MERIT Master, under the guidance of Prof. Philippe Salembier
Structure
• Introduction to Complex Wavelets
  – Motivation
  – Key Idea
• Implementations
  – Redundant Complex Wavelet Transform (Dual-Tree based)
  – Non-Redundant Complex Wavelet Transform (Projection based)
• Properties additional to Perfect Reconstruction
  – Shift Invariance
  – Better directionality
  – Others...
• Application to Optical Character Recognition
  – Simplification of Text
  – Comparison with Baseline (DCT) and Real Wavelets
• Conclusion
June 15th 2010
Complex Wavelets and OCR
Complex Wavelets Motivation
• Continuous Wavelet Transform [1]
  – Too many coefficients
  – Shift invariant
  – Poor directionality
  – Missing phase information
• Discrete Wavelet Transform
  – Critically sampled, no redundancy
  – Shift sensitive (due to sub-sampling)
  – Poor and limited directionality
  – Missing “phase” information like that of the Fourier transform
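The shift sensitivity caused by sub-sampling can be illustrated with a one-level Haar DWT. This is a minimal sketch, assuming a toy step-edge signal; the helper names `haar_detail` and `energy` are hypothetical and not part of the original slides. Shifting the input by a single sample moves the edge from inside a sample pair to a pair boundary, so the detail-band energy changes from about 0.5 to 0.

```python
import math

def haar_detail(x):
    """One-level Haar detail band: high-pass filter, then keep every 2nd output."""
    s = 1 / math.sqrt(2)
    return [s * (x[i] - x[i + 1]) for i in range(0, len(x) - 1, 2)]

def energy(coeffs):
    """Sum of squared coefficients."""
    return sum(c * c for c in coeffs)

# Illustrative data: a step edge and the same edge shifted by one sample
x = [0, 0, 0, 1, 1, 1, 1, 1]
x_shifted = [0, 0, 0, 0, 1, 1, 1, 1]

e0 = energy(haar_detail(x))          # edge falls inside a sample pair
e1 = energy(haar_detail(x_shifted))  # edge falls between two pairs
print(e0, e1)
```

Without the sub-sampling step (keeping every output of the high-pass filter), both signals would produce the same detail energy, which is the observation behind the redundant transforms discussed later.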
Complex Wavelets Key Idea
• Real signals [2]
  – Filtered using analytic filters or “analytic wavelets”
  – Transformed to the complex domain and then filtered with DWT wavelets
• Analytic function:
  – Complex valued
  – Real function projected onto real and imaginary subspaces
  – Single-sided spectrum!
  – A finite-support wavelet cannot be exactly analytic, so analyticity is only approximated
• Properties
  – Perfect reconstruction is retained
  – Additional features like shift invariance and more directionality
Analytic Filter
• Complex extension of a real signal, whose imaginary part is the Hilbert transform of the real part
• The two-sided spectrum of the real signal becomes one-sided
• Similarly, decompose real filter coefficients to get a pair of real filters which together form an “analytic filter”
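The step from a two-sided to a one-sided spectrum can be sketched numerically. The example below is an illustration under stated assumptions, not the construction used in [2]: it builds the analytic extension of a sampled cosine by zeroing the negative-frequency DFT bins and doubling the positive ones. The helper names (`dft`, `idft`, `analytic_signal`) are hypothetical, the DFT is a naive O(n²) version to keep the block dependency-free, and an even signal length is assumed.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]

def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[m] * cmath.exp(2j * math.pi * m * k / n) for m in range(n)) / n
            for k in range(n)]

def analytic_signal(x):
    """Zero the negative-frequency bins and double the positive ones (even n)."""
    n = len(x)
    gain = [0.0] * n
    gain[0] = 1.0                 # DC is kept once
    gain[n // 2] = 1.0            # Nyquist bin is kept once
    for m in range(1, n // 2):
        gain[m] = 2.0             # positive frequencies are doubled
    return idft([g * v for g, v in zip(gain, dft(x))])

n = 16
x = [math.cos(2 * math.pi * 3 * k / n) for k in range(n)]  # energy in bins 3 and 13
z = analytic_signal(x)
Z = dft(z)                                                  # energy only in bin 3
```

The real part of `z` reproduces the input, while its imaginary part is the corresponding sine, i.e. the Hilbert transform of the cosine.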
Redundant CWT (Complex Wavelet Transform)
• Dual-Tree based CWT
• m-dimensional signal, 2^m:1 redundancy
• Two conventional DWT filter banks, working in parallel
• Interpreted as complex because they are in quadrature
• Overall filter (1-D case): 2 trees
• Overall filter (2-D case): 4 trees
• Because of the redundancy, no longer useful for signal compression
Dual Tree DWT - Structure
Kingsbury’s DTDWT
• Observation: shift invariance is possible if the signal is not sub-sampled
• In the dual-tree structure, delay the filters of one tree by 1 sample relative to the other tree
• Main design criteria
  – Perfect reconstruction
  – Shift invariance: modeled by alias-free reconstruction
• Two possible design ideas [3]
  – Odd and even alternating filter lengths
  – Quarter-shift filters (Q-shift) with all even filter lengths
• Single-sided pass-band (analytic) helps in meeting design requirements
Kingsbury’s DTDWT
• Odd and Even Alternating Filter Lengths
  – Design based on MMSE in the approximation
  – 13/12 tap and 19/16 tap filters designed
  – Sub-sampling structure not very symmetrical
  – Slightly different frequency response!
• Q-Shift Filters
  – All filters of even length (beyond level 1)
  – No longer strictly linear phase
  – Tree a has a ¼-sample group delay, tree b a ¾-sample delay (time-reversed filter)
  – Orthonormal filters can be designed
  – Synthesis filters are directly time-reversed versions
Selesnick’s DTDWT
• Alternate design approach, but the same structure is used
• Design constraints
  – Fixed length
  – Vanishing moments
• The design proves that for 2 orthogonal wavelets to form a Hilbert pair, their scaling filters should be offset by ½ sample! [4]
• Spectral factorization and Gröbner-basis techniques are used to approximate this ½-sample delay using FIR filters…
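The half-sample condition can be written compactly. The notation below is an assumption, since the slides do not name the filters: h_0 and g_0 denote the scaling (low-pass) filters of the two trees, ψ_h and ψ_g the corresponding wavelets, and 𝓗 the Hilbert transform. The first relation is informal, because a half-sample shift of a discrete sequence only has a precise meaning in the frequency domain.

```latex
g_0(n) = h_0\!\left(n - \tfrac{1}{2}\right)
\;\Longleftrightarrow\;
G_0(e^{j\omega}) = e^{-j\omega/2}\, H_0(e^{j\omega}), \quad |\omega| < \pi
\;\Longrightarrow\;
\psi_g(t) = \mathcal{H}\{\psi_h(t)\}
```

An exact half-sample delay is not realizable with finite-length filters, which is why the FIR designs can only approximate it.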
Non-Redundant CWT
• Projection based methods
  – Real-world signal mapped onto a complex function space
  – Followed by any type of DWT on the complex mapping
• Projection based CWT with Controllable Redundancy
  – Allows shift invariance and good directionality [5]
• Projection based CWT with Non-Redundancy
  – No longer shift invariant, but retains good directionality
Properties of DTDWT Approx. Shift Invariance
Properties of DTDWT Better directionality
From 0°, 45°(?), 90° now to ±15°, ±45°, ±75°
Properties of DTDWT
• Now with Phase Information
  – Allows pointing to the location of features
• Limited Redundancy
  – 1-D signals: twice the samples
  – m-D signals: 2^m times the samples
  – Much less compared to Continuous WT or Wavelet Packets
• And finally, Perfect Reconstruction!
Applications
• Motion Estimation and Compensation
  – Phase shift of complex coefficients proportional to motion [6]
• De-noising and De-convolution
  – Better results with un-decimated DWT than standard DWT
  – Comparable results with DTCWT, with much lower computation
• Texture Analysis
  – Owing to much better directionality properties
• Watermarking
  – Perceptual masking, and robustness against denoising
  – Easier in the complex wavelet domain
• Optical Character Recognition ??
Introduction Devanagari Script • Primary Languages: Sanskrit, Hindi, Marathi, … [7] • Vowels
Introduction Devanagari Script • Lots of consonants and crazy conjuncts!
Devanagari OCR Database Generation • Applied heuristics [8] on photographs of text from a book • Mainly vertical and horizontal projections of the bw images • Line splitting, Word splitting
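The projection idea behind line and word splitting can be sketched as follows. This is only a minimal illustration, assuming a binary image stored as a list of rows with 1 for ink and 0 for background; the heuristics in [8] are more elaborate, and the function name `split_runs` and the toy `page` are hypothetical. Rows whose horizontal projection (ink count) is zero separate text lines; applying the same routine to the columns of one line separates words or characters.

```python
def split_runs(img):
    """Return (start, end) index ranges of consecutive rows that contain ink.

    img is a list of rows with 1 = ink and 0 = background. A row belongs
    to a run when its projection (sum of the row) is non-zero.
    """
    profile = [sum(row) for row in img]
    runs, start = [], None
    for r, count in enumerate(profile):
        if count > 0 and start is None:
            start = r                      # first inked row of a run
        elif count == 0 and start is not None:
            runs.append((start, r))        # a blank row closes the run
            start = None
    if start is not None:
        runs.append((start, len(profile)))  # run reaches the image border
    return runs

# Toy page with two "text lines" separated by a blank row
page = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
lines = split_runs(page)

# Word splitting: transpose the first line and reuse the same routine on columns
first_line = page[lines[0][0]:lines[0][1]]
words = split_runs([list(col) for col in zip(*first_line)])
```

On real photographs a threshold on the projection (instead of a strict zero test) would likely be needed to cope with noise.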
Devanagari OCR Line Splitting
Devanagari OCR Word Splitting • Word splitting; Character pick
Devanagari OCR Database Montage
• Contains 32 base characters, with an unequal number of samples per character
• Vowels and conjuncts are ignored for simplicity
• Used 50% for training, 50% for testing
• Resized to 50x45 px
• Total train set: 377
• Total test set: 367
Devanagari OCR Classification Methodology
• Baseline
  – Compute DCT coefficients of complete image
  – Store 8x8 coefficients of the top-left corner
  – Euclidean distance based classification
• Real DWT
  – Perform wavelet decomposition
  – Pick the level 3, horizontal coefficients as feature
  – Euclidean distance based classification
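The baseline pipeline (low-frequency DCT coefficients plus Euclidean nearest-neighbour matching) can be sketched as below. This is an illustration under assumptions, not the project's actual code: the DCT-II is left unnormalised, the toy images are 8x8 bars rather than 50x45 character images, only a 4x4 coefficient block is kept for the toy case, and the names `dct_features`, `nearest_label` and `bar` are hypothetical.

```python
import math

def dct_features(img, k=8):
    """Top-left k x k block of the (unnormalised) 2-D DCT-II of img, flattened."""
    rows, cols = len(img), len(img[0])
    feats = []
    for u in range(k):
        for v in range(k):
            acc = 0.0
            for y in range(rows):
                cy = math.cos(math.pi * (2 * y + 1) * u / (2 * rows))
                for x in range(cols):
                    cx = math.cos(math.pi * (2 * x + 1) * v / (2 * cols))
                    acc += img[y][x] * cy * cx
            feats.append(acc)
    return feats

def nearest_label(feat, train):
    """Label of the training vector closest to feat in Euclidean distance."""
    return min(train, key=lambda item: math.dist(feat, item[0]))[1]

def bar(vertical, size=8):
    """Toy 'character': a single vertical or horizontal stroke."""
    return [[1 if (x if vertical else y) == size // 2 else 0
             for x in range(size)] for y in range(size)]

train = [(dct_features(bar(True), k=4), "vertical"),
         (dct_features(bar(False), k=4), "horizontal")]

probe = bar(True)
probe[0][0] = 1                      # one noisy pixel
label = nearest_label(dct_features(probe, k=4), train)
```

The same `nearest_label` step applies unchanged to the wavelet features; only the feature extractor is swapped.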
Devanagari OCR Classification Methodology
• DTDWT
  – Simplify the initial image
    • Thinning operation: bwmorph(im,'thin',Inf);
    • Edge detection: edge(im,'sobel');
  – Perform wavelet decomposition
  – Pick the level 2, horizontal coefficients of both the trees
  – Compute magnitude and phase as feature
  – Euclidean distance based classification
  – DTDWT Toolbox: from Brooklyn University [9]
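The magnitude and phase features can be sketched by treating the coefficients of the two trees as the real and imaginary parts of one complex coefficient. This is a simplified 1-D illustration with made-up coefficient values; it does not call the toolbox in [9], and the name `magnitude_phase` is hypothetical. In the 2-D transform the complex sub-bands are formed from combinations of several trees, which is not shown here.

```python
import math

def magnitude_phase(tree_a, tree_b):
    """Treat tree-a coefficients as real parts and tree-b as imaginary parts."""
    mag = [math.hypot(a, b) for a, b in zip(tree_a, tree_b)]
    phase = [math.atan2(b, a) for a, b in zip(tree_a, tree_b)]
    return mag, phase

# Illustrative sub-band coefficients (not output of any real decomposition)
tree_a = [3.0, 0.0, -1.0]
tree_b = [4.0, 2.0, 0.0]
mag, phase = magnitude_phase(tree_a, tree_b)
```

Where the magnitude is close to zero the phase is poorly defined, which may be one reason phase-only features look noisy.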
Comparison DWT and DTDWT
Recognition Accuracy Standard DWT approach
Family         Wavelet    Accuracy
Haar           db1        96.7%
Daubechies     db4        93.2%
Daubechies     db7        95.4%
Coiflets       coif1      98.1%
Biorthogonal   bior1.1    96.7%
Biorthogonal   bior4.4    89.4%
– Haar gives good accuracy due to its short length
– Short filters generally give higher accuracy (exception: db7 > db4)
– Coiflets: nearly symmetric, N/3 vanishing moments for length N; coif1 has length 6
– Symmetry (linear phase) may be useful for performance
Recognition Accuracy DTDWT approach
Image Type      Parameters     Accuracy
Normal Image    Magnitude      82.29%
Normal Image    Phase          41.14%
Thinned Image   Magnitude      71.66%
Thinned Image   Phase          42.23%
Edge Image      Magnitude      92.64%
Edge Image      Phase          48.77%
Edge Image      Mag + Phase    60.76%
• Magnitude looks quite good
• Lots of zeros though!
• Phase is quite noisy to look at
Recognition Accuracy Final Comparison
Procedure                       Parameters                                Recognition Accuracy
Baseline DCT                    64 coefficients                           96.73%
Real Wavelets                   coif1, horizontal details, level 3        98.09%
DTDWT Kingsbury 10-tap filter   edge image, horizontal details, level 2   92.64%
Conclusion
• Surprisingly, DTDWT performs worse than the standard DWT
• Symmetry may be a key issue, since the DTDWT Kingsbury filters are only approximately symmetric
• Filter length could also be an issue
  – Haar and Coiflet filters were quite short
  – The Kingsbury 10-tap (shortest) filter was used
  – Level 2 had to be used instead of level 3 due to smudging
• Better classifiers could perhaps improve performance
  – Artificial Neural Networks may be used, as in [10]
  – No machine learning was used in this project because of insufficient data
• Handwritten text could be an interesting future step, especially for a script whose printed and handwritten forms differ considerably
References
[1] S. Mallat, A Wavelet Tour of Signal Processing. Academic Press, 1999.
[2] P. D. Shukla, “Complex wavelet transforms and their applications,” Master’s thesis, University of Strathclyde, 2003.
[3] N. Kingsbury, “Complex wavelets for shift invariant analysis and filtering of signals,” Journal of Applied and Computational Harmonic Analysis, vol. 10, no. 3, pp. 234–253, May 2001.
[4] I. W. Selesnick, “The design of Hilbert transform pairs of wavelet bases via the flat delay filter,” in Proc. of ICASSP, 2001.
[5] F. Fernandes, “Directional, shift-insensitive, complex wavelet transforms with controllable redundancy,” Ph.D. dissertation, Rice University, 2002.
[6] J. Magarey and N. Kingsbury, “Motion estimation using a complex-valued wavelet transform,” IEEE Trans. Signal Processing, vol. 46, no. 4, pp. 1069–1084, Apr 1998.
[7] Devanagari OCR, Wikipedia Article.
[8] V. Bansal and R. M. K. Sinha, “A Devanagari OCR and a brief review of OCR research for Indian scripts,” STRANS01, IIT Kanpur, India, 2001.
[9] Dual-Tree DWT Toolbox http://taco.poly.edu/WaveletSoftware/dt1D.htm.
[10] P. Zhang, T. Bui, and C. Suen, “Extraction of hybrid complex wavelet features for the verification of handwritten numerals,” pp. 347–352, 26-29 2004.
Thank You!