... TEXT EXTRACTION. FROM COMPLEX IMAGES. Submitted by: Ghina Salman. Scientific Supervisor: Dr. Nizar Zarka. Eng. Bassel Shanwar. June 21 th. , 2016 ...
Syrian Arab Republic Higher Institute for Applied Sciences and Technology Communications Department 4th year, Mini_Project_Multimedia
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
Submitted by:
Ghina Salman
Scientific Supervisor:
Dr. Nizar Zarka Eng. Bassel Shanwar
June 21th, 2016
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
Contents List of Figures ................................................................................................................................ iii Abstract ........................................................................................................................................... v Introduction ..................................................................................................................................... 7 Explanation of algorithm…………………………………………………………………………8 Candidate text region detection………………………………………………………………………………………….9 Text region localization………………………………………………………………………………………………………11 Character extraction…………………………………………………………………………………………………………17 Our practical result……………………………………………………………………………..19 Conclusion .................................................................................................................................... 22 References ..................................................................................................................................... 23
Page | i
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
Page | ii
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
List of Figures Figure 1 MSER region .................................................................................................................... 9 Figure 2 after removing non text region ....................................................................................... 12 Figure 3 expanded bounding boxes text ...................................................................................... 13 Figure 4 detected text .................................................................................................................. 13
Page | iii
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
Page | iv
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
Abstract Text that appears in images contains important and useful information. Detection and extraction of text in images have been used in many applications. We use a multiscale edge-based text extraction algorithm, which can automatically detect and extract text in complex images. The proposed method is a general-purpose text detection and extraction algorithm, which can deal not only with printed document images but also with scene text. It is robust with respect to the font size, style, color, orientation, and alignment of text.
Page | v
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
Page | vi
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
Introduction The automatic detection of Region of Interests (ROI) is an active research area in the design of machine vision systems. Text embedded in images contains large quantities of useful semantic information which can be used to fully understand images. Text appears in images either in the form of documents such as scanned CD/book covers or video images. Video text can broadly be classified into two categories: overlay text and scene text. Overlay text refers to those characters generated by graphic titling machines and superimposed on video frames/images, such as video captions, while scene text occurs naturally as a part of scene, such as text in information boards/signs, nameplates, food containers, etc.
Page | 7
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
Explanation of algorithm: The proposed method is based on the edges Edge strength, density and the orientation variance are three distinguishing characteristics of text embedded in images, which can be used as main features for detecting text. The proposed method consists of three stages: Candidate text region detection Text region localization Character extraction
Page | 8
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
We will explain every part of algorithm
1. Candidate text region detection This stage aims to build a feature map by using three important properties of edges: edge strength, density and variance of orientations. The feature map is a gray-scale image with the same size of the input image, where the pixel intensity represents the possibility of text. The instruction which do that called in matlab “detectMSERFeature” this instruction returns an object containing information about MSER features detected in the 2-D grayscale input image.
The image that we get after applying the instruction on picture:
Figure 1:MSER region
Page | 9
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
Next step: we need to obtain the connected region and we do that with instruction “bwconncomp” which takes a black and white image and returns the connected components CC found in BW. This code in matlab explain previous idea:
clear all; close all; clc; %Maximally Stable Extremal Regions (MSER) algorithm to find regions colorImage=imread('75649706.jpg'); I = rgb2gray(colorImage); %************************************************** ************** % Step 1: Detect Candidate Text Regions Using MSER: %************************************************** ************** % Detect MSER regions. [mserRegions] = detectMSERFeatures(I, ... 'RegionAreaRange',[200 8000],'ThresholdDelta',4); figure imshow(I) hold on plot(mserRegions, 'showPixelList', true,'showEllipses',false) title('MSER regions') hold off BW=im2bw(I); mserConnComp=bwconncomp(BW);
Page | 10
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
2. Text region localization Normally, text embedded in an image appears in clusters. Characteristics of clustering can be used to localize text regions. Since the intensity of the feature map represents the possibility of text, a simple global thresholding can be employed to highlight those with high text possibility regions resulting in a binary image. We need to remove non text regions based on basic geometric Proprieties. By using “regionprops” we can measure MSER Properties. This instruction measures a set of properties for each connected component (object) in the binary image BW, and we determine these properties like: 'Bounding Box', 'Eccentricity', 'Solidity'… Then we should determine threshold which regions to remove. These thresholds may need to be tuned for other images. “In this picture I set different values of thresholds” We get this image after removing non text regions based on basic geometric proprieties.
Page | 11
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
Figure 2: after removing non text regions
Next step: we need to merge text regions for final detection result and plot a rectangle around every character by using this instruction “insertShape”.
The image that we get after applying the instruction on picture:
Page | 12
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
Figure 3: expanded bounding boxes text
Figure 4: detected text
Page | 13
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
This code in matlab explain previous idea: %************************************************** ************** %Step 2: Remove Non-Text Regions Based On Basic Geometric Properties: %************************************************** ************** % Use regionprops to measure MSER properties mserStats = regionprops(mserConnComp, 'BoundingBox', 'Eccentricity', ... 'Solidity', 'Extent', 'Euler', 'Image'); % Compute the aspect ratio using bounding box data. bbox = vertcat(mserStats.BoundingBox); w = bbox(:,3); h = bbox(:,4); aspectRatio = w./h; % Threshold the data to determine which regions to remove. These thresholds % may need to be tuned for other images. filterIdx = aspectRatio' > 3; filterIdx = filterIdx | [mserStats.Eccentricity] > .995 ; filterIdx = filterIdx | [mserStats.Solidity] < .3; filterIdx = filterIdx | [mserStats.Extent] < 0.2 | [mserStats.Extent] > 0.9; filterIdx = filterIdx | [mserStats.EulerNumber] < 4; % Remove regions mserStats(filterIdx) = []; mserRegions(filterIdx) = []; % Show remaining regions figure imshow(I) Page | 14
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
hold on plot(mserRegions, true,'showEllipses',false) title('After Removing Non-Text Geometric Properties') hold off
'showPixelList', Regions
Based
On
%************************************************** ************** %Step 3: Merge Text Regions For Final Detection Result %************************************************** ************** % Get bounding boxes for all the regions bboxes = vertcat(mserStats.BoundingBox); % Convert from the [x y width height] bounding box format to the [xmin ymin % xmax ymax] format for convenience. xmin = bboxes(:,1); ymin = bboxes(:,2);0 xmax = xmin + bboxes(:,3) - 1; ymax = ymin + bboxes(:,4) - 1; % Expand the bounding boxes by a small amount. expansionAmount = 0.02; xmin = (1-expansionAmount) * xmin; ymin = (1-expansionAmount) * ymin; xmax = (1+expansionAmount) * xmax; ymax = (1+expansionAmount) * ymax; % Clip bounds xmin = ymin = xmax = ymax =
the bounding boxes to be within the image max(xmin, max(ymin, min(xmax, min(ymax,
1); 1); size(I,2)); size(I,1));
Page | 15
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
% Show the expanded bounding boxes expandedBBoxes = [xmin ymin xmax-xmin+1 ymin+1];
ymax-
IExpandedBBoxes = insertShape(colorImage,'Rectangle',expandedBBoxes,' LineWidth',3); figure imshow(IExpandedBBoxes) title('Expanded Bounding Boxes Text') % Compute the overlap ratio overlapRatio = bboxOverlapRatio(expandedBBoxes, expandedBBoxes); % Set the overlap ratio between a bounding box and itself to zero to % simplify the graph representation. n = size(overlapRatio,1); overlapRatio(1:n+1:n^2) = 0; % Create the graph g =biograph(overlapRatio); % Find the connected text regions within the graph [c,componentIndices] = conncomp(g); % Merge the boxes based on the minimum and maximum dimensions. xmin = accumarray(componentIndices', xmin, [], @min); ymin = accumarray(componentIndices', ymin, [], @min); xmax = accumarray(componentIndices', xmax, [], @max); ymax = accumarray(componentIndices', ymax, [], @max);
Page | 16
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
% Compose the merged bounding boxes using the [x y width height] format. textBBoxes = [xmin ymin xmax-xmin+1 ymax-ymin+1]; % Remove bounding boxes that only contain one text region numRegionsInGroup = histcounts(componentIndices); textBBoxes(numRegionsInGroup == 1, :) = []; % Show the final text detection result. ITextRegion = insertShape(colorImage, 'Rectangle', textBBoxes,'LineWidth',3); figure imshow(ITextRegion) title('Detected Text')
3. Character extraction We recognize detected text using “OCR” The results after applying this instruction are:
Page | 17
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
This code in matlab explain previous idea:
%************************************************** ************** %Step 4: Recognize Detected Text Using OCR %************************************************** ************** ocrtxt = ocr(I, textBBoxes); [ocrtxt.Text]
Page | 18
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
Our practical result: We apply our algorithm on another image and we get this result:
Page | 19
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
The answer that we get in matlab:
Page | 20
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
Page | 21
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
Conclusion We present an effective and robust general-purpose text detection and extraction algorithm, which can automatically detect and extract text from complex background images.
Page | 22
Mini_project
MULTISCALE EDGE-BASED TEXT EXTRACTION FROM COMPLEX IMAGES
References [1] KongqiaoWang and Jari A. Kangas, “Character location in scene images from digital camera,” Pattern Recognition, vol. 36, no. 10, pp. 2287–2299, 2003..
[2] S. Messelodi and C. M. Modena, “Automatic identification and skew estimation of text lines in real scene images,” Pattern Recognition, vol. 32, no. 5, pp. 791–810, 1999.
Page | 23