JOURNAL OF MULTIMEDIA, VOL. 6, NO. 6, DECEMBER 2011
Skeletonization of Deformed CAPTCHAs Using Pixel Depth Approach
Jingsong Cui, Lu Liu, Gang Du, Ying Wang, and Qianqi Guan
School of Computer, Wuhan University, Wuhan, Hubei, China
Email:
[email protected],
[email protected],
[email protected],
[email protected],
[email protected]
Abstract: CAPTCHA is a standard security technology that presents a test to tell computers and humans apart. The most widely deployed CAPTCHAs today are text-based schemes, which rely on sophisticated distortion of text images to render them unrecognizable to state-of-the-art pattern recognition methods. Skeletonization of characters is generally acknowledged as one of the most significant steps in character recognition. The skeleton keeps the topology information while reducing computational complexity, which makes it a structural feature that is robust to noise and deformation. In this paper, a depth-based approach is proposed to locate the skeleton points. To strike a balance between efficiency and robustness against distortion, three fault tolerance techniques are applied in the extraction process. Then, in the amendment stage, noise patterns are used to filter redundant points. Experiments are conducted and positive results are achieved, which show that the depth-based skeletonization scheme is applicable to widely used CAPTCHA images, and that the skeleton is robust against rotated, distorted or conglutinated characters.
Index Terms: deformed CAPTCHA, skeleton, pixel depth, distortion, symmetry
I. INTRODUCTION
CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart) is a program that generates and grades tests that are solvable by humans but intended to be beyond the capabilities of current computer programs [1]. This technology is now almost a standard security mechanism for defending against undesirable or malicious Internet bot programs, aiming to improve the security of server systems and user information [2]. CAPTCHAs are divided into three categories: OCR-based, visual non-OCR-based, and non-visual. To protect system security, non-OCR-based CAPTCHAs have been proposed, for instance schemes built on moving object identification and tracking problems, which refer to the biological motion vision model [3,4]. There are currently no effective methods to attack 3D dynamic CAPTCHAs. The most widely used CAPTCHAs are the so-called text-based schemes, which typically require users to solve a text recognition task. To improve security, a great deal of interference has been added to CAPTCHA images. There are three types of interference: foreground, background and the character itself. Foreground interference includes noisy points and interfering lines; background interference includes background color and texture; interference upon the character itself includes font change, rotation, distortion and conglutination.
It should be pointed out that there are many mature techniques that deal with foreground and background interference and achieve a high recognition rate. However, most of these methods fail to recognize characters that have themselves been deformed or subjected to an affine transformation. The well-known reCAPTCHA and Google CAPTCHA are two classical examples that apply interference only to the characters themselves, not to the foreground or background. Recognizing these CAPTCHAs poses a great challenge to Artificial Intelligence, so in this paper we focus on these deformed CAPTCHAs.
Character skeletons generally play a significant role in character recognition. The skeleton is the collection of pixels passing through the center of the data cloud [5]. It is an important shape feature for pattern recognition and classification, because it reduces computational complexity efficiently while keeping the topology and the geometric properties of the shape. For deformed CAPTCHAs, this feature is more robust and effective than a raster of pixels or the contour of the shape. This representation is particularly effective in extracting relevant features of a character for optical character recognition [8,15,16]. At present, it covers a wide range of applications, such as graphic recognition, handwriting recognition and signature verification [22]. Moreover, in the recognition process, the skeleton lays a solid foundation for follow-up recognition tasks that are based on pixel weight, such as angle detection, segmentation and template matching.
In this paper, a skeletonization method based on pixel depth, i.e. the largest inscribed circle radius of a certain stroke area, is proposed. There are two main steps:
a) extraction of the primary skeleton points using tolerance techniques;
b) amendment processing using noise patterns.
As a result, the skeleton of a character can be extracted precisely. Experiments are conducted on reCAPTCHA, Google CAPTCHAs and other classical deformed CAPTCHAs. Positive results are achieved, which show that the depth-based skeletonization scheme is applicable to widely used CAPTCHA images, and that the skeleton is robust against rotated, distorted or conglutinated characters.
© 2011 ACADEMY PUBLISHER doi:10.4304/jmm.6.6.526-533
II. RELATED WORK
In the past several decades, a great many skeletonization techniques have been developed. Skeletonization algorithms fall into three categories, namely the distance skeleton based on symmetry analysis, the iterative edge-point erosion method and the non-iterative method [6,7,8]. H. Blum [9] defines the medial axis and the medial axis function of an object to represent a certain shape. This distance skeleton has given rise to many other skeletonization algorithms, such as the triangulation scheme and skeleton extraction by contour symmetry. The iterative edge-point erosion technique uses a sliding window (e.g. a 3*3 window) which is moved over the entire image, with a set of rules applied to the contents of the window. Simon [10] partitioned the character stroke into regular and singular regions. The singular region corresponds to ends, intersections and turns, and the regular region covers the other parts of the strokes. The regularity-singularity analysis technique uses the constrained Delaunay triangulation to separate the two kinds of region. It then extracts the primary skeleton points in the regular region first and does amendment processing in the singular region. The non-iterative way is commonly used to extract the skeleton of a simple shape. The known thinning algorithms are usually feasible for well-defined digital line patterns. Under affine transformation, conglutination and rotation, the above skeletonization algorithms lead to disappointing results. Moreover, the implementation of traditional symmetry and non-symmetrical analysis often suffers from complicated computation, because the symmetric pairs are found indirectly from the boundaries of the character strokes in both continuous and discrete domains.
III. EXTRACTION OF PRIMARY SKELETON
A. Motivation
The skeletonization algorithm for CAPTCHA should meet the following requirements:
a) keep the topology and the geometric properties of characters
b) thin and well centered
c) fast and efficient
d) sensitive to small CAPTCHA characters
e) robust against noise and affine transformation
CAPTCHA characters are deliberately and artificially rotated, distorted or conglutinated, so it is hard to tell regular and singular regions apart. Moreover, singular regions and interference are closely bound to each other, and a change in one may lead to a change in the other. For instance, as Figure 1(a) shows, the conglutination of two adjacent characters links two endpoints into one junction. In Figure 1(b), a point in the regular region of a stroke turns into an intersection point because of conglutination. In Figure 1(c), an originally straight line which belongs to the regular regions becomes curved and oblique.
Figure 1: Three situations that happen in CAPTCHAs. (a) Two endpoints become one junction; (b) a point in a stroke turns into an intersection point; (c) a straight line becomes curved and oblique.
B. Definitions
A binary image is represented as a two-dimensional matrix [f] whose (x, y)th element is the pixel $f_{x,y}$, where x and y denote spatial coordinates, and $f_{x,y} = 1$ and $f_{x,y} = 0$ represent a white pixel and a black pixel, respectively.
Definition 1 (Pixel Depth) Consider a black pixel ($f_{x,y} = 0$) in a stroke. Let the location of the pixel be a center and r be a radius, so that we have a circle with radius r which starts from a low value. As the radius r increases by a certain step, the circle enlarges. The depth of the pixel is the maximum radius of a circle that is tangential to any of the boundary curves of the stroke or intersects the boundary for the first time. The depth of a white pixel ($f_{x,y} = 1$) is zero.
Every skeleton point has a triple:
$(\text{point}, \text{depth}, \text{context}) \Leftrightarrow (P, d, OP \cup OQ)$ (1)
Here, $\text{depth}(P) = r_{\max}$ and $\text{context}(P) = \{P \mid P \in \text{two radii}\}$.
$P_k$ is a skeleton point $\Leftrightarrow \text{depth}(P_k) = \max\{\text{depth}(P),\ P \in \text{context}(P)\}$ (2)
The skeleton of the character is the set of skeleton points:
$K = \{K_i \mid \text{depth}(K_i) = \max\{\text{depth}(P_i),\ P_i \in \text{context}(P_i)\}\}$ (3)
After scanning the whole image, each pixel owns its depth. The original two-dimensional image together with the depth of each pixel constitutes a three-dimensional image, which resembles hilly ground. The character skeleton is like the ridge of the hills. As Figure 3 shows, the skeletonization procedure is like a hike: we start from the highest peak of the mountain and go downhill following the mountain ridge.
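As a rough illustration of Definition 1, the following Python sketch computes the depth of a pixel by growing a discrete ring until it first touches a white pixel. It is only a sketch under stated assumptions: the names `ring_offsets` and `pixel_depth` are hypothetical, the ring of radius r is taken to be the integer offsets whose distance from the center lies in (r-1, r], and positions outside the image are treated as white.

```python
def ring_offsets(r):
    """Integer offsets (dx, dy) whose distance from the origin lies in (r-1, r]."""
    return [(dx, dy)
            for dx in range(-r, r + 1)
            for dy in range(-r, r + 1)
            if (r - 1) ** 2 < dx * dx + dy * dy <= r * r]


def pixel_depth(img, x, y, r_max=10):
    """Depth of pixel (x, y): radius of the first ring that touches white.

    img[y][x] is 0 for a black (stroke) pixel and 1 for a white pixel.
    Treating positions outside the image as white is an assumption.
    """
    if img[y][x] == 1:
        return 0  # white pixels have depth zero by definition
    h, w = len(img), len(img[0])
    for r in range(1, r_max + 1):
        for dx, dy in ring_offsets(r):
            px, py = x + dx, y + dy
            if not (0 <= px < w and 0 <= py < h) or img[py][px] == 1:
                return r  # first-time touch at radius r
    return r_max  # no touch within the detecting range


# A 3x3 black block on a white 5x5 background.
block = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
print(pixel_depth(block, 2, 2))  # center of the block -> 2
print(pixel_depth(block, 1, 2))  # pixel on the block's edge -> 1
```

In this toy image the center pixel is the local maximum of depth, which is the "ridge" the hiking analogy describes.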
Figure 2: The process of the expanding of circles and locating the skeleton point O.
Figure 3: The depth of each pixel of the letter E.
Definition 2 (white pixel heap) A white pixel heap is a circular arc, part of the circular ring generated in each detection round, which only contains white ("1") pixels, and in which all the pixels appear continuously and stably. Continuity means that the white points lie next to one another without any disconnection or gap; stability requires the radian of the arc to be large enough that it will not be judged invalid in the fault tolerance processing.
C. Idea
Let the location of any black pixel be a center and r be a radius, and draw a circle with radius r which starts from a low value. As the radius r increases by a certain step, the circle enlarges. If the circle is still inside the stroke, the radius keeps increasing by a step of 1 pixel until the circle reaches (intersects or is tangent to) at least one side of the contour. We call this first reach the "first-time touch". The depth of the center pixel is the value of the last detecting radius. Whether the pixel is a skeleton point is determined by the condition of the first-time touch, according to the following rules:
1. If the first-time touch reaches (intersects or is tangent to) only one side of the stroke contour, i.e. there is only one white pixel heap on the circular ring, the pixel is a non-skeleton point.
2. If the first-time touch reaches (intersects or is tangent to) two or more sides of the stroke contour, i.e. there are two or more non-conterminous white pixel heaps on the ring, the pixel is a skeleton point.
There is another problem, which differs between the continuous and the discrete domain: given a pixel and a radius, how do we draw a circle, and which pixels around the center are on the circle? In a binary image, trigonometric functions are not appropriate because they fail to locate pixels precisely. Therefore, we use relative coordinates to record the relative positions of the points on the circle. The detecting center is regarded as the origin of the coordinate system, and the whole coordinate plane is divided into four quadrants. Given a radius starting value and a detecting step (an integer), a series of circles can be formed. In this way, we can exactly locate the points on the circular ring.
Points (x, y) on the circle satisfy the condition:
$r - 1 < \sqrt{x^2 + y^2} \le r$ (4)
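The ring condition can be checked with a short Python sketch. The function name `circle_offsets` is hypothetical, and the upper bound is taken as inclusive, which is an assumption that agrees with the circumference of 16 pixels the paper quotes for r = 3 in its Figure 4 example. The offsets are sorted by angle only to put them in circular order; the paper itself enumerates them quadrant by quadrant with the loop ranges of Table I rather than with trigonometric functions.

```python
import math


def circle_offsets(r):
    """Relative coordinates (x, y) with r - 1 < sqrt(x^2 + y^2) <= r.

    The comparison is done on squared integer distances, so no rounding
    is involved in deciding which pixels belong to the ring.
    """
    pts = [(x, y)
           for x in range(-r, r + 1)
           for y in range(-r, r + 1)
           if (r - 1) ** 2 < x * x + y * y <= r * r]
    # Sort by polar angle in [0, 2*pi) to walk the ring in circular order.
    # In image coordinates (y pointing down) this order appears clockwise.
    pts.sort(key=lambda p: math.atan2(p[1], p[0]) % (2 * math.pi))
    return pts


print(circle_offsets(1))       # [(1, 0), (0, 1), (-1, 0), (0, -1)]
print(len(circle_offsets(3)))  # 16
```

Because the offsets depend only on r, they can be computed once for every radius up to the maximum detecting radius and reused for every pixel.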
With the pixel depth, the essential distinction between skeleton points and non-skeleton points is found. There is no need to find the symmetric point pairs in a local area, as some of the traditional algorithms do. By detecting the maximum radius to which a circle can extend, the symmetric point can be positioned directly. The white pixel heaps describe the state of the circle generated in each detection round, as well as the relationship between the circle and the contour, i.e. whether it intersects or is tangent. The most crucial task is now to compute the number of white pixel heaps in each round. How can we keep the skeleton points and leap over interference and singular points? We bring in the following fault tolerance techniques to generalize the theory and achieve this goal.
D. Fault Tolerance Technique
From a practical point of view, the thinning method in the two-dimensional discrete domain differs from the method in the continuous domain. Pixels are individual points distributed discretely in a picture, and each pixel has its own unique coordinate. Moreover, since our aim is to recognize the character later on from the extracted skeleton, human visual perception matters less than obtaining a skeleton which keeps the topology and the geometric properties of the characters. So we generalize the skeleton theory and use three techniques to tolerate pixel faults or irregularities, as follows:
i. Filter the unstable regions (white or black) in each circle generated in each detection round around a certain center.
ii. Tolerate at most two skeleton points existing in one local area, whose depths simultaneously increase to the local maximum depth.
iii. Tolerate the number of white pixel heaps of a skeleton point in the first-time touch being equal to 1, provided that the next time the heap number is at least 2.
Filtering the unstable circular arcs helps eliminate some of the singular points and noise that appear irregularly and unexpectedly. The circular ring generated in each detecting round is divided by black or white boundary pixels into several consecutive regions, each of which has the same pixel color. Whether a region is white or black, its state is one of two: stable or unstable. A minimum radian threshold α is used to represent the tolerance of pixel faults, and denotes the highest degree of fault the algorithm permits. If the arc radian is larger than the threshold, it is a stable region. Otherwise it is unstable and will be filtered in the processing. Points in a stable region are stable points, and points in an unstable region are unstable points.
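The filtering of unstable regions can be sketched in Python as follows. This is one possible reading rather than the authors' implementation: the names `circular_runs` and `count_white_heaps` are hypothetical, an arc's radian is approximated by its pixel count divided by r as the paper does, unstable arcs are absorbed shortest-first (an assumed order), and the threshold value 0.5 in the example is purely illustrative. The example ring has six segments on a ring of 16 pixels with r = 3, mirroring the situation the paper illustrates in Figure 4.

```python
def circular_runs(ring):
    """Split a circular 0/1 sequence into runs [(value, length), ...]."""
    n = len(ring)
    # Start at a colour change so that no run wraps around the end.
    start = next((i for i in range(n) if ring[i] != ring[i - 1]), None)
    if start is None:
        return [(ring[0], n)]  # the whole ring has one colour
    runs = []
    for k in range(n):
        v = ring[(start + k) % n]
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


def count_white_heaps(ring, r, alpha):
    """Count stable white arcs after absorbing unstable arcs of either colour."""
    runs = circular_runs(ring)
    while len(runs) > 1:
        # Examine the shortest arc; if it is stable, all arcs are stable.
        i = min(range(len(runs)), key=lambda k: runs[k][1])
        if runs[i][1] / r >= alpha:
            break
        # Unstable: give the arc its neighbours' colour so the two
        # adjacent regions are connected into one.
        flipped = [(1 - v, n) if k == i else (v, n)
                   for k, (v, n) in enumerate(runs)]
        ring = [v for v, n in flipped for _ in range(n)]
        runs = circular_runs(ring)
    return sum(1 for v, _ in runs if v == 1)


# Six arcs: black 3, white 3, black 3, white 3, black 1, white 3.
example_ring = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1]
print(count_white_heaps(example_ring, r=3, alpha=0.0))  # no filtering -> 3
print(count_white_heaps(example_ring, r=3, alpha=0.5))  # 1-pixel arc absorbed -> 2
```

The single black pixel has radian 1/3, below the illustrative threshold, so the two white arcs beside it merge and the heap count drops from 3 to 2.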
The lower α is, the fewer pixels are required to constitute a stable region, and the more skeleton points will be extracted in the end; the higher α is, the more pixels are required in a stable region, and consequently the fewer skeleton points will be extracted. The factor α is set with regard to the following:
1. Circumference (c): the number of pixels on the circle.
2. Radius (r): the current detecting radius.
3. Experimental verification, from which we give a parameter 2.
According to the values of c and r, the minimum radian of a valid white pixel heap can be calculated:
$\alpha = c / (2r)$ (5)
Statistically, the arc length in the continuous domain is equivalent to the total number of pixels of the arc in the discrete domain. Therefore, we use the number of pixels, count, in an arc region to approximately represent its arc length:
$\text{radian} = \text{count} / r$ (6)
The radian equals count divided by r. If this radian is lower than the threshold α, the region is in an unstable state and should be marked invalid by eliminating its two boundary points and connecting the two adjacent regions on both sides.
As Figure 4 shows, the circle's center is O, r = 3, and count(EF) = 1. The circumference equals 16, α = 16/2*3 = 3.2. The ring is divided into six segments, i.e. black regions with consecutive black ("0") pixels, and white regions with consecutive white ("1") pixels. Without any fault tolerance techniques, the white pixel heap set is W = {BC, DE, FA}, and the black pixel heap set is B = {AB, CD, EF}. The number of white pixel heaps equals the number of elements of set W, Nw = 3. However, the arc EF, which only contains one pixel, is apparently not a stable region. We can calculate radian(EF) = count(EF) / r = 1 / 3 = 0.33 < α, so we eliminate the two boundary points and connect the two adjacent regions. The new segmentation is W = {BC, DA}, B = {AB, CD}, and the amended Nw = 2.
In the discrete domain, the relationship between the current detecting radius and the stroke width is uncertain. There exist two circumstances:
(1) There is only one symmetric point in the middle of the local area, which has the maximum depth and can form a circle simultaneously reaching the stroke contour on two or more sides.
(2) There exist two symmetric points, located on the left and right side respectively. The left point's first-time touch only reaches the left contour, while the next time it reaches both sides. The right point behaves in a similar way.
Figure 4: Filtering unstable region EF
As for the first circumstance, the skeleton point can be located accurately by comparing the pixel depth. The second circumstance is more troublesome. If the skeleton is constrained to those points whose first-time touch reaches two sides, then it will fail to get even one skeleton point in many local areas, resulting in a skeleton with lots of disconnected and scattered points. Furthermore, as mentioned above, it is not necessary to thin the character to a delicate single-pixel stream, which would definitely increase computational complexity. Hence, we generalize the theory of the skeleton by permitting two points of equal status and similar function both to be skeleton points. After the first-time touch has failed, these potential points are given another chance to form a larger circle. If this time the circle touches both sides of the contour, the point becomes a skeleton point. Otherwise, it is definitely a non-skeleton point. With the above three techniques, the skeleton can be extracted completely without damaging any of its topological structure.
In the first extraction step, we allow the existence of dual adjacent skeleton pixels. In the following amendment step, we remove these redundant pixels.
IV. AMENDMENT PROCESSING
The two major operations of amendment processing are:
i. When the width of a straight line is not odd, both the left and the right side of the center area are considered skeleton points. In this situation, we need to detect it and keep only one side as the skeleton point.
ii. When the stroke is in singular regions or other irregular parts, the extracted skeleton loci may no longer be continuous or smooth. In this situation, we need to eliminate redundant points.
The noise patterns are displayed in Figure 5. These patterns are divided into several groups, and each group amends a specific situation.
Figure 5: The noise patterns
V. ALGORITHM
Through experience and statistical verification, we set the following parameters:
a) radius starting value: r0 = 1
b) radius increasing step: rs = 1
c) maximum detecting radius: rm = 10
d) minimum radian threshold: α = c / (2*r)
It should be pointed out that we set the maximum value of the detecting radius to 10 because in most cases the width of a character stroke is limited and is generally no wider than 20. Therefore we let the detecting radius vary from 1 to 10, which assures the integrity of detecting.
Algorithm 1: (Skeletonization based on pixel depth)
Input parameter:
a) Preprocessed text-based CAPTCHA image
b) Absolute coordinates sets of circles
Output: The primary skeleton of image
Steps:
1) Scan the image. If the current pixel is white ($f_{x,y} = 1$), continue scanning the next pixel; if the pixel is black ($f_{x,y} = 0$), jump to step 2.
2) Let the location of the pixel be the center point. With the circle radius ranging from 1 to 10, calculate the white pixel heap number h.
3) If h is at least 2, the current pixel is considered a primary skeleton point; jump to step 1.
4) If h is equal to 1 for the first time, set a flag, expand the circle and calculate the next h by jumping to step 2; if h is equal to 1 for the second time, ignore this pixel and jump to step 1.
5) If h equals zero, expand the circle and jump to step 2.
6) If the current detecting radius exceeds the maximum detecting radius, jump to step 1.
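The steps above can be sketched for a single pixel as follows. This is a simplified illustration under stated assumptions, not the authors' Matlab code: the function names are hypothetical, the stability filter on arcs is omitted so that every maximal run of white pixels counts as a heap, positions outside the image are treated as white, and the ring is the set of offsets at distance in (r-1, r] sorted by angle.

```python
import math


def ring(r):
    """Offsets at distance in (r-1, r] from the center, in circular order."""
    pts = [(dx, dy)
           for dx in range(-r, r + 1)
           for dy in range(-r, r + 1)
           if (r - 1) ** 2 < dx * dx + dy * dy <= r * r]
    return sorted(pts, key=lambda p: math.atan2(p[1], p[0]) % (2 * math.pi))


def white_heaps(img, x, y, r):
    """Number of maximal runs of white pixels on the ring (no stability filter)."""
    h, w = len(img), len(img[0])
    vals = []
    for dx, dy in ring(r):
        px, py = x + dx, y + dy
        inside = 0 <= px < w and 0 <= py < h
        vals.append(img[py][px] if inside else 1)  # outside counts as white
    if all(v == 1 for v in vals):
        return 1
    # A heap starts wherever a white pixel follows a black one (circularly).
    return sum(1 for i, v in enumerate(vals) if v == 1 and vals[i - 1] == 0)


def is_primary_skeleton_point(img, x, y, r_max=10):
    """Apply steps 2 to 6 of Algorithm 1 to the pixel at (x, y)."""
    if img[y][x] == 1:
        return False           # step 1: white pixels are skipped
    seen_single = False        # flag of step 4
    for r in range(1, r_max + 1):
        h = white_heaps(img, x, y, r)
        if h >= 2:
            return True        # step 3: two or more heaps
        if h == 1:
            if seen_single:
                return False   # step 4: second time with one heap
            seen_single = True
        # h == 0: step 5, keep expanding the circle
    return False               # step 6: maximum radius exceeded


def bar(height, rows, width=11):
    """White image with the given rows painted black (a horizontal stroke)."""
    return [[0 if y in rows else 1 for _ in range(width)] for y in range(height)]


odd_bar = bar(7, {2, 3, 4})   # stroke 3 pixels wide
even_bar = bar(6, {2, 3})     # stroke 2 pixels wide
print(is_primary_skeleton_point(odd_bar, 5, 3))   # middle row -> True
print(is_primary_skeleton_point(odd_bar, 5, 2))   # outer row  -> False
print(is_primary_skeleton_point(even_bar, 5, 2))  # -> True
print(is_primary_skeleton_point(even_bar, 5, 3))  # -> True
```

For the stroke of even width, both rows are accepted through the second-chance rule, which is the pair of adjacent skeleton pixels that the amendment stage later reduces to one.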
Algorithm 2: (Calculating the white pixel heap)
Input parameter:
a) Preprocessed text-based CAPTCHA image
b) Current pixel location (x, y) and the detecting circle radius r
c) Absolute coordinates sets of the circles
Output: The white pixel heap of the current pixel
Steps:
1) Obtain the coordinates of the pixels on the current detecting circle. The coordinates have been sorted anti-clockwise.
2) Calculate the minimum radian α, as the threshold of a valid white pixel heap.
3) Traverse the detecting circle. When a white pixel area is found, consider this area a potential white pixel heap.
4) Determine whether the current white region is valid. If its radian is greater than α, increase the count of white pixel heaps by 1. Jump to step 3.
5) When the traversal has finished, output the count of white pixel heaps.
Algorithm 3: (Filtering with noise patterns)
Input parameter:
a) The primary skeleton of image
b) The noise patterns
Output: The final skeleton image result
Steps:
1) Scan the image. If the current pixel is white ($f_{x,y} = 1$), continue scanning the next pixel; if the pixel is black ($f_{x,y} = 0$), jump to step 2.
2) Get the 3*3 sliding window around the current pixel (f x-1: x+1, y-1: y+1).
3) If the current sliding window contains a pattern that is in the noise patterns, remove the current pixel and jump to step 1; otherwise keep the current pixel in the final skeleton result.
Algorithm 4: (Generating Circles)
Input parameter: Maximum detecting radius rm = 10
Output: Relative coordinates on the circle
Steps:
1) Compute the relative coordinates of the edge points on each circle; the loop variable ranges are given in Table I.
2) Calculate the distance between the origin and the detected point (x, y) in each loop: $D = \sqrt{x^2 + y^2}$.
3) Search for the points whose D satisfies $r - 1 < D \le r$.
TABLE I
THE RELATIVE COORDINATE
Quadrant | X coordinate | Y coordinate
X axis (positive) and First quadrant | x = r:-1:0 | y = 0:r-1
X axis (negative) and Second quadrant | x = 0:-1:-(r-1) | y = r:-1:0
Y axis (positive) and Third quadrant | x = -r:0 | y = 0:-1:-(r-1)
Y axis (negative) and Fourth quadrant | x = 0:r-1 | y = -r:0
VI. EXPERIMENT
In this section, we present several images to show how the mechanism works, and then present some CAPTCHA images and their skeleton results.
Figure 6 shows the mechanism through the skeletonization of a standard letter B. Figure 6(a) is the extracted skeleton. We use two colors to represent two kinds of skeleton points, distinguished by their white pixel heap numbers in the first-time touch. The green parts denote the pixels whose first non-zero white pixel heap count is 2, and the red parts denote the pixels whose first non-zero white pixel heap count is 1, which are considered primary skeleton points via the fault tolerance techniques. The fault tolerance techniques come into play when the width (the number of black pixels across the stroke) of a straight line is not odd, and when the line is a curve. Figure 6(b) shows the first non-zero count of white pixel heaps of every pixel in the image. The X axis and Y axis compose the 2-dimensional image plane, and each point corresponds to a pixel in the image. The Z axis represents the first non-zero white pixel heap count. Figure 6(c,d) shows the depth of each pixel, and the locally deepest pixels are skeleton points. In this example, the maximum depth of the skeleton points is no bigger than 6, indicating that the stroke of the character is no wider than 12.
Early CAPTCHA designs usually use a combination of deformations and conglutination, which can be easily thinned and recognized by machine. Microsoft's early CAPTCHA design in Figure 7 and Gimpy, created by Carnegie Mellon University, in Figure 8 are typical. Next, Figure 9 displays the skeleton of reCAPTCHA from the registration page of the MSN website. These are of the previous style, without any foreground or background interference. In these figures, the skeleton image keeps the basic skeleton structures of the characters and is of good visual quality.
Figure 6: Explains the mechanism by the skeletonization of a standard letter A. (a) The plan figure with depth denoting. (b) The 3D figure of letter A. (c, d) The 3D figure of the skeleton with depth denoting.
Then, we present the Google CAPTCHAs and their skeleton images in Figure 10. As with reCAPTCHA, the security methods of the Google CAPTCHA are deformation and conglutination. But the Google CAPTCHA conglutinates its characters unnaturally, leading to a terrible user experience. Unfortunately, an over-deformed Google CAPTCHA, like the third example, cannot be skeletonized well because of artifacts. The other examples, however, show that the proposed approach is effective and robust to affine transformation.
The Tencent CAPTCHA is another example, which can be seen in Figure 11. This kind of CAPTCHA is deployed by QQ, an instant messaging platform that has the largest number of users in the world. These CAPTCHAs do not have any affine transformation, but thanks to a well-designed conglutination method they are still hard to break. Still, with the fault tolerance techniques our approach extracts the skeleton successfully.
The last example contains three typical singular patterns. Figure 12 illustrates that the extracted skeleton conforms to human perception, without leaving any unwanted artifacts or branches in the singular regions.
All experiments are conducted in Matlab and executed on a Core 2 Duo processor at 2.0GHz under Windows Server 2008 x64. The computational time for these CAPTCHAs and images is shown in TABLE II. The execution time of our depth-based method is much shorter than that of some current methods, for example the iterative algorithm and the wavelet-based approach. Because our approach scans the CAPTCHA image only once, the computational time is related to the resolution and the depth of the strokes. However the other methods suffer from great
Figure 7: Examples of the Microsoft CAPTCHA and their skeleton results.
Figure 8: Gimpy-r, a well-known early scheme designed at Carnegie Mellon University, preprocessed images and their skeleton results. The results show the robustness of our approach.
Figure 9: Two images of reCAPTCHA and their skeleton results.
Figure 10: Two images of the Google CAPTCHA and skeletons.
Figure 11: Three images of Tencent CAPTCHA and their skeleton results.
Figure 12: Original image of symbols and the final skeleton
TABLE II
THE COMPUTATIONAL TIME OF EACH IMAGE
Type | Image | Resolution | Time (seconds)
Microsoft CAPTCHA | 1 | 114*422 | 0.45
Microsoft CAPTCHA | 2 | 108*408 | 0.46
Gimpy | 1 | 178*444 | 1.52
Gimpy | 2 | 170*420 | 1.48
reCAPTCHA | 1 | 114*600 | 1.42
reCAPTCHA | 2 | 114*600 | 1.26
Google CAPTCHA | 1 | 140*400 | 0.54
Google CAPTCHA | 2 | 140*400 | 0.40
Tencent CAPTCHA | 1 | 106*260 | 0.31
Tencent CAPTCHA | 2 | 106*260 | 0.44
Tencent CAPTCHA | 3 | 106*260 | 0.28
Symbol | 1 | 576*976 | 3.67
Symbol | 2 | 545*225 | 0.56
VII. FUTURE WORK
In the future, we will work on breaking deformed CAPTCHAs such as reCAPTCHA and Google CAPTCHA. Based on the skeleton, we can easily figure out key points and lines, which compose a graph with nodes and edges. In the graph, nodes are the extracted key points, including ends, intersections and inflection points, and edges are the extracted skeleton lines, each with its own direction and curvature attributes. Through topological analysis, we will look for a more general solution for recognizing deformed CAPTCHAs based on the skeleton.
VIII. CONCLUSION
In this paper, we proposed a depth-based approach to extract the skeleton from deformed CAPTCHAs. By scanning the CAPTCHA images, skeleton points can be located accurately using the criterion of pixel depth. Meanwhile, to strike a balance between efficiency and robustness against distortion, three fault tolerance techniques have been applied in the process. Then, in the amendment stage, we use noise patterns to filter redundant points. Experiments show that the depth-based skeletonization scheme is applicable and efficient for widely used text-based CAPTCHA images, and that the skeleton is robust against rotated, distorted or conglutinated characters. Our research on skeletonization provides a new preprocessing method for CAPTCHA recognition.
ACKNOWLEDGMENT
The research was supported by the Hubei Provincial Natural Science Foundation of China under Grant No. 2010CDB08603, the Fundamental Research Funds for the Central Universities No. 6082022, the National Natural Science Foundation of China under Grant No. 60940028, the Outstanding Youth Foundation of Hubei Province under Grant No. 2009CDA148, and the Youth Chenguang Science Project of Wuhan (200950431189).
REFEREN NCES [1]
L von Ahn, M Blum and J Langford. “T Telling Humanss and Computeer Apart Autoomatically,” CA ACM, Vol.47,, No.2, 2004. [2] El Ahmad, Ahmad A Salah, Y Yan Jeff, Marrshall, Lindsay,, “The robustneess of a new CA APTCHA," Proceedings of thee 3rd Europeean Workshoop on System Security,, EUROSEC'100, pp.36-41, 2010 [3] Jing-Song Cuui, Jing-Ting Mei, Wu-Zho ou Zhang, “A A CAPTCHA Implementation I n Based on Moving M Objectss Recognition Problem,” Inteernational Con nference on E-Business andd E-Governmennt (ICEE 2010 0), Guangzhou,, China, May 7th 7 to 9th, 2010.. [4] JingSong Cuii, WuZhou Zhhang, Yang Pen ng, “A 3-layerr Dynamic CAPTCHA Implementation”, 2n nd Internationall Workshop on Education Technology and a Computerr Science (ETC CS 2010),Wuhaan, China, Marrch 6th to 7th,, 2010. [5] Mian Yang, Zhi-Wu Z Liao ,““The skeletonizzation researchh of low-qualitty Chinese chharacters based d on principall curves,” Prooceedings of the Eighth Internationall Conference on Machine Learning and d Cybernetics,, Baoding, 12-115 July 2009 [6] Yu-Shuen Wang; W Tong-Y Yee Lee. “C Curve-Skeletonn Extraction Ussing Iterative L Least Squares Optimization,”” Visualization and Computerr Graphics, IEE EE Transactionss on Vol. 14, issue 4, pp. 926 – 936 , July-Au ug. 2008. [7] Gisela Klettee. “A comparaative discussio on of distancee transforms annd simple defformations in digital imagee processing,” Machine Grapphics & Vision n Internationall Journal, vol. 12, 1 No. 2, pp 2335-256, Feb. 20 003. [8] B. Kégl, A. Krzy˙zak. K “Pieecewise linear skeletonizationn using principaal curves,” IEE EE Trans.Patterrn Anal. Mach.. Intell., vol. 244, no. 1, pp. 59––74, Jan. 2002. [9] H. Blum, “A transform mation for ex xtracting new w descriptors off shape,” in M Models for the Perception off Speech and Visual Form m, W. Wath hen-Dunn, Ed.. Cambridge, MA: M MIT Press,, pp. 362–380, 1967. [10] J. C. Simon, Ed. 
E Amsterdam m, “A complem mental approachh to feature deetection in Froom Pixels to Features”, F Thee Netherlands: North-Holland, N , pp. 229–236, 1989. 1 [11] Jeff Yan and Ahmad Salah E El Ahmad, “A low-cost l attackk on a Microsooft CAPTCHA A”, Proceedingss of the ACM M Conference on o Computer aand Communicaations Securityy 2008 pages 5443 554, Alexanndria, VA, Uniteed states, 2008.
[12] reCAPTCHA. http://recaptcha.net/ . Accessed in Jan 2011.
[13] Google. http://www.google.com/recaptcha . Accessed in Jan 2011.
[14] Tencent. http://www.imqq.com/ . Accessed in Jan 2011.
[15] T. M. Alcorn and C. W. Hoggar, “Preprocessing of data for character recognition,” Marconi Rev., vol. 32, pp. 61–81, 1969.
[16] E. S. Deutsch, “Preprocessing for character recognition,” in Proc. IEEE NPL Conf. Pattern Recognition, pp. 179–190, 1968.
[17] L. Lam, S. W. Lee and C. Y. Suen, “Thinning methodologies - A comprehensive survey,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 14, no. 9, pp. 869–885, Sep. 1992.
[18] J. J. Zou and H. Yan, “Skeletonization of ribbon-like shapes based on regularity and singularity analyses,” IEEE Trans. Syst. Man. Cybern. B, Cybern., vol. 31, no. 3, pp. 401–407, Jun. 2001.
[19] Y. Wan, L. Yao, B. Xu and P. Zeng, “A distance map based skeletonization algorithm and its application in fiber recognition,” International Conference on Audio, Language and Image Processing, Shanghai, China, pp. 1769–1774, 2008.
[20] X. You and Y. Y. Tang, “Wavelet-based approach to character skeleton,” IEEE Transactions on Image Processing, vol. 16, no. 5, pp. 1220–1231, 2007.
[21] K. Saeed, M. Rybnik and M. Tabedzki, “Implementation and advanced results on the noninterrupted skeletonization algorithm,” in W. Skarbek (Ed.), Computer Analysis of Images and Patterns, Lecture Notes in Computer Science, Vol. 2124, Springer-Verlag, Heidelberg, pp. 601–609, 2001.
[22] R. D. T. Janssen, “Interpretation of maps: From bottom-up to model-based,” in Handbook of Character Recognition and Document Image Analysis, H. Bunke and P. S. P. Wang, Eds. Singapore: World Scientific, 1997.
[23] Y. Y. Zhang and P. P. Wang, “A parallel thinning algorithm with two-subiteration that generates one-pixel-wide skeletons,” International Conference on Pattern Recognition, Vienna, Austria, Vol. 4, pp. 457–461, 1996.
[24] P. I. Rockett, “An improved rotation-invariant thinning algorithm,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1671–1674, 2005.
Jingsong Cui received a bachelor's degree in computer software in 1997 from the Department of Computer Science and Technology, Wuhan University. In the same year, he was admitted without examination to graduate studies for a master's degree in computer applications at Wuhan University. In 2000, he began working toward a Ph.D. degree at the School of Mathematics and Computer, Wuhan University. He received his Ph.D. degree in 2003 and began teaching at Wuhan University in 2004. His main research topics are information security and algorithm optimization. He has published more than 20 academic papers in journals and international conferences, 17 of which are EI indexed. He has extensive research experience in systems infrastructure security, information security, and network security.

Lu Liu is a senior student majoring in information security at Wuhan University, Wuhan, Hubei, China. In 2010, she participated in the scientific research project on CAPTCHA, including CAPTCHA design and security analysis. She has carried out research on CAPTCHA recognition algorithms and security assessment approaches.

Gang Du is a senior student majoring in computer science and technology at Wuhan University, Wuhan, Hubei, China. In 2009, he participated in the research project on CAPTCHA analysis and recognition. He and his partners won second prize in the 2010 Undergraduate Electronic Design Contest - Information Security Technology Invitational Contest. During 2010 and 2011, he furthered his research on security assessment approaches.

Ying Wang is a senior student majoring in information security at Wuhan University, Wuhan, Hubei, China. In 2010, she participated in the scientific research project on CAPTCHA, including CAPTCHA design and security analysis. She has studied CAPTCHA assessment methods and detection algorithms.

Qianqi Guan is a junior student majoring in information security at Wuhan University, Wuhan, Hubei, China. In 2010, she participated in the scientific research project on CAPTCHA, including CAPTCHA design and security analysis.