Research Article
Secondary segmentation extracted algorithm based on image enhancement for intelligent identification systems
International Journal of Distributed Sensor Networks, 2018, Vol. 14(12). © The Author(s) 2018. DOI: 10.1177/1550147718818737. journals.sagepub.com/home/dsn
Heng Dong, Ying Jiang, Yaping Fan, Yu Wang and Guan Gui
Abstract

The indefinite position of the characters on an invoice and the differences in color shades greatly increase the difficulty of intelligent identification, making it hard to meet the requirements of practical applications. To solve this problem, this article proposes a quadratic segmentation algorithm based on image enhancement. Specifically, we first enhance the color of the image based on gamma transformation, and then separate the machine-printing characters from the blank invoice based on a color analysis of those characters. Then, using the morphological opening operation and the bounding rectangle algorithm, the pixel information of the machine-printing characters is obtained, which makes it convenient to extract the character information. The algorithm achieves effective extraction of machine-printing characters, reduces the difficulty of invoice identification, and improves its accuracy. Simulation results are given to confirm the proposed algorithm; over many experiments, the extraction accuracy of the algorithm is as high as 95%.

Keywords

Quadratic segmentation algorithm, image processing, invoice identification, color determination
Date received: 6 September 2018; accepted: 7 November 2018 Handling Editor: Gianluigi Ferrari
Introduction

With the rapid development of the social economy, invoices are required in financial management in China. At present, hundreds of millions of invoices are used for annual reimbursement in China, and the number of invoices shows an upward trend. However, most invoice reimbursement is still done manually. Manual reimbursement of invoices has many disadvantages, such as complex reimbursement procedures, long manual processing times, and high processing error rates. In other words, manual reimbursement not only aggravates the financial staff's workload but also takes up a lot of extra energy of the people who reimburse the invoices. From a financial market perspective, manual
reimbursement incurs extra labor cost and thus increases the cost of products or of management.1 In recent years, with the rapid development of image processing and computer vision technology, high-precision, high-efficiency, and low-cost text recognition has become realizable.

Key Lab of Broadband Wireless Communication and Sensor Network Technology, Nanjing University of Posts and Telecommunications, Ministry of Education, Nanjing, China

Corresponding author:
Guan Gui, Key Lab of Broadband Wireless Communication and Sensor Network Technology, Nanjing University of Posts and Telecommunications, Ministry of Education, Nanjing 210003, China. Email: [email protected]

Creative Commons CC BY: This article is distributed under the terms of the Creative Commons Attribution 4.0 License (http://www.creativecommons.org/licenses/by/4.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).

Many researchers have introduced the emerging technologies of computer vision into related fields such as invoice identification and have conducted rigorous and profound analyses of the feasibility of these technologies. It is therefore becoming increasingly urgent to find an effective and practical invoice processing method. Value-added tax (VAT) invoices are printed by dot matrix printers; the positions of the characters printed by different printers are indefinite, and the shades of color differ, which are the main reasons for a serious decline in the quality of invoice information extraction. Therefore, studying image-enhancement-based secondary segmentation is significant for practical applications.2

At present, the machine-printing characters on invoices are either blue or black. The depth of the character color directly affects the effect of image segmentation and recognition, and extracting too much information at one time also affects recognition. Existing methods for extracting the machine-printing characters still have deficiencies: when extraction follows the frame, too much content has to be recognized, which greatly reduces the recognition accuracy. This article proposes a secondary segmentation extraction algorithm that can be applied in an actual invoice identification system. The proposed algorithm first performs color enhancement3 on the image. Then, based on the color analysis of the machine-printing characters, the first segmentation separates the machine-printing characters from the blank invoice. The pixel information of the machine-printing characters is then obtained, and the secondary segmentation is performed so as to realize the extraction of the machine-printing
Figure 1. An example of an original invoice image.
characters. Experiment results are given to validate the proposed algorithm.
Extraction of the machine-printing characters

The character extraction method of this article mainly consists of image enhancement and secondary segmentation. Figure 1 shows a scanned invoice image, and Figure 2 shows the system flow. First, image enhancement is performed on the acquired invoice color image to obtain a clearer image I. A color judgment is made on the password area of the invoice to detect the color of the machine-printing characters, and the blank invoice and the machine-printing characters are segmented for the first time to obtain an image I1 containing only the machine-printing characters. Black-and-white inversion of I1, erosion, and computation of the bounding rectangles are then performed. The pixel information of the machine-printing characters is obtained, and after the secondary division, the machine-printing character blocks are extracted.
Image enhancement

The color of invoices varies among merchants because of differences in printers. Hence, image enhancement is required to make the invoices clear. We use an image enhancement method based on gamma transformation.4 The enhancement effect is shown in Figure 3.

The gamma transformation is mainly used for image correction, that is, for correcting images whose gray levels are too high or too low, which enhances the contrast. The transformation multiplies each pixel value r of the original image:

S = c·r^γ, r ∈ [0, 1]    (1)

where γ is the gamma coefficient of the grayscale transformation and c is a constant. If the gamma curve is steep, the output contrast is relatively high; if the gamma curve is gentle, the output contrast is relatively low.5 The correction effect of the gamma transformation is actually achieved by enhancing the details at low or high gray levels, as can be seen intuitively from the gamma curve in Figure 4. The γ value is demarcated by 1. When γ is less than 1, the gray levels of the brighter regions are compressed and those of the darker regions are brightened, so the overall image becomes brighter. When γ is greater than 1, the gray levels of the brighter regions are stretched and those of the darker regions are compressed, so the image as a whole is darkened. The smaller the value, the stronger the expansion effect on the low-gray-level part of the image; the larger the value, the stronger the expansion effect on the high-gray-level part. So by changing the gamma value, the details at low or high gray levels can be enhanced.

Figure 2. Flow chart of image segmentation.

Figure 3. An example of the enhanced image.

Secondary segmentation

First split

The machine-printing characters on general invoices are either blue or black. Only the color of the machine-printing characters on the VAT ticket is not fixed. Therefore, it is necessary to determine the color of the characters printed on the ticket. The character color, that is, the color other than white, can be judged using the RGB values in the password area. According to the judged color of the machine-printing characters, the corresponding color is segmented.
Figure 4. The image of gamma curve when c = 1.
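The gamma transformation of equation (1) can be sketched as follows. This is an illustrative NumPy implementation, not the authors' code; the normalization of 8-bit pixel values to [0, 1] and back is an assumption.

```python
import numpy as np

def gamma_transform(img, gamma, c=1.0):
    """Apply S = c * r**gamma to an 8-bit image, with r normalized to [0, 1]."""
    r = img.astype(np.float64) / 255.0    # normalize pixel values to [0, 1]
    s = c * np.power(r, gamma)            # equation (1)
    return np.clip(s * 255.0, 0, 255).astype(np.uint8)

# gamma < 1 brightens dark regions; gamma > 1 darkens the image overall.
dark = np.full((2, 2), 64, dtype=np.uint8)
brightened = gamma_transform(dark, 0.5)
darkened = gamma_transform(dark, 2.0)
```

With gamma = 0.5 the pixel value 64 rises (0.251^0.5 ≈ 0.501), and with gamma = 2.0 it falls, matching the curve behavior described above.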
The VAT ticket has a fixed frame line, so we can first use the mouse with the function ginput(),6 which can manually cut out the parts to be extracted according to the fixed frame of the blank invoice, and generate an Excel file of the location information, such as the buyer, the seller, and the password area. By then importing the Excel file in the code, we can initially split the invoice, reduce the follow-up workload, and improve the extraction rate. The function ginput provides a cross cursor so that we can select the required positions more accurately and return their coordinate values. The function call form is [x, y] = ginput(n), which enables you to read n points
Figure 5. An example of the image after first split.
from the current coordinate system and return the x and y coordinates of these n points, both of which are n×1 vectors. The Excel file is imported here to segment the password area, which is used for color determination. Assume that the color of the machine-printing characters is blue, as shown in Figure 3.

Currently, color digital images can be expressed in a variety of color space models; in computer image processing, the RGB model and the HSV model are most often used. The RGB model is based on the three primary colors of human vision, where R stands for red, G for green, and B for blue. Appropriate mixing of red, green, and blue can produce any color perception on the electromagnetic spectrum. Since the three color components are highly correlated and form an uneven color space, the perceived difference between two colors cannot be expressed as the distance between two points in the color space. So the RGB model is mainly used as a color space model for hardware devices, such as color monitors and color cameras. The HSV model is a color space based on human visual perception, where hue, which H stands for, represents different colors, such as red, green, and blue; saturation, which S stands for, represents the depth of the color, such as dark green and light green; and value, which V stands for, indicates the lightness or darkness of the color, such as very bright and very dark. It has two important characteristics. First, the luminance component is independent of the color information of the image. Second, the hue and saturation components are closely linked to the way people perceive color.7 Therefore, the HSV model is often used for color segmentation. So the invoice image should first be converted from RGB space to HSV space.8 Then, we use the inRange function to segment the blue image region by adjusting the H, S, and V ranges, obtaining a white binary image I1 that satisfies the interval condition, as shown in Figure 5. After testing, H: 100-140, S: 30-255, and V: 0-255 give the best effect.

Figure 6. An example of the inverted image.

Figure 7. An example of the eroding image.
Second split

Since the image erosion operation merges the black areas into blocks, it is first necessary to perform a bitwise-not transform on I1 to achieve black-and-white color reversal,9 as shown in Figure 6. Then, the erosion10 of the black background is performed as a block operation to obtain an image I2, shown in Figure 7. After that, we can open the imported Excel file to separate the parts I3 and I4, excluding the password area, from the images I and I2, respectively.
Figure 8. An example of the cut of the enhanced image.
Figure 9. An example of the cut of the eroding image.
Figure 10. An example of the image with rectangular block.
I3 and I4 are shown in Figures 8 and 9, respectively. The bounding rectangle function11 is used to calculate the minimum vertical bounding rectangle of each contour of I4, where the rectangle is parallel to the upper and lower boundaries of the image; the boundary information of the eroded blocks is thereby obtained. For visual convenience, the rectangle boundaries are drawn using the rectangle function, as shown in Figure 10. Then the segmented part is split a second time. Suppose that the horizontal direction of the invoice is X and the vertical direction is Y. The pixel information of the top and bottom points of each rectangular block is stored in a two-dimensional matrix in the order of top Y, bottom Y, top X, and bottom X. The matrix is sorted in ascending order of top Y, and the array of top Y values is traversed. If the interval between top Y values is smaller than k, where k lies in [5, 30], the blocks are determined to belong to the same row. The rectangular blocks in the same row are then sorted from small to large according to top X. Each row of rectangular blocks is cut out from left to right, and by analogy the image G shown in Figure 11 is extracted. However, there are two cases that require special handling. One is that a rectangular block contains a small rectangular block, which is shown in the second
Figure 11. An example of the results after segmentation algorithm.
rectangular block in Figure 10. It is necessary to first identify and then filter out the small rectangular block. The second is that two lines of text lie in one rectangular block and the space between the two lines of text forms a rectangular block of its own, as shown in Figure 12. It is necessary to first determine whether the rectangular block is too large and then split it.
Conclusion and future work

This article proposed a quadratic segmentation algorithm based on image enhancement. The algorithm achieves effective extraction of machine-printing characters: through the secondary division of the enhanced invoice image, the pixel information of the machine-printing characters is obtained, and small block images of the machine-printing characters are produced.
Figure 12. The special case of the image with rectangular block.
The automatic extraction of invoice information is conducive to the promotion, application, and realization of intelligent invoice reimbursement systems, which have broad application prospects. This article is only a preliminary study on the extraction of invoice image information. In future work, we will continue to conduct in-depth studies to achieve better results and more convenient operation. It is also necessary to increase the number of experimental samples and conduct large-scale tests to further verify the effectiveness. Finally, it should be added that the information from the VAT invoice used for reimbursement concerns the purchaser, the seller, the purchase details, and the total amount.
Acknowledgement

Conceptualization, Heng Dong and Ying Jiang; Methodology, Yu Wang and Guan Gui; Software, Yaping Fan; Validation, Ying Jiang and Guan Gui; Writing-Original Draft Preparation, Ying Jiang; Writing-Review & Editing, Heng Dong and Guan Gui; Supervision, Guan Gui.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by the Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions, National Natural Science Foundation of China (61701258), Jiangsu Specially Appointed Professor Grant (RK002STP16001), "Summit of the Six Top Talents" Program of Jiangsu (No. XYDXX-010), Innovation and Entrepreneurship of Jiangsu High-level Talent Grant (CZ0010617002), NUPTSF (No. XK0010915026), and 1311 Talent Plan of Nanjing University of Posts and Telecommunications.

ORCID iD

Guan Gui https://orcid.org/0000-0003-3888-2881

References

1. Zhu J, He HJ, Tang J, et al. Tax-control network invoice machine management platform based on socket. In: Proceedings of 2016 2nd IEEE international conference on computer and communications (ICCC), Chengdu, China, 14-17 October 2016, pp.2343-2348. New York: IEEE.
2. Cai D, Tan M and Cai J. The VAT tax burden warning model and modification based on CTAIS system data. In: Proceedings of 2011 2nd international conference on artificial intelligence, management science and electronic commerce (AIMSEC), Dengleng, China, 8-10 August 2011, pp.2653-2656. New York: IEEE.
3. Li T and Zhang H. Digital image enhancement system based on MATLAB GUI. In: Proceedings of 2017 8th IEEE international conference on software engineering and service science (ICSESS), Beijing, China, 24-26 November 2017, pp.296-299. New York: IEEE.
4. Kumari A, Thomas PJ and Sahoo SK. Single image fog removal using gamma transformation and median filtering. In: Proceedings of 2014 annual IEEE India conference (INDICON), Pune, India, 11-13 December 2014, pp.1-5. New York: IEEE.
5. Abdenova GA. Scaling of input-output data and number of conditionality of a matrix of Grama. In: Proceedings of 2012 IEEE 11th international conference on actual problems of electronics instrument engineering (APEIE), Novosibirsk, 2-4 October 2012, pp.16-18. New York: IEEE.
6. Song SH, Antonelli M, Fung TWK, et al. Developing and assessing MATLAB exercises for active concept learning. IEEE T Educ. Epub ahead of print 19 March 2018. DOI: 10.1109/TE.2018.2811406.
7. Chmelar P and Benkrid A. Efficiency of HSV over RGB Gaussian mixture model for fire detection. In: Proceedings of 2014 24th international conference Radioelektronika, Bratislava, 15-16 April 2014, pp.1-4. New York: IEEE.
8. Indriani OR, Kusuma EJ, Sari CA, et al. Tomatoes classification using K-NN based on GLCM and HSV color space. In: Proceedings of 2017 international conference on innovative and creative information technology (ICITech), Salatiga, Indonesia, 2-4 November 2017, pp.1-6. New York: IEEE.
9. Mercado MA, Ishii K and Ahn J. Deep-sea image enhancement using multi-scale retinex with reverse color loss for autonomous underwater vehicles. In: Proceedings of OCEANS 2017, Anchorage, AK, 18-21 September 2017, pp.1-6. New York: IEEE.
10. Palekar RR, Parab SU, Parikh DP, et al. Real time license plate detection using openCV and tesseract. In: Proceedings of 2017 international conference on communication and signal processing (ICCSP), Chennai, India, 6-8 April 2017, pp.2111-2115. New York: IEEE.
11. Sharma T, Kumar S, Yadav N, et al. Air-swipe gesture recognition using OpenCV in Android devices. In: Proceedings of 2017 international conference on algorithms, methodology, models and applications in emerging technologies (ICAMMAET), Chennai, India, 16-18 February 2017, pp.1-6. New York: IEEE.