IEEE International Conference on Advances in Engineering & Technology Research (ICAETR - 2014), August 01-02, 2014, Dr. Virendra Swarup Group of Institutions, Unnao, India

DATA COMPRESSION USING SHANNON-FANO ALGORITHM IMPLEMENTED BY VHDL

Mr. Mahesh Vaidya
Department of Electrical Engineering, Shiv Nadar University, Greater Noida, Uttar Pradesh, India
maheshvaidva786@gmail.com

Mr. Ekjot Singh Walia
Research Scholar
[email protected]

Mr. Aditya Gupta
Electronics & Telecommunication, College of Engineering Pune, Pune, India
adityagupta2590@gmail.com

Abstract- In digital communication it is desirable that the number of data bits transmitted be as small as possible, and several techniques are used to compress the data. In this paper we implement the Shannon-Fano algorithm for data compression in VHDL. The VHDL implementation makes it easy to observe how many bits are saved, that is, how much the data is compressed during transmission, and also to see the encoding of each symbol of the transmitted data. The Shannon-Fano algorithm is used in the field of data compression, including in the implode compression method used in the zip and .rar file formats. The algorithm is simulated in ModelSim SE 6.4, and the Quartus-II tool is used to synthesize the code.

Keywords- Shannon-Fano, VHDL, ModelSim SE 6.4, Quartus-II, ALUT, RTL, I/O.

I. INTRODUCTION

In today's digital world, all data that is processed, transmitted or received should occupy as little memory, or as few bits, as possible. Of the several techniques available for reducing the number of bits, one is the Shannon-Fano algorithm, which constructs a prefix code from a set of symbols and their probabilities or frequencies [6]. The algorithm first sorts the distinct symbols by their frequency of occurrence. The sorted symbols are then divided into two parts according to their probabilities: the first part is the most probable symbol and the second part is the remaining symbols. The remaining symbols are again divided into two parts, the second most probable symbol and the rest, and the same iterative process continues until all the symbols are separated.

Bits are then assigned to this sequence. At the first division the most probable symbol is assigned the single bit '0' and the remaining part the bit '1'. In the second iteration the second most probable symbol is again assigned '0' and the remaining part '1', and bits are assigned in this way down the whole tree. To encode a symbol we follow the path from the top of the tree to that symbol and collect all the bits along the path. The most probable symbol is therefore encoded as '0', the second most probable as "10", the third as "110", the fourth as "1110", and so on. In this way every symbol of the given data is encoded, and counting the bits of the encoded symbols verifies that the resulting data has been compressed [6] [7].

Figure 1. Chain of the Shannon-Fano algorithm.
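The chain of splits can be sketched as a small software model. The sketch below is written in Python rather than VHDL, purely as an illustration, and the function name `chain_codes` is hypothetical (it does not come from the paper's VHDL source). It assumes the symbols are already sorted by descending frequency. Note that peeling off one symbol per split is the variant described in this paper; the textbook Shannon-Fano method instead splits the sorted list into two groups of roughly equal total probability.

```python
def chain_codes(symbols):
    """Assign chain-style codes to symbols sorted by descending frequency.

    Hypothetical helper: each split gives the next symbol a '0' branch and
    the remaining symbols a '1' branch; the last symbol keeps the all-'1' path.
    """
    n = len(symbols)
    if n == 1:
        # A single symbol still needs one bit to be transmitted.
        return {symbols[0]: "0"}
    codes = {}
    prefix = ""  # bits collected on the path through the "remaining data" branches
    for i, sym in enumerate(symbols):
        if i < n - 1:
            codes[sym] = prefix + "0"  # split off this symbol with a '0'
            prefix += "1"              # everything else continues under '1'
        else:
            codes[sym] = prefix        # last symbol: nothing left to split
    return codes


codes = chain_codes(["A", "B", "C", "D", "E", "F"])
for sym, code in codes.items():
    print(sym, code)
# A 0
# B 10
# C 110
# D 1110
# E 11110
# F 11111
```

No code in the result is a prefix of another, which is what allows a decoder to split the bit stream back into symbols without separators.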

978-1-4799-6393-5/14/$31.00 ©2014 IEEE


Figure 1 shows the encoding tree for the symbols present in the data. Assume that the data contains the symbols A, B, C, D, E and F, and that this is also their order after sorting by occurrence. Symbol A has the highest probability, that is, it occurs most often in the data, so the data is divided into two parts: A, and the remaining data (R_DATA). A is assigned the bit '0' and R_DATA the bit '1'. In the next iteration R_DATA is again divided into two parts: the most frequent symbol in R_DATA, and the new remaining data (again called R_DATA). The same iteration is repeated to the end, assigning bits to each branch of the tree. The encoded bits for each unique symbol are then extracted by following the path from DATA to that symbol in Figure 1 and collecting all the bits along the path. Figure 1 thus shows that symbol A is encoded as '0', symbol B as "10" and symbol C as "110". Similarly, symbols D, E and F are encoded as "1110", "11110" and "11111". In this way the data is encoded and the given data sequence is compressed.

II. IMPLEMENTATION IN VHDL CODE

To view the results as waveforms, we simulate the VHDL code in ModelSim SE 6.4. To perform arithmetic and logical operations in VHDL, the program must declare the ieee library and use its packages; without this declaration the code cannot be simulated. To hold the input data we need an array of characters, and to count the occurrences of each symbol we need an array of integers of the same length. These types are defined in a separate package compiled into the work library, and the main program uses this package to obtain them. Any other data type that is required can be provided by a package in the same way. [1]

    library ieee;
    use ieee.std_logic_1164.all;
    -- Note: std_logic_arith and numeric_std both declare signed/unsigned;
    -- those type names become ambiguous when both packages are visible.
    use ieee.std_logic_arith.all;
    use ieee.numeric_std.all;
    use ieee.math_real.all;

    package shannon_fano_algo_package1 is
        -- Input symbols, their occurrence counts, and their code words
        type char_array    is array (0 to 14) of character;
        type integer_array is array (0 to 14) of integer;
        type binary_array  is array (0 to 14) of std_logic_vector(0 to 13);
    end package shannon_fano_algo_package1;

This is how the package is created. To use it, the main program contains the clause

    use work.shannon_fano_algo_package1.all;

and can then use all the data types defined in the package throughout the program. To perform arithmetic, logical, signed or unsigned operations, the main program also declares the full set of libraries.

We then define an entity for the Shannon-Fano technique, which describes all the inputs and outputs of the algorithm. Input_message, of the character-array type defined in our package, carries the input data, and InLimit gives the length of the data. Total_Bit_Without_Encoding, of integer type, reports the total number of bits needed to send the data without encoding, and CompressedBits reports the total number of encoded bits at the output. The entity used in the Shannon-Fano algorithm is as follows [2]:

    entity shannonfanoproject is
        port (
            Input_message              : in    char_array;
            InLimit                    : in    integer;
            sorted_occurrence          : inout integer_array;
            single_int_array           : inout integer_array;
            Total_Bit_Without_Encoding : out   integer;
            Encoded_Symbol             : inout binary_array;
            CompressedBits             : inout integer
        );
    end shannonfanoproject;

After declaring the entity we create the architecture of the Shannon-Fano algorithm, which contains all the compilable statements. Inside the architecture we use a process statement whose sensitivity list names all the signals whose changes affect the output. Any signals that are required must be declared before the begin of the architecture, and any variables before the begin of the process.

After these declarations, we first propose logic to find the unique symbols in the data. We then develop logic to count the occurrences of each distinct symbol in the given data and to arrange the symbols in descending order of frequency. After sorting, we propose logic to find the number of bits needed to send the data both with and without the Shannon-Fano algorithm. Comparing the two shows by how many bits the data is reduced, that is, compressed, by the algorithm.

The waveform shows the output results. Figure 2 shows the output for the given example, simulated with the ModelSim SE 6.4 tool. The input data is a 15-character array, which requires 45 bits (3 bits per character) without encoding. We then find the sorted occurrences of the distinct symbols in the data and from them calculate the total number of compressed bits under the Shannon-Fano algorithm. The figure shows that the data is compressed to 34 bits using this algorithm.
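The counting, sorting and bit-total steps can be modelled in a few lines of software. The sketch below is in Python, not VHDL, and is only an illustration: the function names and the 15-character message are hypothetical. The paper's actual input string is not given in the text, so the message here was chosen only because it reproduces the reported totals. The sketch also assumes that the unencoded size uses the smallest fixed width that can distinguish the unique symbols, which is consistent with 15 characters needing 45 bits but is not stated in the paper.

```python
from collections import Counter


def sorted_occurrences(message):
    """Return (symbol, count) pairs in descending order of count."""
    counts = Counter(message)  # unique symbols and their occurrences
    # sorted() is stable, so ties keep their order of first appearance.
    return sorted(counts.items(), key=lambda item: -item[1])


def bit_totals(message):
    """Return (bits without encoding, bits with the chain code)."""
    if not message:
        return 0, 0
    occurrences = sorted_occurrences(message)
    k = len(occurrences)
    # Assumption: fixed-width code just wide enough for k unique symbols.
    fixed_width = max(1, (k - 1).bit_length())
    total_bits = len(message) * fixed_width
    compressed_bits = 0
    for rank, (_, count) in enumerate(occurrences):
        # Chain code lengths are 1, 2, ..., k-1, and k-1 again for the last symbol.
        length = rank + 1 if rank < k - 1 else max(1, k - 1)
        compressed_bits += count * length
    return total_bits, compressed_bits


# Hypothetical message: occurrences 6, 4, 2, 1, 1, 1 over six symbols.
message = "AAAAAABBBBCCDEF"
print(sorted_occurrences(message))
print(bit_totals(message))  # (45, 34)
```

With these occurrences the compressed total is 6·1 + 4·2 + 2·3 + 1·4 + 1·5 + 1·5 = 34 bits, against 15·3 = 45 bits for the fixed-width form.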


Figure 2. Output of original program

A. Shannon-Fano algorithm to compress bits

C.

for i in 0 to (InLimit-1) loop single_int_array(i)
