Bio-Inspired Soft-Computational Framework for Speech and Image Application

Dipjyoti Sarma and Kandarpa Kumar Sarma

Abstract Artificial Neural Network (ANN) based recognition systems depend on both data and hardware to achieve good performance. This work describes the use of DSP processors to design a bio-inspired soft-computational framework with which speech and image inputs are processed. A nonlinear activation function suited to implementation on the DSP processor is also designed and configured to train a soft-computational tool such as an ANN. The results show that the capability of the ANN improves with the proposed DSP processor framework, and that its performance is further enhanced by using an approximation of the tan-sigmoidal nonlinear activation function. In terms of computational capability, the proposed approach shows around 12% improvement over a conventional framework. Similarly, the recognition rate improves by around 4% in applications involving speech and image samples.

Keywords: ANN, Bio-inspired, Recognition.

Dipjyoti Sarma, Department of ECT, Gauhati University, e-mail: [email protected]
Kandarpa Kumar Sarma, Department of ECT, Gauhati University, e-mail: [email protected]

J. C. Bansal et al. (eds.), Proceedings of Seventh International Conference on Bio-Inspired Computing: Theories and Applications (BIC-TA 2012), Advances in Intelligent Systems and Computing 201, DOI: 10.1007/978-81-322-1038-2_5, © Springer India 2013

1 Introduction

Applying digital signal processing and bio-inspired soft-computing tools such as the Artificial Neural Network (ANN) to speech and image signals is computationally demanding. Computation in an ANN resembles that of the brain: like the brain, an ANN employs many computational elements that work concurrently. ANNs that perform speech recognition and synthesis, or pattern classification, consist of a large number of neurons and inputs. Every neuron computes a weighted sum of its inputs and applies a nonlinear function to the result [1]. The ANN recognizes patterns based on the input information and the weights learned during training. However, the use of ANN classifiers remains constrained by the availability of hardware powerful enough to provide sufficient speed during training. The basic operation performed by a neuron during classification can be written as

$$Y = f\left(\sum_i x_i w_i + b\right) \qquad (1)$$

where $x_i$ are the inputs, $w_i$ the corresponding weights, $b$ the bias and $f$ the nonlinear activation function.

Thus, for each classification the network must perform one multiplication and one addition for every connection, which translates to a few billion multiply-add operations per second. Only parallel implementations, in which several connections are evaluated concurrently, achieve such computational power [2]. General-purpose personal computers (GPPC) and workstations are the most popular computing platforms used by researchers to simulate ANN algorithms. They provide a convenient and flexible programming environment, and technology advances have been rapidly increasing their performance and reducing their cost [3]. But ANN simulations for image and speech signals can still overwhelm the capabilities of even the most powerful GPPC. A supercomputer reduces the required CPU time, but it is too expensive to be a practical solution. DSP processors offer a convenient way around this constraint. They are parallel processing blocks by design, with a host of features that make them suitable for real-time signal processing.

Bio-inspired processing shows all the features of an advanced form of real-time signal processing with supporting cognitive capabilities. Therefore, any bio-inspired system design must combine real-time computation with cognition. Such a setup, designed using DSP processors, is proposed here. We focus specifically on the implementation of certain ANN-based applications involving image and speech inputs. The proposed architecture shows a distinct advantage in processing power and cognitive capability over the conventional approach of implementing a soft-computational framework such as an ANN for image and speech applications. In terms of computational capability, the proposed approach shows around 12% improvement over a conventional framework. Similarly, the recognition rate improves by around 4% in applications involving speech and image samples.
This paper focuses on the design of a bio-inspired soft-computational framework using DSP processors. The DSP processor's high throughput and its capability of executing millions of instructions per second yield better computational results, since the ANN is by design a parallel architecture. A prototype of the work using parallel processing is also reported in [4]. The role played by a parallel computing environment in increasing the processing performance of real-time applications involving speech and image processing is shown here, and it provides certain insights into bio-inspired system design. Experimental results show that a multicore CPU arrangement helps the ANN to learn applied patterns better.

Section 1 provides a brief introduction to bio-inspired tools and related topics. In Section 2, certain important features of DSP processors for bio-inspired design are discussed; a brief introduction to the TMS320C6713 is also provided in that section. The system model and the experimental steps are discussed in detail in Section 3. Results of the experiments are provided in Section 4. Finally, the work is concluded in Section 5.

2 Key Features of the DSP Processor for Bio-Inspired Design

Features such as speed, cost-effectiveness, reprogrammability in the field and energy efficiency make the DSP processor suitable and advantageous for bio-inspired soft-computing tool design. DSPs differ from ordinary microprocessors in that they are specifically designed to rapidly perform the sum-of-products operation required in many discrete-time signal processing algorithms. They contain parallel multipliers, and functions implemented by microcode in ordinary microprocessors are implemented by high-speed hardware in DSPs. Since a DSP does not have to perform some of the functions of a high-end microprocessor such as an Intel Pentium, it can be streamlined to have a smaller size, use less power and cost less [5]. Most of these processors share common features that support high-performance, repetitive, numerically intensive tasks and lower the computational complexity. Some of the key advantages are a specialized CPU architecture, multiply-and-accumulate (MAC) units and multiple execution units, efficient memory access, circular buffering, a dedicated address generation unit and specialized instruction sets. As eq. 1 shows, the multiplications and additions of an ANN map mostly onto the MAC operation of the DSP processor. The basic DSP arithmetic processing blocks are registers, multipliers, arithmetic logic units (ALUs) and shifters, which work in parallel during the same clock cycle and thus optimize MAC as well as other arithmetic operations for faster computation.

The TMS320C6713 DSP processor has been used here. It is a floating-point DSP processor of the TMS320C6x (C6x) family manufactured by Texas Instruments (TI). During the training phase of an ANN, the outputs of the nodes and the adapted weights are generally floating-point values, so a floating-point processor provides more reliable and precise results than a fixed-point processor. DSP processors such as the TMS320C6x (C6x) family are fast special-purpose microprocessors with a specialized architecture and an instruction set appropriate for signal processing. The C6x notation designates a member of TI's TMS320C6000 family of DSP processors. The architecture of the C6x is considered well suited to digital signal processing. Based on a very long instruction word (VLIW) architecture, the C6x is considered to be one of TI's most powerful processors [6].

The TMS320C6713 DSK used in the experimental work contains the TMS320C6713 digital signal processor. The TMS320C6713 is a high-performance floating-point DSP with a working frequency of up to 225 MHz, which corresponds to a single instruction cycle of about 4.4 ns, and its fixed-point and floating-point units together deliver a computational speed of up to 1.3 GFLOPS. The TMS320C6713 processor consists of three main components: the CPU core, memory and peripherals. The CPU contains eight functional units that can operate in parallel and two sets of registers, and addresses are 32 bits wide. The on-chip program memory bus is 256 bits wide. Peripherals include the enhanced direct memory access (EDMA) controller, low-power logic, the external memory interface (EMIF), serial ports (McBSP), I2C interfaces and timers.

The C6713 DSK is a low-cost standalone development platform that enables users to evaluate and develop applications for the TI C67xx DSP family. It also serves as a hardware reference design for the TMS320C6713 DSP. Schematics, logic equations and application notes are available to ease hardware development and reduce time to market. Figure 1 shows the functional block diagram of the DSK.

Fig. 1 Functional Block Diagram Of TMS320C6713 DSK.

3 System Model and Experimental Details

The ANN is applied to speech and image data using a feed-forward ANN trained with the backpropagation algorithm. The samples, generated from different sources, contain speech extracts and face captures. Some of the samples are mixed with noise. The sample sets thus generated contain a sizeable amount of data for use with the proposed system. Of these, about 25% are used as the training set, another 25% for validation and the rest for testing the recognizer. The speech samples are recorded with variable sampling rates between 8 Kbps and 16 Kbps. The soft-computational framework is designed with the DSP processor and its performance is compared with that of an Intel Dual Core processor. Figure 2 shows the process logic of the framework.
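The partitioning of the sample set described above can be sketched in C as follows. The type `split_sizes` and the function `split_samples` are hypothetical names introduced for illustration; the paper states only the approximate proportions, so integer rounding here is an assumption.

```c
#include <stddef.h>

/* Number of samples assigned to each subset. */
typedef struct {
    size_t n_train;   /* about 25% of the samples */
    size_t n_valid;   /* about 25% of the samples */
    size_t n_test;    /* the remaining samples */
} split_sizes;

/* Split n_total samples into training, validation and test subsets.
 * Integer division rounds down, so any remainder goes to the test set. */
split_sizes split_samples(size_t n_total)
{
    split_sizes s;
    s.n_train = n_total / 4;
    s.n_valid = n_total / 4;
    s.n_test  = n_total - s.n_train - s.n_valid;
    return s;
}
```

For a set of 200 samples this gives 50 for training, 50 for validation and 100 for testing.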

Fig. 2 Process Logic Diagram.

Table 1 Configuration of the Intel Dual Core and TMS320C6713 processors.

Parameter    Intel Dual Core    TMS320C6713
Frequency    2 GHz              225 MHz
Memory       2 GB               256 KB

The configurations of both processors are shown in Table 1. The preprocessed image and speech data are applied to the ANN, which is simulated on both the TMS320C6713 DSP processor and the Intel Dual Core processor. The ANN is configured as shown in Table 2. Although the experiment was run with several numbers of input neurons (10, 25, 50, 100, 150 and 200), 50 input neurons was chosen as the optimum value in view of the memory available and the number of epochs required for training. The number of output neurons is 4, since 4 different patterns are recognized. The same ANN is also simulated with another set of transfer functions, in which a tanh-like nonlinear transfer function is used in the hidden layer instead of the log sigmoid. The simulation on the TMS320C6713 DSP processor is written in C and is built and executed on the processor with Code Composer Studio, version 3.3 (CCS 3.3). On the Intel Dual Core processor, the simulation is run in a Linux environment, also using C.

Table 2 Configuration of Artificial Neural Network.

Parameter               Value
No. of input neurons    50
No. of hidden neurons   Varied between 0.5 and 2.5 times the number of input neurons
No. of output neurons   4
No. of patterns         4
Transfer functions      log sigmoid, log sigmoid, log sigmoid
Learning rate           0.5

Speech Processing Application

The speech signals are collected using the Mic input of the TMS320C6713 DSK and sampled. The sampled data are corrupted by noise, with a signal-to-noise ratio (SNR) of ±5 dB. Next, the noisy data are passed through a filter block. The pre-emphasis filter is a digital filter whose coefficients can be adjusted so as to update its frequency characteristics [7]. For the speech signal, a transposed equiripple FIR filter is used. Of the several filter structures designed on the TMS320C6713, the transposed equiripple FIR structure is found to provide the minimum mean square error (MSE) as well as a shorter processing time [8].
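The transposed FIR structure mentioned above can be sketched in C as follows. The filter length `FIR_TAPS`, the type `fir_transposed` and the function `fir_transposed_step` are assumptions made for illustration; the actual equiripple coefficients used in the paper are not given and would come from a filter design tool.

```c
#include <stddef.h>

#define FIR_TAPS 5   /* hypothetical filter length N */

typedef struct {
    float coeff[FIR_TAPS];       /* b[0..N-1], e.g. from an equiripple design */
    float state[FIR_TAPS - 1];   /* registers of the transposed structure */
} fir_transposed;

/* Process one input sample with a transposed direct-form FIR filter:
 *   y      = b[0] * x + s[0]
 *   s[k]   = b[k+1] * x + s[k+1]   for k = 0 .. N-3
 *   s[N-2] = b[N-1] * x
 * Every tap is a multiply-accumulate on the same input sample x. */
float fir_transposed_step(fir_transposed *f, float x)
{
    float y = f->coeff[0] * x + f->state[0];
    for (size_t k = 0; k + 2 < FIR_TAPS; ++k) {
        /* state[k + 1] still holds its old value at this point */
        f->state[k] = f->coeff[k + 1] * x + f->state[k + 1];
    }
    f->state[FIR_TAPS - 2] = f->coeff[FIR_TAPS - 1] * x;
    return y;
}
```

Feeding a unit impulse into a zero-initialized filter returns the coefficients in order, which is a convenient check that the structure is wired correctly.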

Image Processing Application

Ten images are captured using the computer's webcam, and both the clean and the noise-corrupted versions of each image are used for the experiment. Various preprocessing steps are applied, and the image data are then used for ANN training.

4 Experimental Results

For various numbers of hidden neurons, the number of epochs and the processing time with the TMS320C6713 and the Intel Dual Core are shown in Tables 3 and 4, using speech data and image data as patterns respectively.

Table 3 Comparison of epochs and processing time for various numbers of hidden neurons with TMS320C6713 and Intel Dual Core using speech data patterns.

Hidden layer   Epochs required   Epochs required     Total time (s)   Total time (s)
neurons        (TMS320C6713)     (Intel Dual Core)   (TMS320C6713)    (Intel Dual Core)
25             492               612                 4.07             3.67
50             417               506                 3.71             2.88
75             276               390                 2.66             1.73
100            205               314                 2.10             1.34
125            360               522                 3.43             2.27

Table 4 Comparison of epochs and processing time for various numbers of hidden neurons with TMS320C6713 and Intel Dual Core using image data patterns.

Hidden layer   Epochs required   Epochs required     Total time (s)   Total time (s)
neurons        (TMS320C6713)     (Intel Dual Core)   (TMS320C6713)    (Intel Dual Core)
25             549               719                 3.91             3.45
50             361               594                 3.28             2.19
75             185               403                 2.82             1.30
100            148               336                 2.41             1.05
125            395               458                 3.67             2.74

As Tables 3 and 4 show, for both the speech and the image data the number of epochs, and hence the processing time, required to reach an MSE of $1 \times 10^{-3}$ is lower for the TMS320C6713 than for the Intel Dual Core processor. When the number of hidden-layer neurons equals twice the number of input-layer neurons, the processing speed is found to be better than in the other cases. In this case, compared with the Intel Dual Core processor, the TMS320C6713 processor improves processing speed efficiency by around 43% and 37% for image data and speech data respectively. However, if we compare the processing times for hidden layers of 1.5 times and 2.5 times the number of input-layer neurons, taking the case of twice the number of input-layer neurons as reference, we see that the processing time for 2.5 times is much higher than for 1.5 times. It can therefore be concluded that the soft-computational framework processes fastest when the number of hidden-layer neurons lies between about 1.5 and 2 times the number of input-layer neurons.

Figure 3 shows the plot of MSE versus number of epochs. It clearly indicates that the TMS320C6713 processes the ANN faster than the Intel Dual Core processor. A plot of the number of hidden-layer neurons versus the average number of epochs required, for both image and speech data, to reach an MSE of $10^{-3}$ is shown in Figure 4.

Fig. 3 Plot of MSE vs Number of Epochs for TMS320C6713 and Intel Dual Core. (Axes: MSE, logarithmic scale from $10^{0}$ to $10^{-3}$, versus number of epochs, 0 to 350; curves: using TMS320C6713 and using Intel Dual Core.)

Next, the ANN is trained for 2 seconds with both speech and image data. Tables 5 and 6 show the resulting performance. The tables show that, as the number of hidden-layer neurons varies, there is a difference not only in the MSE but also in the recognition efficiency of the framework formed by each processor. The recognition efficiency is better with the TMS320C6713 DSP processor than with the Intel Dual Core processor; the difference in percentage between the processors is listed in Tables 5 and 6. With 100 neurons in the hidden layer, the recognition efficiency is found to increase by around 4% and 5% with the TMS320C6713 compared with the Intel Dual Core, for image and speech data respectively.

Table 5 Performance of TMS320C6713 and Intel Dual Core processor for 2 seconds with speech data.

Hidden layer   MSE attained      MSE attained        Difference in recognition efficiency
neurons        (TMS320C6713)     (Intel Dual Core)   between TMS320C6713 and Intel Dual Core
25             0.103             0.5549              1%
50             0.0621            0.2811              3%
75             0.0006            0.0239              5%
100            0.0002            0.0077              5%
125            0.0337            0.0926              2%

Fig. 4 Plot of Hidden Neurons vs Processing Epochs for TMS320C6713 and Intel Dual Core. (Axes: processing epochs, 100 to 800, versus number of hidden neurons, 20 to 140; curves: using TMS320C6713 and using Core 2 Duo.)

Table 6 Performance of TMS320C6713 and Intel Dual Core processor for 2 seconds with image data.

Hidden layer   MSE attained      MSE attained        Difference in recognition efficiency
neurons        (TMS320C6713)     (Intel Dual Core)   between TMS320C6713 and Intel Dual Core
25             0.0744            0.0880              0%
50             0.0031            0.0097              5%
75             0.0007            0.0045              2%
100            0.0003            0.0029              4%
125            0.0319            0.0612              2%

4.1 Evaluation of the Design and Implementation of tanh Activation Function

The hyperbolic tangent (tanh) sigmoid function is one of the most frequently used activation functions in backpropagation ANN applications. This activation function (referred to as "tansig" in Matlab) provides, at the output of a neuron, a nonlinear function that has a tanh-like transition between the lower and upper saturation regions and is given by eq. 2:

$$f(n) = \frac{e^{n} - e^{-n}}{e^{n} + e^{-n}} \qquad (2)$$

This function is not suitable for direct digital implementation on devices such as digital signal processors or programmable logic, as it consists of an infinite exponential series. Many implementations use a lookup table for approximation. However, the amount of hardware required for these lookup tables can be quite large, especially if a reasonably accurate approximation is required [9]. A simple second-order nonlinear function exists which can be used as an approximation to a sigmoid function [10]. This nonlinear function can be implemented directly using digital techniques.

$f(n) = 1$, for $L \le n$; $f(n) = h(n)$, for $-L$
