Exploring Recognition Network Representations for Efficient Speech Inference on Highly Parallel Platforms

Jike Chong∗†, Ekaterina Gonina∗†, Kisun You‡, Kurt Keutzer∗

∗ Department of Electrical Engineering and Computer Science, University of California, Berkeley
† Parasians, LLC
‡ School of Electrical Engineering, Seoul National University
[email protected], [email protected], [email protected], [email protected]

Abstract

The emergence of highly parallel computing platforms is enabling new trade-offs in algorithm design for automatic speech recognition. It naturally motivates the following investigation: do the most computationally efficient sequential algorithms lead to the most computationally efficient parallel algorithms? In this paper we explore two contending recognition network representations for speech inference engines: the linear lexical model (LLM) and the weighted finite state transducer (WFST). We demonstrate that while an inference engine using the simpler LLM representation evaluates 22× more transitions per second than the advanced WFST representation, the simple structure of the LLM representation allows 4.7-6.4× faster evaluation and 53-65× faster operand gathering for each state transition. We use the 5k Wall Street Journal corpus to experiment on the NVIDIA GTX480 (Fermi) and the NVIDIA GTX285 Graphics Processing Units (GPUs), and illustrate that the performance of a speech inference engine based on the LLM representation is competitive with the WFST representation on highly parallel computing platforms.

Index Terms: Continuous Speech Recognition, Data Parallel, Graphics Processing Unit, Linear Lexical Model, Weighted Finite State Transducer
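To make the per-transition cost difference concrete, the CUDA sketch below contrasts the operand access patterns the two representations imply. It is an illustration only, not the inference engine evaluated in this paper: the kernels, array names (stateScore, selfLoop, forward, Arc), and flattened layouts are hypothetical, and observation probabilities, cross-word transitions, and pruning are omitted.

// Illustrative sketch only; names and layouts are hypothetical.

// LLM-style: within-word states follow a fixed left-to-right layout, so each
// thread derives its operand addresses from its own index (unit-stride reads).
// forward[s] is assumed to hold the weight of the transition from state s-1 into s.
__global__ void llmTransitions(const float* stateScore, const float* selfLoop,
                               const float* forward, float* nextScore, int nStates) {
    int s = blockIdx.x * blockDim.x + threadIdx.x;
    if (s >= nStates) return;
    float stay = stateScore[s] + selfLoop[s];                            // contiguous read
    float move = (s > 0) ? (stateScore[s - 1] + forward[s]) : -1.0e30f;  // neighbor read
    nextScore[s] = fmaxf(stay, move);
}

// WFST-style: arcs are irregular, so each transition gathers its source score
// and weight through explicit indices stored in an arc record.
struct Arc { int src; int dst; float weight; };

__global__ void wfstTransitions(const float* stateScore, const Arc* arcs,
                                float* arcScore, int nArcs) {
    int a = blockIdx.x * blockDim.x + threadIdx.x;
    if (a >= nArcs) return;
    Arc arc = arcs[a];                               // one indirection per arc
    arcScore[a] = stateScore[arc.src] + arc.weight;  // gathered operands
}

The first kernel's addresses are a function of the thread index alone, whereas the second must dereference arc records before touching its operands; this irregular, data-dependent access is the kind of per-transition operand gathering that the LLM representation largely avoids.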


1. Introduction

Highly parallel computing platforms such as graphics processing units (GPUs) are enabling tremendous parallel computing capabilities in personal desktop and laptop systems. This shifting landscape in the underlying computing platforms needs to be taken into account during algorithm design and evaluation. In this paper we explore the use of GPUs for continuous speech recognition (CSR) on the NVIDIA GTX480 (Fermi) and the NVIDIA GTX285 GPUs.

A CSR application analyzes a human utterance from a sequence of input audio waveforms to interpret and distinguish the words and sentences intended by the speaker. Its top-level architecture is shown in Figure 1. The recognition process uses a language database that is compiled offline from a variety of knowledge sources using powerful statistical learning techniques. The speech feature extractor collects feature vectors from the input audio waveforms. The Hidden-Markov-Model-based inference engine infers the most likely word sequence based on the extracted speech features and the recognition network.

In a CSR system, common speech feature extractors can be parallelized using standard signal processing techniques. A parallel inference engine traverses a graph-based knowledge network consisting of millions of states and arcs. As

[Figure 1: Top-level architecture of the CSR application and architecture of the inference engine. Voice Input → Speech Feature Extractor → Speech Features → Inference Engine → Word Sequence ("I think therefore I am"); the recognition network comprises the Acoustic Model, Pronunciation Model, and Language Model. The inference engine performs one iteration per time step, with a compute-intensive Phase 1.]
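As a rough picture of what the inference engine's per-time-step graph traversal involves, the CUDA sketch below evaluates one transition per thread over a flattened arc list. It is a minimal sketch under stated assumptions, not the engine described in this paper: the Transition fields and array names are hypothetical, scores are kept as fixed-point integers so that atomicMax() applies, and active-set management, pruning, and backtrack bookkeeping are omitted.

// Minimal sketch; nextScore is assumed to be initialized to INT_MIN before launch.
struct Transition { int src; int dst; int weight; int obsId; };

__global__ void traverseTransitions(const Transition* arcs, int nArcs,
                                    const int* curScore,   // fixed-point score of each source state
                                    const int* obsProb,    // fixed-point observation likelihoods for this frame
                                    int* nextScore)        // best incoming score per destination state
{
    int a = blockIdx.x * blockDim.x + threadIdx.x;
    if (a >= nArcs) return;

    Transition arc = arcs[a];
    // Gather operands: source-state score, transition weight, observation likelihood.
    int score = curScore[arc.src] + arc.weight + obsProb[arc.obsId];
    // Many arcs share a destination state; keep the Viterbi maximum atomically.
    atomicMax(&nextScore[arc.dst], score);
}

One such kernel launch covers every transition in the active portion of the network for a single audio frame, which is why the per-transition evaluation and operand-gathering costs compared in the abstract dominate the traversal time.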
