Unknown Example Detection for Example-based ... - Semantic Scholar
Recommend Documents
In the categorization task of multiple classifications, the best results were achieved for the classes' mass mailer, backdoor and virus (no benign classes).
Abstract. This paper presents a novel technique for distinguishing images of forms from other document images. The proposed algorithm detects regions which ...
The best results were achieved by using 3-6 n-grams and a profile of ... achieved for the classes' mass mailer, backdoor and virus (no benign classes). In.
patch is provided by the operating system provider (if needed) and the antivirus vendors ... detection operations are performed locally at the host level. ..... best feature selection and the top features to evaluate the unknown worms detection.
classifiers using a set of labelled (as malware and legitimate software) and unlabelled instances. ..... bulk mail), Good Word Attack is a method that mod- ifies the ...
Abstract. This paper presents new results for the assessment of reliability of predictive interval maps constructed using a consistency criterion with respect to a ...
Sep 29, 2005 - screen and continuous point-source illumination from a powerful xenon ..... of firearms research, such as forensic investigation of point-blank.
Supervised machine-learning models have been used to solve this issue, ... proaches improve the accuracy of fully unsupervised methods (i.e., no labels within.
the hybrid of the contact force and the contact position feedback is developed. ..... and approaching the virtual contact points to the desired contact points on the ...
A search from mk along the normal direction of the edge vi,vi+1 yields a distance ... This is ac- complished through an observation function that predicts the view.
plored area that a robot can cover with its sensors upon reaching that location. ..... laser-range scanners and a Pioneer I robot equipped with a single laser ...
Mar 18, 2013 - a novel method to support semantic queries in relational databases with ease. Instead of casting ontology
Key to the autonomous exploration of an unknown area, by a scientific robotic rover is ... indicates the most common rock type over a very large area. It does not ...
Jan 2, 2013 - Wilhelm Busch's drawings (a famous German author and illustrator). However, Wilhelm Busch did not mention bed bugs in any of his stories (K.
Ted Roush, Andrew Moore, Martial Hebert, Dimitrious. Apostolopoulos, Peter Cheeseman and all the participants in CMU's RAMS program. This work was.
by the Covariance Intersection algorithm, which conducts the search only along a ... obtain a consistent estimate of the covariance matrix when two random.
visited node we know the number of unvisited edges leaving the node, but we ... [8, Problem 22-3] will explore any unknown Eulerian graph with at most 2m ...
algorithms can be traced back to Sutherland 79] who presents an outline of a proof for .... direction, it returns to the center of the square and tries again another 90 degrees to .... (1) Choose an arbitrary xed direction call it Finit and face this
Vinod Prabhu, James Johnston,. Duncan Ingrams, Carl Passant. Royal Gwent Hospital, Newport, UK. Abstract. The authors report the case to understand.
strategy to solve a unknown object local manipulation problem with an multifingered robot hand. .... based on the reforcement learning approach. Th.Schlegl and.
In this paper we present a variation of the template attack .... eâ 1. 2 (miât)Câ1 i. (miât)T. (3). Using the known plaintext (or ciphertext), the intermediate target ...
Autonomous long range navigation in partially known planetary-like terrains is ... bler [7]), to most recent developments throughout the world [10, 44, 52, 31, 54].
Jul 1, 2015 - or dyspareunia are the most common reasons to visit the ... vaginitis (DIV) is a chronic inflammatory process of unknown etiology, characterized.
German sentence Der SchloÃturm beherbergt das Museum, where ... der (figure 3) has an ambiguous lexical entry because its argument, the noun it combines.
Unknown Example Detection for Example-based ... - Semantic Scholar
Dialog strategy and topics. A spoken dialog systemâTakemaru-kun" has ... message is made simpleï¼ and its characteristics are also designed. å¸14ã´4. å¥lã ...
03-5
Unknown Example Detection for Example-based Spoken Dialog System
Graduate School of Information Science, Nara Institute of Science and Technology {shotルt,kawanami,sawatari,shikano}@かnaist.jp framework has two advantages regarding system maintenance. The first is ease of database construction. The second is the ability to treat utterances that are hard to be described by a dialog model. Especially on a one-question-one-answer dialog system [3], this framework only requires the example and response pairs (QA pairs), which are even easy to construct. Nevertheless using this framework, the expansion cost of the dialog system is still expensive since example construction includes utterance labeling costs. Therefore, a method that reduces the expansion cost is important for continuous operation of a dialog system. In the following sections, we discuss the labeling cost reduction method using a similarity score. We also show the experimental results of reducing the number of newly labeled expansion data.
Abstract In a spoken dialog system, the example-based response generation method generates a response by searching
a
dialog
example
database
for
the
example question most similar to an input user utterance. That method has the advantage of ease of system expansion. It requires, however, a number of utterance examples whose correct responses are labeled. In this paper, we propose an approach to reducing the system expansion cost. This approach employs
a
detection
method
that
screens
the
unknown examples, the utterαnces 10 be added 10 Ihe dalabαse
with
Iheir
correct
experimental resulls show
responses.
that Ihe method
The can
reduce the number of utterances required to be labeled
while
maintaining
the
syslem
response
accuracy improvement as well as full labeling
2. Speech-oriented information guidance 1. Introduction
system“Takemanトkun"
Spoken dialog systems have been investigated in the field of the interfaces on some infonnation search systems [1, 2, 3, 4, 5]. However, few case studies of long-tenn operation have been reported [6]. In view of a search tool, a dialog system is required to work stably. In view of medium of infonnation or communication, the system is required to be updated the dialog data to deal with the latest information. Especially in long-tenn operation, reflecting users' interest in the system response (e.g. local infonnation and seasonal infonnation) is important [7]. Thus it is important to choose a system framework that can easily update the contents it provides. The example-based spoken dialog framework [3, 4, 5] has been investigated for the sake of easy system expansion. This system employs the dialog example database to select the most similar example to the user input speech (recognition results). The
We have been operating a spoken dialog system “Takemaru・kun" [3] in the north community center of Ikoma City for over 6 years to investigate a real environment spoken dialog system. Through the long-term operation, we have also constructed a spontaneous speech database with speech infonnation tags (transcription, age groups, correct response, etc.) 2.1. Dialog strategy and topics
A spoken dialog system“Takemaru-kun" has been installed at the community center To make it familiar to a novice user, this system uses the agent character“Takemaru-kun," and the dialog is designed to be simple and friendly. Thus, the dialog strategy is based on the one-answer-one question dialog strategy, the response message is made simple, and its characteristics are also designed.
司14 ヴ4 句lム
Takemaru・kun mainly provides the brief guidance about the facilities of the community center and local sightseeing, and it also provides simple chat-like self-introduction and latest information such as current time, weather or news. Recently, we have improved the web search task by constructing a language model using a web keyword corpus [7]
Speech recognition results User input
2.2. Database
AII system inputs have been collected from the initial operation. We have constructed a spontaneous speech database from the first' two years of collected speech data. ln this paper, the database is employed in the experiments, and the features of that database are described in the section 5. More details of the database features are described in the references [7, 8]. We have also been collecting the unlabeled spontaneous speeches for four years and those data are available for an unsupervised training of an acoustic model [6].
ql掴吋
D1abg-Em-,system res凹nse example
Ouestion and Answer
User input
Figure
System respon田
Database (QADB)
1.
Response
generation
flow
in
example-based spoken dialog system.
3. Example-based spoken dialog system
An example-based spoken dialog system mainly employs an example database that stores pairs of user unerance information and its correct (appropriate) system response. When a user input occurs, the system selects the most similar dialog example to the user input and generates its response [Fig. 1]. Common frameworks of the system [1, 2] have these features; (a) multi-tum dialog is assumed as dialog strategy, (b) the example data includes a user input, (c) the system states (dialog context), the correct response, and the example database are often constructed with RDBMS (relational database management system). In the answering process, a search query is generated by the user input parsing to search dialog examples. As for the framework in Takemaru-kun, one question-one-answer dialog is assumed. Thus the framework is much simpler than the common one: the example data does not contain a system state, and the user input is compared with the example question directly so that the RDBMS is not employed. [n Takemaru-kun, that example database is called "question and answer database," QADB. This framework can avoids the input parsing e汀or that occurs when the speech recognition fails, especially when using ASR-QADB methodology, which uses the automatic speech recognition (ASR) results as the example question data [9]
コ弘 一一_//:-�" EN classifier S田町山EE主主 因 HLJ;!回一
ロADB
S阿酬問問n曲 expansion
Figure 2. Spoken dialog system operation with QADB expansion.
4. Unknown example detection
For QADB expansion, we collect the example from the user speech for reflecting the users' interests. Temporally for QADB expansion, we collect examples合om the user speech to reflect the users' Înterests. However, it is inefficient to manually anach transcription and a correct answer for all of the speech since some u口erances may exist in the QADB. Therefore, we propose a method to detect examples that are“unknown" (not included) in the existing QADB examples [Fig. 2]. In our framework, the similarity of input and example are calculated on the surface expression level (bag-of-words). Thus, for the QADB expansion, it is enough to judge the similarity of collected unerances and existing example questions in the initial QADB.
円〆臼 ヴt
4.1. QADB expansion with unknown example
Table 1. Dataset features
Aug. 2003 1,053 Adult Child ド543 同v. 2002 - Julv.2003 Period Adult 10,588 昨 of data Child じ8,440 Sep. 2003 - Oct.2004 Period 8,795 保of data 同ul Child 50,906 b2.2% Ratio of un・lA.dult Iknown ex. 140.2% Child
Evaluation lPeriod �ata 1# of data
detection
At the first, noise data are screened out of the collected user speech. Next, valid utterance data are processed by the ASR engine. Then the utterances are screened by unknown example detection using their ASR results. The detected unknown utterance will be labeled its correct response by a human labeler. ASR-QADB methodology also avoids the transcription process.
Initial rammg
はiti
4.2. Similarity score
The INMS (intersection normalization by maximum size) method is employed to caIculate the similarity score [9]. This method is based on word matching scoring, and it calculates the ratio of corresponding words over the maximum word count of the two sentences. This score takes the value between 0.0 and 1.0, and the higher value is meant to be more similar between two sentences. If the maximum score of a sentence between QADB examples becomes lower than the pre-defined threshold, the sentence is judged as "unknown." When the threshold is set lower, the labeling cost (amount of detected unknown utterances) becomes lower, although the true unknown examples will be rarely appended (thus, the system response accuracy will not improve). The proper threshold is on the trade-off point where the labeling cost becomes lower while the response accuracy improves.
Table 2. ASR conditions
内SR ngme s ti UL C O lI C M odel
anguage L恥f odel
Julius[10] Ver 3.5.3, 10-best-sentence output. Retrained介�AS model with 2・year Takemaru speech data 3・state L2R HMM, 2,000 triphone PTM, 64 GMM per state Corpus: Transcription of 2・year Takemaru speech data, morphological analysis with ChaSen[11] Smoothing: Witten・Bell method
In the experiments, we use the valid utterances of “Takemaru-kun" sp巴ech data that have already been labeled with the correct responses. See the next sectlO n. 5.2. Dataset
We employ the "Takemaru・kun speech database" for the experiments. This database consists of the user speech collected by the dialog system "Takemaru-kun," discussed in the section 2. The numbers of the responses in the whole utterance database are 275 and 285 for adult and child uロerances, respectively. In the experiment, we divide the valid u口erance data into three parts, 9 months of data for the initial QADB, 1 month for the evaluation data, and 14 months for the QADB expansion data [Table 1]. ASR conditions are shown in Table 2.
5. Experiments 5.1. Procedures
We prepare the three valid utterance datasets, initial QADB construction data, QADB expansion data and the evaluation data. In the experiment, firstly, the unknown example utterances are detected from the QADB expansion data. Next, the expansion QA pairs are constructed by labeling the correct response to detected u悦rances. Then the updated QADB is constructed by appending the expansion QA pairs to the initial QADB. Finally, the system evaluations are performed on the initial QADB and the updated QADB The system evaluation measure is response accuracy and gross detected inputs. Response accuracy is the rate of correctly answered evaluation data by the evaluated system. Gross detection ratio is the ratio of detected utterances in the expansion data. Gross detection ratio is meant to show the utterance labeling cost.
5.3. Results and Discussions
Figure 3 shows the relations of the response accuracy and the gross detected inputs for each score threshold of the detection method. Each point in the chart shows the result on each step of score threshold e = [0.0, 1.1] with 0.1 steps. e = 1.1 means“over 1.0" in order to treat all the expansion data as unknown. For comparison, manually transcribed data are also evaluated as reference results.
内ペリ 司i 司lA
一「
C ost reducti:m and RA in provem ent 包dult)
一一一一一一一一一一一ー 8
L.� � ... ... ・.' ・φ
0 8
aE恒 sc e SEE50
位
8 7
8
6 7
77.8 (447Iut.J
2 0 8 6 4 2 0 7 7 6 6 6 6 6
az出 aEEZ詰Sz
Lee
Conversationa1Interface for Weather Information," IEEE Transaction on Speech and Audio Processing, vo1.8, no.1,
Kiyohiro
Shikano,
・‘Pub1ic
Speech-Oriented
Guidance
System with Adu1t and Child Discrimination Capability," Proc. ofIC.\SSP2004,vol.1,pp.433-436,2004 [4] Hiroya Murao, Nobuo Kawaguchi, Shigeki Matsubara, Yukiko Yamaguchi, Kazuya T紘eda, Yasuyoshi Inagaki, “Example-based Example
Spoken
Dialog
Augmentation,"
System
Proc.
of
with
Online
ICS LP2004,
Spec4402p・7‘pp.3073・3076,2004. [5] Cheongjae Lee, Sangkeun Jung‘Seokhwan Kim, Gary Geunbae practical
Lee, "Example-based multi-domain
dialog
dialog
modeling
system,"
[9]
ASR
and
QA
in
ロ
園田cr:.
「←speech 6. speech
e且07
伝言訂 主事='
U
3.
1.1 150袋路u1)
ul �
U
M
U
1 8""10
61.6 (27590
61.0 ι�一一一一一一 0.9 171初叶 u
(il孟)1 (iút.) ,
72.6! I
M
�
M
U
1
ross de也cti>n !8出
Labeling
cost
reduction
with
Dotted line with round points shows the 閃ference evaluation (transcription input with transcription QADB, shown by ‘'transcr."). Solid line shows the actual evaluation (ASR resu!t input with ASR司 QADB, shown by“speech"). Hollow symbols show the resu!ts of initial QADB. Each point shows the result on each step of predefined score threshold 8 = [0.0, 1.1] with 0.1 steps. The number under the point shows the RA on the predefined threshold and the parenthesized number shows the number of det疋cted exoansion utterance data. Shota
Takeuchi,
Tobias
Hiroshi
Cincarek,
Hiromichi
Saruwatari,
Kiyohiro
roiecls/chasen-I己gacv/
[11] Akinobu Lee, Tatsuya Kawahara, Kiyohiro Shikano, “Julius --- an Open Source Real-Time Large Vocabulary Recognition
Engine,"
pp.1691・1694,2001
for
Real
EUROSPEECH 2007, pp. 1469・1472,2007 [7] Jumpei Miyake, Shota Takeuchi,Hiromichi Kawanami. Hiroshi Saruwatari, Kiyohiro Shikano,日Language Model for the Web Search Task in a Spoken Dialog System for Children," in Proc. of WOCC1, Chania, Greece. October 2008 [8] Shota Takeuchi Cincarek Tobias, Hiromichi Kawanami. “
1
-.-.-一一 一ー.
_____
-
一・一国nscr.
[10] !lttn://soun:efome. i%
Environment Speech-oriented Guidance Task", in Proc. of
Hiroshi Saruwatari, Kiyohiro Shikano,
0.9
0.8
Using Speech Recognition Results," INTERSPEECH 2008,
Speech
a
0.7
pp.45 1・454,Sep,2008.
[6] Tobias Cincarek,Izumi Shindo, Tomoki Toda, Hiroshi for
0.6
Shikano, "Question and Answer Database Optimization
Saruwatari, Kiyohiro Shikano,“Development of Preschool Subsystem
Labeling cost reduction is desired for the long-term operation of the spoken dialog system. Detecting the unknown example utterances will reduce the labeling cost to expand the QADB while maintaining improvement of the response accuracy
“JUPITER:
0.1
G
6. Conclusions
Hetherington,
7 o
|
ー令-speech l 三一竺竺E竺剖
72
58
Po1ifroni, Christine
5 a '
4 7
About adult speech data, 8 = 0.8 brings a cost reduction to about 50%, with 3.3 points of response accuracy improvement, which is similar to full labeling (8 > 1.0). 8 = 0.4 brings a cost reduction to 15% (ratio of about a day per week working period), with 1.3 points of response accuracy improvement. On the other hand, the response accuracy for child data is saturated even if the initial QADB is expanded, since that initial QADB has sufficient examples. Considering the experimental resu!t, the cost reduction of QADB expansion for the long-term operation is achievable with our unknown example detection method.
Construction
and Optimization of a Question and Answer Database for a Real-environment Speech-oriented Guidance System." Proc. Of Oriental COCOSDA 2007,pp.1 49・154,2007