interface of a speech database called SPEECH-DB. Our Production .... designed just for beginners. The authors ... intelligent access mechanism. SPEECH-DB,".
EXPERT SYSTEM AS AN INTELLIGENT ASSISTANT FOR COMPUTER USERS
R i i c h i r o MIZOGUCHI*, Yukuo I SOMOTO** and Osamu KAKUSHO* * I n s t i t u t e of S c i e n t i f i c and I n d u s t r i a R e s e a r c h , Osaka U n i v e r s i t y , J a p a n ** Computation Center, Osaka U n i v e r s i t y , J a p a n abstract This paper is concerned w i t h a new implement a t i o n of a p r o d u c t i o n system by u s i n g a r e l a t i o n a l DBMS and FORTRAN and i t s a p p l i c a t i o n to construction of an i n t e l l i g e n t man-machine i n t e r f a c e of a speech d a t a b a s e c a l l e d SPEECH-DB. Our P r o d u c t i o n s y s t e m can be v i e w e d as a n o n d e t e r m i n i s t i c FORTRAN programming system based on c o n t e n t - d i r e c t d e d invocation of subroutines. The i n t e r f a c e is composed of the f o l l o w i n g t h r e e subsystems: Japanese Command processing Subsystem (JCS), w h i c h t r a n s l a t e s t h e q u e r i e s s t a t e d i n Japanese i n t o commands, Problem S o l v i n g Subsystem (PSS), w h i c h d i a g n o s e s e r r o r s and answers u s e r ' s q u e s t i o n s , and MONITOR, which manages the f l o w of c o n t r o l among t h e s e s u b s y s t e m s and SPEECH-DB. D e s i g n p h i l o s o p b y and b a s i c m e c h a n i s m a r e p r e s e n t e d t o g e t h e r w i t h some examples o f t h e system performance.
I.
INTRODUCTION
A l a r g e a p p l i c a t i o n software i s a p t t o b e d i f f i c u l t f o r n o n p r o f e s s i o n a l people t o use and many complicated commands sometimes cause misuse of them. Thus, a s o f t w a r e system w i t h high f u n c t i o n s needs a n i n t e l l i g e n t man-machine i n t e r f a c e [ 1 ] L 2 ] which a s s i s t s users in using i t . We have been designing such a man-machine i n t e r f a c e f o r a database c a l l e d SPEECH-DB[3L SPEECH-DB is a speech d a t a b a s e c o n s t r u c t e d f o r a s s i s t i n g t h e r e s e a r c h o f speech r e c o g n i t i o n . It stores a l o t o f sampled speech data together w i t h t h e i r a t t r i b u t e s so as to be a b l e to r e t r i e v e them under v a r i o u s c o n d i t i o n s . I t has a command l a n g u a g e c a l l e d IQL. A l t h o u g h IQL makes i t e a s y t o r e t r i e v e d a t a , users s t i l l need to have some knowl e d g e a b o u t SPEECH-DB and IQL in advance. Our i n t e r f a c e system of SPEECH-DB is designed to have the f o l l o w i n g t h r e e f u n c t i o n s : 1) To accept q u e r i e s s t a t e d in Japanese. 2) To make it easy f o r users to l e a r n IQL. 3) To diagnose e r r o r s occurred w h i l e using IQL. In order to achieve these ends, the i n t e r f a c e has s e v e r a l k n o w l e d g e a b o u t SPEECH-DB, I Q L , Japanese sentences and so on. Our u l t i m a t e goal is to c o n s t r u c t a s o - c a l l e d "manual-less" system. Such a system should have the knowledge of what it can d o , what f u n c t i o n s i t h a s , how t h e u s e r can use t h e m , what k i n d o f t r o u b l e s t h e u s e r w i l l m e e t , and so o n . The s y s t e m can be v i e w e d as a k i n d of expert systems. One o f t h e m a j o r c h a r a c t e r i s t i c s o f our system is t h a t most subsystems are implemented by u s i n g a r e l a t i o n a l d a t a b a s e c a l l e d INQ. INQ i s
employed f o r the purpose of s t o r i n g and r e t r i e v i n g p r o c e d u r e s used f o r managing d i a l o g between t h e system and the user and f o r diagnosing e r r o r s . A p r o d u c t i o n s y s t e m [ 4 ] [ 5 ] was f i r s t implemented by u s i n g INQ and FORTRAN. Then t h e above i n t e r f a c e was implemented by using the production system. In our c o m p u t a t i o n c e n t e r , a number of a p p l i c a t i o n p r o g r a m s have been d e v e l o p e d . The o b j e c t i v e of our p r o j e c t is to c o n s t r u c t a c o n s u l t a t i o n system about these programs as w e l l as the use of the computation center i t s e l f . For t h i s purpose, we need a t o o l f o r c o n s t r u c t i o n such a system. Production system is well-known to be u s e f u l f o r c o n s u l t a t i o n system c o n s t r u c t i o n . The problem is how to implement a PS. The t o o l should b e a b l e t o u t i l i z e DBMS, g r a p h i c s and o t h e r f u n c t i o n s o f the o p e r a t i n g system o f our host computer in order to cope w i t h the v a r i e t y of the t a r g e t programs. I n our c o m p u t a t i o n c e n t e r , FORTRAN i s t h e o n l y language s a t i s f y i n g these r e q u i r e m e n t s . This is one of the main reasons why our PS has been implemented by FORTRAN. II.
OVERVIEW OF THE SYSTEM
B l o c k d i a g r a m o f our system i s shown i n F i g . 1 . I t i s composed o f seven d a t a b a s e f i l e s , s i x of which c o n t a i n speech data and one of which c o n t a i n s s e v e r a l k i n d s o f knowledge o f c o n s u l t a t i o n about the usage of SPEECH-DB. SPEECH-DB c o n s i s t e s o f s i x f i l e s such a s phoneme f i l e , CV s y l l a b l e f i l e , VCV s y l l a b l e f i l e , word f i l e , environment f i l e and raw data f i l e . The f i r s t f i v e f i l e s are c a l l e d INQ f i l e s and c o n t a i n a l o t o f i n f o r m a t i o n u s e f u l f o r r e t r i e v i n g speech data. Users can r e t r i e v e speech data by using IQL i n t e r a c t i v e l y . IQL s e l e c t s a n e x t e r n a l scheme s u i t a b l e f o r the u s e r ' s q u e r y , s o t h a t the user can get speech data he wants w i t h o u t knowing the l o g i c a l s t r u c t u r e of SPEECH-DB. B. Man-machine i n t e r f a c e A l t h o u g h IQL i s d e s i g n e d f o r u s e r s t o r e t r i e v e d a t a e a s i l y , t h e y m u s t know some s y n t a c t i c and s e m a n t i c r e s t r i c t i o n s o f IQL. So, more s o p h i s t i c a t e d m a n - m a c h i n e i n t e r f a c e i s desirable especially for nonprofessional users. Our i n t e r f a c e c o n s i s t s o f t h r e e modules such a s Problem S o l v i n g Subsystem (PSS), Japanese Command p r o c e s s i n g S u b s y s t e m (JCS) and MONITOR. PSS d i a g n o s e s t h e e r r o r s o c c u r r e d d u r i n g t h e use o f IQL and answers q u e s t i o n s asked by the user. JCS t r a n s l a t e s q u e r i e s s t a t e d i n Japanese i n t o IQL.
174
R. Mizoguchi et al.
R. Mizoguchi et al.
(1) T h i s query means "I want some WorDs whose Averaged P i t c h is greater than t h a t of the speech of ID=11 and whose Averaged Root mean square of energy i s g r e a t e r t h a n t h a t o f t h e speech I D = 1 1 . " A l t h o u g h IQL p e r m i t s n e s t e d r e t r i e v a l , w h i c h i s i n d i c a t e d by a paired brackets, it may not be used in p a r a l l e l . T h e r e f o r e , a n error o c c u r s , t h e e r r o r number is d e t e c t e d by IQL and then MONITOR i n v o k e s PSS. (2) A r u l e w h i c h has t h e e r r o r number i n i t s action part is f i r e d . (3) MONITOR c a r r i e s out the corrected commands.
175
d e s i g n e d j u s t f o r b e g i n n e r s . The a u t h o r s b e l i e v e t h a t a r t i f i c i a l command language is more e f f i c i e n t than n a t u r a l language once one gets f a m i l i a r w i t h it. Semantics-based processing of queries is m a i n l y p e r f o r m e d i n s t e a d o f s y n t a c t i c one. T h i s s t y l e o f p r o c e s s i n g i s based o n t h e f a c t t h a t q u e r i e s t o SPEECH-DB i s much r e s t r i c t e d t o t h e semantics of SPEECH-DB. IQL is designed to be used as a t a r g e t language of n a t u r a l language q u e r i e s , so t h a t it was very easy to i m p l e m e n t JCS. For e x a m p l e , IQL can s e l e c t a n e x t e r n a l scheme a p p r o p r i a t e f o r t h e q u e r y . F u r t h e r m o r e , i t does n o t c a r e about t h e o r d e r o f c o n d i t i o n e l e m e n t s . These char a c t e r i s e t i c s o f IQL makes t h e t r a n s l a t i o n easy d r a s t i c a l l y . I n t h e c u r r e n t i m p l e m e n t a t i o n , JCS can accept a l m o s t q u e r i e s r e a l i z a b l e in IQL and can r e j e c t queries which are beyond t h e a b i l i t y of IQL w i t h some messages i n d i c a t i n g the i n a p p r o p r i a t e p a r t of them. JCS has two k i n d s of d i c t i o n a r i e s . One c o n t a i n s some keywords corresponding to the a t t r i b u t e names and t h e i r values used in SPEECH-DB. The other contains some words h a v i n g s y n t a c t i c i n f o r m a t i o n w h i c h a r e used f o r e x a m i n i n g t h e r e l a t i o n s h i p between c o n d i t i o n s . JCS do not have to know any o t h e r words stored in the above two d i c t i o n a r i e s , since a q u e r y composed of s u c h w o r d s can n o t be r e a l i z a b l e by IQL at a l l . V I . CONCLUSIONS Although the implementation of our i n t e r f a c e is based on a new p r o d u c t i o n s y s t e m , t h e whole model i s r a t h e r a d hoc. I n o r d e r t o r e a l i z e the " m a n u a l - l e s s " system m e n t i o n e d e a r l i e r , several problems should be resolved. 1) To e s t a b l i s h a psychological model of a u s e r . 2) To e s t a b l i s h a t h e o r y of p r o c e e d i n g w i t h conversation based on the u s e r ' s model. 3) To i n c o r p o r a t e the r e s u l t s of i n t e l l i g e n t CAI i n t o our system. These are the subjects to be attacked at the next step in our study.
(1) T h i s query means "I want some WorDs w h i c h c o n t a i n S Y l l a b l e / k u / w i t h RanK=A and S Y l l a b l e / r i / w i t h RanK=A." (2) The commands " ? " is i n p u t by the u s e r , s i n c e the r e t r i e v e d d a t a i s s l i g h t l y d i f f e r e n t f r o m those he expected to have. (3) Users who do n o t know t h e module name w h i c h answers h i s questions can say what they want to do i n Japanese. ( 4 ) The c o n d i t i o n RK1=A, w h i c h means t h e corresponding p a r t o f speech i s s t a t i o n a r y , i s the p r o p e r t y not of s y l l a b l e but of phoneme in SPEECHDB. So, PSS asks the user h i s i n t e n t i o n . D u r i n g t h e d i a l o g i n b o t h c a s e s , PSS c a l l s JCS when necessary, so t h a t users can i n p u t t h e i r r e q u i r e m e n t s in Japanese. T h i s makes t h e i n t e r a c t i o n very f r i e n d l y . V. JCS JCS t r a n s l a t e s q u e r i e s s t a t e d i n Japanese i n t o IQL and a s s i s t s u s e r s i n l e a r n i n g IQL b y s h o w i n g t h e t r a n s l a t e d IQL commands. JCS is
[REFERENCES] [ 1 ] P . Hayes e t a l . : " B r e a k i n g t h e man-machine communication b a r r i e r , " IEEE Computer, Vol. 14 p. 19 (1981). [2]M.R. Genesereth:"The r o l e of plans in automatic c o n s u l t a t i o n , " Proc. o f t h e 6 t h I J C A I , p . 311 (1979). [3]R. M i z o g u c h i e t a l . : " S p e e c h database w i t h a n i n t e l l i g e n t access mechanism SPEECH-DB," Trans, of I n f o r m a t i o n Processing of Japan, Vol. 24 (1983). [ 4 ] R . Davis e t a l . : " A n o v e r v i e w o f p r o d u c t i o n s y s t e m s , " Machine I n t e l l i g e n c e , V o l . 8 , p.300 (1977). [5]R. Mizoguchi et a l . : " H i e r a r c h i c a l production s y s t e m , " Proc. o f t h e 6 t h I J C A I , p.586 (1979). [6]E. S h o r t l i f f e : "Computer-based medical c o n s u l t a t i o n s : MYCIN," A m e r i c a n E l s e v i e r , New York (1976). [7] W. Clancey: " T u t o r i n g r u l e s f o r g u i d i n g a case method d i a l o g u e , " I n t e r n a t i o n a l J o u r n a l ManMachine S t u d i e s , V o l . 1 1 , p. 25 (1979).