Mobile Devices Converted into a Speaking Communication Aid

Bálint Tóth, Géza Németh, Géza Kiss
Department of Telecommunications and Media Informatics
Budapest University of Technology and Economics
1117 Budapest, Magyar tudósok krt. 2., HUNGARY
[email protected], {nemeth, kgeza}@tmit.bme.hu

Abstract. The goal of the present study is to introduce a speaking interface on mobile devices for speech-impaired people. The latest devices (including PDAs with integrated telephone, Smartphones and Tablet PCs) possess numerous favorable features: small size, portability, fast processors, increased storage, telephony, large displays and a convenient development environment. Standardized, easy-to-use speech I/O is missing, however. The majority of vocally handicapped users are elderly people who are often not familiar with computers. Many of them have other disorders (e.g. motor) and/or impaired vision. The paper reports the design and implementation aspects of converting standard devices into a mobile speaking aid for face-to-face and telephone conversations. The device is controlled and text is entered via the touch screen, and the output is generated by a text-to-speech system. The interface is configurable (screen colors and text size, speaking options, etc.) according to the user's personal preferences.

1 Introduction

The estimated number of severely speech-impaired people living in the European Union is two million [1]. Speech-impaired people have many difficulties both in face-to-face and in telephone conversations. The problem is more severe if the loss of speaking ability is recent: the newly disabled person has difficulty making him/herself understood, which may induce anxiety, frustration, even depression. Consequently, “talking machines” can solve not only communication problems but also psychological and social ones. Furthermore, impaired speech can often cause uncomfortable situations (e.g. difficulties in asking for a glass of water or food, or expressing the need to go to the toilet), and unfortunately the lack of vocal communication can also have fatal consequences (e.g. a person cannot phone for help when s/he feels the symptoms of a heart attack).

With the intensive development of technology, the desire of disabled people to approach the conditions of a fully abled person as closely as possible can be fulfilled more and more. Nowadays there are several devices with audio capability and telephony. But which would be the best for such a purpose? First of all we had to consider two important facts:

a) The majority of vocally handicapped users are elderly people who are rarely familiar with computers. Many of them have other disorder(s) (e.g. motor) and/or impaired vision.

b) For everyday use the device must be as small as possible while still having a rather large display panel. Even a laptop computer causes difficulties (especially for elderly people). For example, if a speech-impaired person would like to go shopping, s/he has to pack the laptop at home and carry it, although it is rather large and heavy. To use it, s/he has to take the laptop out of its bag, and at the end of the conversation put it back. The whole process is uncomfortable. With smaller devices the process is much easier. Furthermore, the input method should be easy and fast, and an integrated telephone is important too. The application requires fairly large storage and a reasonably fast processor.

2 Problem Statement

Considering the two aspects given above, we faced the problem of developing an interface for a small, portable device intended for speech-impaired people who are rarely familiar with computers and probably have vision, motor and/or mental disabilities. Therefore the decision was not easy. The choice of device is significant, because with inappropriate hardware even the best software fails easily.

2.1 Related work

We examined several devices and systems in order to take advantage of their features, design concepts and effects on users, and to avoid their imperfections. In this study we did not examine the old mechanical machines [2], only the newer devices and interfaces. Let us have a look at the most important systems:

a) The example of a Swedish speech-impaired teenager is very instructive [3]. He first used a Swedish text-to-speech (TTS) system as a communication aid in 1978. This case demonstrated the social, psychological and everyday help that a speaking machine provides. From that time on there was no doubt about the benefits of a speaking device.

b) The Multi-Talk system was a real communication aid (Galyas and Rosengren, 1989) [3]; it was portable and had several important and handy services. We considered many of these features useful (stored messages, repetition of the last word or sentence, configurable display, etc.), though telephony is not supported and developing software for Multi-Talk is difficult for third parties.

c) In the early 90s, as laptop computers became increasingly accessible to the average user, they also became suitable for “speaking” purposes. One of the beneficial speaking applications is VOXAID (Gábor Olaszy and Géza Németh, 1993) [4]. The usefulness of the system is demonstrated, among others, by an elderly non-speaking woman who has been using it for about ten years now. With the help of an acoustic telephone adaptor the system is able to make phone calls, but the telephone interface is not integrated. The system has disadvantages as well: everyday use of a laptop is uncomfortable, and VOXAID is black and white, which can often cause difficulties for visually impaired people. (Configurable text and background colors result in better visibility and a vision-independent interface.) In our application we inherited some features from VOXAID (free text input, formerly fixed text) and improved others (better telephony).

2.2 Analysis

Summarizing the observations, we considered free and fixed text input and telephony to be the beneficial features of the former systems. We also added new features to the software according to the needs of speech-impaired users. User requirements were kept in mind during the whole development process. Some of the new features are as follows (described in more detail below): formerly stored, swiftly editable text (“Partly-fixed text”); a configurable user interface with presets; easy usage, even with one hand, “hot keys”.

2.3 Approach

Considering that we need a small, portable device with rather large memory, a built-in sound module, telephony and reasonable processor performance, we chose the currently popular mobile devices, including Pocket PC based Personal Digital Assistants (PDAs) with integrated telephone, Tablet PCs, Smartphones and Symbian based mobile phones. Each device has its own special profile: mobile phones are best for telephone conversations, Tablet PCs for face-to-face conversation, and PDAs for both, with some compromises. Currently the PDA version of the software is ready, and we are implementing the application on the other platforms as well.
In our view, choosing mobile devices was the best first step towards creating a scalable speaking interface, though we had to make compromises. These devices have several deficiencies (see 4.2), but they can still serve our purpose when the most critical problems are solved with a smart speaking interface.

Some augmentative communication aids already exist on PDAs. WinSpeak [5] and IconSpeak [6] are very similar to each other: they display icons on the screen, every icon stands for a word, and the user can build sentences by clicking on the icons. These programs are rather rigid and not suitable for a wide range of uses, because speech-impaired people cannot adapt them to their needs and the software does not provide text input. There are some multifunctional speaking applications for PDAs as well (like Polyana 3 [7]), but they are able to speak only the language that the application was written for. In other words, no standardized speech I/O has been developed yet for these mobile systems. Furthermore, these systems are expensive: not only the hardware, but the software too. Consequently there was no doubt that we had to develop our own customizable, multifunctional speaking application.

3 Interface Description

Our concept was to create an intuitive interface whose basic usage can be easily learned by a non-computer user in a few hours with the help of the documentation. We included many partly hidden functions, which make communication faster (see below: Enhanced features, Administrator functions). The application’s name is “MonddKi!”, which means “SayIt!” in English.

3.1 Main Functions

There are three main functional parts of the program, called “Free Text”, “Fixed Text” and “Partly-fixed Text”.

In Free Text one can see a notepad-like editable text field. The program may read a line (Fig. 1, “Hello ICCHP!”), a selection (Fig. 1, “king”) or the whole content of the text field (Fig. 1, “Hello ICCHP! I’m MonddKi, a speaking app!”). The text can be saved (File/Save), loaded (File/Load) and erased (File/New) at any time. The content of the text field remains the same if the mode is changed to Partly-fixed or to Fixed Text. Cut/Copy/Paste/Clear commands are also available.

In Fixed Text mode the user can choose from a set of categorized sentences (Fig. 1). The categories may have subcategories, the subcategories may have further subcategories, and so on. For example, the category “The way you feel” has subcategories like “common”, “disease”, “pain”, etc. These categories contain sentences like “I feel well.”, “I’m sick.”, “I have a headache.”, etc. The number, length and depth of the categories and sentences are limited only by the memory size of the device. The selected sentence can be read by clicking on it, or on the Say icon/menu item. The selected sentence can also be copied to Free Text for additional changes.
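The nested category structure of Fixed Text can be pictured as a simple tree. The following Python sketch is only an illustration and not the MonddKi implementation: the class name `Category`, its fields and the `find` and `depth` helpers are hypothetical, and the assignment of each example sentence to a particular subcategory is an assumption based on the order in which the examples are listed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Category:
    """Hypothetical Fixed Text node: sentences plus nested subcategories."""
    name: str
    sentences: List[str] = field(default_factory=list)
    subcategories: List["Category"] = field(default_factory=list)

    def find(self, path: List[str]) -> Optional["Category"]:
        """Follow subcategory names downward; return None if a step is missing."""
        node = self
        for name in path:
            node = next((c for c in node.subcategories if c.name == name), None)
            if node is None:
                return None
        return node

    def depth(self) -> int:
        """Number of levels in this subtree (a category without children is 1)."""
        return 1 + max((c.depth() for c in self.subcategories), default=0)


# Example data taken from the text; the sentence-to-subcategory mapping is assumed.
feelings = Category(
    "The way you feel",
    subcategories=[
        Category("common", sentences=["I feel well."]),
        Category("disease", sentences=["I'm sick."]),
        Category("pain", sentences=["I have a headache."]),
    ],
)

pain = feelings.find(["pain"])
print(pain.sentences if pain else "no such category")
```

Because subcategories are ordinary `Category` objects, the tree can be nested to any depth, which matches the statement that only the memory of the device limits the structure.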

Figure 1. Free text (left) and fixed text (right)
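Using the example text of Figure 1, the three reading units of Free Text (a line, a selection, the whole content) can be sketched as one helper that picks the string to hand to the TTS engine. This is a minimal sketch under stated assumptions: the function name `text_to_speak`, the `mode` values and the character-offset representation of cursor and selection are hypothetical, and the line break between the two sentences is assumed from the figure description.

```python
from typing import Optional, Tuple


def text_to_speak(text: str, mode: str, cursor: int = 0,
                  selection: Optional[Tuple[int, int]] = None) -> str:
    """Return the part of `text` that would be sent to the TTS engine.

    mode "all":       the whole content of the text field
    mode "selection": the characters in the (start, end) range `selection`
    mode "line":      the line that contains the character offset `cursor`
    """
    if mode == "all":
        return text
    if mode == "selection":
        if selection is None:
            raise ValueError("selection mode needs a (start, end) range")
        start, end = selection
        return text[start:end]
    if mode == "line":
        # Line start is just after the previous newline (or offset 0).
        start = text.rfind("\n", 0, cursor) + 1
        # Line end is the next newline (or the end of the text).
        end = text.find("\n", cursor)
        if end == -1:
            end = len(text)
        return text[start:end]
    raise ValueError(f"unknown mode: {mode}")


# Assumed layout of the Figure 1 example: two lines in the text field.
content = "Hello ICCHP!\nI'm MonddKi, a speaking app!"
k = content.index("king")

print(text_to_speak(content, "line", cursor=3))
print(text_to_speak(content, "selection", selection=(k, k + 4)))
```

Keeping the choice of text separate from the speech output itself would let the same helper serve any TTS back end.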

Figure 2. Partly-fixed text

The Partly-fixed Text is a combination of editable and formerly fixed text. It has two parts. The first part is almost the same as the Fixed Text. When the user selects one or more sentences, the second part (Fig. 2) is invoked, where some previously defined parts of the sentence(s) can be edited. When the user clicks on a non-editable area, the cursor jumps to the nearest editable text. One can move between the editable texts with the two arrows. The whole text, a line or a selected area can be read by the software. For example we have the “Meet at