«INFANT.MAVS» - Multimedia Model for Infants Cognitive and Emotional Development Study

Elena Lyakso¹, Aleksei Grigorev¹, Anna Kurazova¹, Elena Ogorodnikova²

¹ St. Petersburg State University, St. Petersburg, Russia
[email protected]
² Pavlov Institute of Physiology, RAS, St. Petersburg, Russia
[email protected]

Abstract. A model of a multimodal sensory environment, «INFANT.MAVS», has been developed. It comprises two bases of stimuli of different perceptual complexity: 1) simple stimuli (visual, audible, tactile and graphic) and 2) a set of complex stimuli synthesized as combinations of simple ones. The software includes a database management component and the database itself. The management component was created with Microsoft Visual Basic v. 6.0 and is designed to run on Windows operating systems. Testing of the model showed that the stimuli evoked responses in children in the form of focused attention, vocalizations, smiles and imitative activity; in adults they evoked positive emotions. These data allow us to conclude that the model «INFANT.MAVS» meets the objectives it was intended for.

Keywords: software, multimedia model, infants, children, early cognitive development, emotional development.



Studies of multimodal human-computer interaction are attracting close attention from researchers [1, 2]. An audiovisual speech synthesis model, a "talking head", was created on Russian-language material; it synchronizes the audio and visual speech modalities [3]. An information system for training auditory and verbal functions in children with hearing and speech disorders has been developed and introduced into clinical practice (rehabilitation of patients after cochlear implantation) and correctional practice [4]. There are very few such systems for young children that teach and create conditions for sensory-cognitive interaction with the outside world. More widely used are children's social networks designed for primary school children. Their main features are a predominant focus on 'edutainment' of the young user through games and parental control over the child's activities on the network. The commercial product "Audio nurse - Video nurse" is frequently used to help parents and caregivers. It is a special device that lets adults watch their child ("Video nurse") and/or hear the baby's voice ("Audio nurse") anywhere in the apartment. Another system implemented in practice is «Why Cry Baby Analyzer HCWHYCRY» [5]. It shows parents and caregivers what is happening with their baby.

After analyzing the power, frequency and intervals of the baby's crying for approximately 20 seconds, the unit lights up the corresponding illustrated face diagram on its front panel. There are five categories: hungry, bored, annoyed, sleepy and stressed. A chart offers advice on how to comfort and calm the baby according to the category. The system is designed to help parents and should not be used as a medical device. Such devices are auxiliary rather than training tools.

Our data on the features of vocal-speech interaction in “mother-child” dyads with normally developing infants, infants with neurological disorders and preverbal orphans [6, 7, 8] are the foundation for constructing the model “Virtual mother” [9], intended for orphans and children with disabilities. The training computer program stimulates the child's vocalization activity in the first 6 months of life, which leads to qualitatively more complex vocalizations in the second half-year: an expanded repertoire of sounds and the appearance of syllabic structures, providing a transition to the subsequent step of speech development, the emergence of the first words.

However, there is no special software that creates adequate conditions for the development of the sensorimotor, emotional and cognitive abilities of infants. The purpose of this project was to create a model of a multimedia sensory environment with interactive elements for infants and young children and to develop software to work with the model. The model is designed to prevent and remove the negative effects of sensory deprivation and to normalize the communicative and psychophysiological state of infants under a prolonged lack of contact with the mother and limited social interaction.
The tasks of the study were: 1) selecting stimuli of different modalities and creating a database of stimuli differentiated by degree of complexity; 2) developing the software framework; 3) testing the program to determine the effects of presenting stimuli of different modalities from the created database.




Stimuli selection and organization

A model of a multimedia sensory environment, “INFANT.MAVS”, comprising bases of stimuli of different perceptual complexity and software to work with them, has been developed. The base consists of two parts: the base of simple stimuli (BSS) and the base of complex stimuli (BCS) (fig. 1).

2.1.1 The base of simple stimuli

The base of simple stimuli (1380 files, 1.47 GB) contains directories of video (915 files, 732.3 MB), audio (401 files, 533.4 MB) and tactile (64 files, 229.6 MB) stimuli. The section "Visual stimuli" contains two subsections: video and graphics. The subsection "Graphics" includes black-white and color images. The catalog "Black-white image" includes faces, face-like stimuli with all the elements (eyes, nose, mouth, hair), and face-like stimuli with three or two elements presented in different combinations.

Fig. 1. Schematic model and conditions for its uses in testing infants and children.

This catalog also includes lines of different thickness and orientation (vertical, horizontal, inclined); lattices; simple and complex patterns (the latter consisting of a set of simple patterns); simple, two-dimensional and three-dimensional geometric figures; and images of animals and toys. The "Color image" catalog includes photographs and drawings of people, animals, birds, toys, plants, geometric shapes, household items and everyday scenes. It also contains cartoons and illustrations of fairy tales. The section "Sound stimuli" contains the subsections music and speech. The music subsection includes songs and musical tunes, "mothers' songs" and lullabies, nature sounds, physiological and everyday sounds, and acoustic stimuli. The subsection ‘speech’ comprises ‘comfortable’ infant vocalizations; samples of "mother’s" and "father’s" speech that soothe, attract the infant’s attention and stimulate vocal imitation; and nursery rhymes and poems. The section "Tactile stimuli" includes photographs of surfaces and textures.

Each stimulus is referred to by a letter code that corresponds to the type of the stimulus, followed by codes for its subtypes according to the partition used in the listing of stimuli. Stimulus modality: a - audio, v - video, g - graphics. Audio stimuli: sp - speech; ms - maternal speech (c - soothing, at - attracting attention, i - sounds to stimulate imitation, r - poetry).

2.1.2 The base of complex stimuli

The base of complex stimuli includes stimulus complexes of different modalities (177 complex stimuli and 15 compound stimuli) synthesized from simple stimuli, which provide the basis for creating an audio and/or video track. Visuals can be represented as a sequence of videos, static images and animations of specified duration. Compound stimuli are synthesized on the basis of complex stimuli. Sound and speech stimuli are stored in *.WAV format, music in *.MP3 and video in *.MPG. The organization of the base of complex stimuli provides for the storage of ready-made combinations of stimuli and allows the user to create new combinations.

The model contains a dynamic system of sensory stimulation that changes according to the age and competencies of the child, and an additional section for adult users. The section "Stimuli for infants from 0 to 6 months" includes the subsections faces, black-white images (audio-graphic and video) and lullabies (a change of pictures with appropriate musical accompaniment). The section "Stimuli for 6-12 months old infants" contains all simple stimuli of different modalities in various combinations. The special section "Stimuli for user" includes the subsection "for adult users", intended for relaxation, fatigue relief and creating a positive mood in caregivers and parents.
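The letter codes used to label stimuli in the base of simple stimuli can be illustrated with a short Python sketch. It is not part of the «INFANT.MAVS» software (which is written in Visual Basic 6.0). The code tables contain only the letters listed in the text; the "-" separator, the function name `decode_label` and the example label `a-sp-ms-at` are assumptions made for illustration, since the exact label syntax is not given here.

```python
from typing import Dict, List

# Letter codes copied from the listing of stimuli in the text.
MODALITY: Dict[str, str] = {"a": "audio", "v": "video", "g": "graphics"}
AUDIO_TYPE: Dict[str, str] = {"sp": "speech"}
SPEECH_TYPE: Dict[str, str] = {"ms": "maternal speech"}
MATERNAL_SUBTYPE: Dict[str, str] = {
    "c": "soothing",
    "at": "attracting attention",
    "i": "sounds to stimulate imitation",
    "r": "poetry",
}

# One lookup table per level of the label, from general to specific.
LEVELS: List[Dict[str, str]] = [MODALITY, AUDIO_TYPE, SPEECH_TYPE, MATERNAL_SUBTYPE]


def decode_label(label: str, sep: str = "-") -> List[str]:
    """Translate a label such as 'a-sp-ms-at' into readable names.

    The separator and the strict level order are assumptions of this sketch.
    """
    parts = label.split(sep)
    if len(parts) > len(LEVELS):
        raise ValueError(f"too many codes in label {label!r}")
    # Only audio subtypes are listed in the text, so deeper levels
    # are accepted for audio labels only.
    if len(parts) > 1 and parts[0] != "a":
        raise ValueError(f"subtype codes are only defined for audio: {label!r}")
    decoded = []
    for part, table in zip(parts, LEVELS):
        if part not in table:
            raise ValueError(f"unknown code {part!r} in label {label!r}")
        decoded.append(table[part])
    return decoded


print(decode_label("a-sp-ms-at"))
print(decode_label("g"))
```

Keeping one table per level means a new subtype can be added to the scheme by extending a single dictionary.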



Software to work with the stimuli was created. It allows users to: 1) input and store stimulus material in the directories; 2) select stimuli depending on the task (by activating a filter); 3) view and listen to the stimulus material; 4) create a complex stimulus as a compound of simple stimuli. The software includes a database management component and the database itself. The stimulus database is divided into two large, relatively independent parts that can interact with each other. The first part is the database of simple stimuli. It provides orderly storage and retrieval of the media data that are the basis for designing complex (composite) stimuli. The second part is the database of complex stimuli. These are combinations of media materials arranged in a certain order and in a specified manner according to the user's requirements.

The management component of the program is developed in Microsoft Visual Basic v. 6.0 and is designed to run under MS Windows operating systems (9x, NT, ME, 2000, XP, Vista, 7). The following external components are used: the Microsoft Visual Basic Run-time (integrated in the installation package) and the video and audio codecs installed in the user's operating system, which allow work with the following compression standards: for video, MPEG (2, 4), AVC, H.265; for sound, MP3 (MPEG-1 Layer I, II, III), AAC, WMA. The program has a graphical interface. The main window gives the user access to the database of interest. On the left is the category tree; at the top right is the control panel with 1) the playback button, 2) a button to go to the main menu, 3) a button to save the changes and filter settings. The category tree provides efficient navigation through the whole database. The software allows categories to be edited via the context menu and by ‘dragging’. The context menu is also used to open a window for editing stimuli. The created model "INFANT.MAVS" is a software product that is ready to be installed on personal computers.
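The relationship between the two parts of the database (simple stimuli as stored media files, complex stimuli as ordered combinations of them, and selection through a filter) can be sketched as a small data model. This is a minimal Python illustration under stated assumptions, not the actual Visual Basic implementation: all class, field and function names, the category paths and the file names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class SimpleStimulus:
    """One media file in the base of simple stimuli (field names are assumed)."""
    name: str
    modality: str   # "audio", "video", "graphics" or "tactile"
    category: str   # position in the category tree (hypothetical path format)
    filename: str


@dataclass
class ComplexStimulus:
    """An ordered combination of simple stimuli assembled by the user."""
    name: str
    parts: List[SimpleStimulus] = field(default_factory=list)

    def modalities(self) -> List[str]:
        """Modalities in order of first appearance, without repeats."""
        seen: List[str] = []
        for part in self.parts:
            if part.modality not in seen:
                seen.append(part.modality)
        return seen


def filter_stimuli(
    base: List[SimpleStimulus],
    modality: Optional[str] = None,
    category_prefix: Optional[str] = None,
) -> List[SimpleStimulus]:
    """Select stimuli by modality and/or by branch of the category tree."""
    selected = []
    for stimulus in base:
        if modality is not None and stimulus.modality != modality:
            continue
        if category_prefix is not None and not stimulus.category.startswith(category_prefix):
            continue
        selected.append(stimulus)
    return selected


# Hypothetical entries; the real base holds 1380 files.
simple_base = [
    SimpleStimulus("face_01", "graphics", "graphics/black-white/faces", "face_01.jpg"),
    SimpleStimulus("lullaby_01", "audio", "audio/music/lullabies", "lullaby_01.mp3"),
    SimpleStimulus("maternal_01", "audio", "audio/speech/maternal", "maternal_01.wav"),
]

# A complex stimulus built from a face image followed by a lullaby.
lullaby_with_face = ComplexStimulus(
    name="lullaby_with_face",
    parts=filter_stimuli(simple_base, category_prefix="graphics/black-white")
    + filter_stimuli(simple_base, category_prefix="audio/music/lullabies"),
)
print(lullaby_with_face.modalities())
```

In this sketch a complex stimulus holds references to simple stimuli instead of copying their media, which mirrors the description of the simple base as the source material for composite stimuli.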



The model "INFANT.MAVS" was tested on 22 normally developing children (aged from 1.5 months to 7 years) raised at home. Informed consent for the study was approved by the Ethics Committee of St. Petersburg State University (№ 02-36, 1.16.2014).

4.1 Testing procedure

Children were presented with the «INFANT.MAVS» stimuli on a computer screen. Infants of 0-6 months were lying in bed with the monitor placed approximately 25 cm in front of the baby's face; infants of 6-12 months were sitting on the mother's lap in front of the monitor; children of 1-7 years were placed in front of the monitor without an adult. The child's behavior during testing was recorded with two video cameras: one captured both the child's reactions and the stimuli presented on the monitor, and the second was focused only on the child. Adult subjects viewed the multimedia stimuli. Before the test, adults reported their state and mood in a questionnaire, and after watching they reported the sensations they had experienced. The stimulus material presented to children was combined into three groups: images with "parent speech" (stim-1); lullabies with different images (stim-2); tales and fairy tales (stim-3). The total duration of stimulus presentation was 1-2 min for children aged 0-6 months, 2-5 min for children of 6-12 months, 2-7 min for children of 1-3 years and 2-10 min for children of 4-7 years. Adults were presented with two sets of combined stimuli: "Flowers and Herbs" (5 min 17 s) and "Mood" (3 min 40 s). Video analysis was performed using Pinnacle Video Studio. Statistical analysis was carried out in «SPSS v. 20» using the Mann-Whitney test.
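For readers unfamiliar with the Mann-Whitney test, the statistic it is based on can be computed directly from its definition. The study itself used SPSS; the Python sketch below only illustrates the U statistic and is not the authors' analysis. The function name and the two samples of gaze durations are invented for the example and are not data from the study.

```python
from typing import Sequence


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> float:
    """U statistic for sample x against sample y.

    Counts the pairs (xi, yj) in which xi exceeds yj; a tie counts as half
    a pair. The statistics of the two samples always sum to len(x) * len(y).
    """
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


# Made-up gaze durations in seconds for two hypothetical age groups.
younger = [12.0, 15.5, 9.0, 20.0]
older = [30.0, 25.5, 18.0, 41.0, 33.0]

u_younger = mann_whitney_u(younger, older)
u_older = mann_whitney_u(older, younger)

# The smaller of the two values is the one compared with critical values.
print(u_younger, u_older, min(u_younger, u_older))
```

A value of U close to zero for one group means that almost every observation in that group is smaller than every observation in the other group. Obtaining a p-value additionally requires an exact table or a normal approximation, which statistical packages provide.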


Child’s reactions

The following types of child responses were selected: 1. Looking at the monitor; 2. Looking away from the monitor; 3. Smiling; 4. Crying, sounds of discomfort and/or the corresponding facial expressions; 5. Comfortable vocalizations; 6. Movement toward the screen; 7. Turning away from the screen (distraction); 8. Falling asleep or yawning. The most common reactions in children of all age groups were gaze directed at the monitor, movement toward the screen, comfortable vocalizations and smiling (fig. 2).

Fig. 2. Child’s reactions to stimulus presentation. On a horizontal axis – stimulus; on a vertical axis – the frequency of children's reactions.

In infants in the first six months of life, the presentation of black and white images with faces of children and adults (no sound) caused a smile. Comfortable vocalizations were recorded in response to nursery rhymes combined with "mother’s speech". Infants in the second half-year vocalized upon presentation of stimuli containing video with "mother’s speech" and tales or stories with music. Movement toward the screen, waving or clapping hands, rising to their feet and bouncing were registered. Lullabies caused decreased motor activity, closing of the eyes and yawning. Upon presentation of audio sequences, children turned to the sound source, smiled and vocalized. Children of one to three years demonstrated a greater range of diverse reactions. In response to music stimuli, dance movements (3-5 min) were registered in all the children; in response to color pictures of animals, they pronounced sounds imitating animal sounds. Children from the older age group (4-7 years) imitated the sounds of animals and sang; when listening to lullabies they yawned and closed their eyes. For all the stimuli presented, children above one year of age looked at the monitor longer than children in the first year of life. Significant differences were found in the fixation time for stim-1 in children 1-3 years old (p