2018 3rd International Conference for Convergence in Technology (I2CT) The Gateway Hotel, XION Complex, Wakad Road, Pune, India. Apr 06-08, 2018
HMM Based Emotion Detection in Games: An Aperçu Prof. Prerna Mishra
Prof. Saurabh Ratnaparkhi
Assistant Professor, Department of CSE-IT, RTMNU, Nagpur, Maharashtra, India
[email protected]
Assistant Professor, Department of CSE-IT, JDCOEM, Nagpur, Maharashtra, India
[email protected]
Abstract— A question has long been asked: do games really need emotional intelligence? With the advances in artificial intelligence and extensive research into computer internals, game design has evolved toward emotion detection and feedback. A further question follows: what if game characters could recognize your emotions and change the gameplay to suit your mood? In this paper we present a Hidden Markov Model (HMM) based system for real-time recognition of a player's emotions. The input to the model is video of six emotions: joyful, rage, astonish, antipathy, scared and sorrow. The work builds on existing methods and proposes a system model of HMMs for automatically segmenting and identifying facial expressions (emotions) from video sequences.
Among the many modes of expression, humans use facial expressions as the most basic. Over the past two decades, researchers have carried out many studies on recognizing emotions through facial expressions, and automatic methods have been constructed for recognizing emotions from facial expressions in images or videos [1, 2, 3, 4, 5]. Chen [5], Chen et al. [6], and De Silva et al. [7] proposed methods of emotion recognition from voice and video; their methods used all of the temporal information displayed in the videos. In [2], the authors used left-to-right models with three states to model each kind of facial expression. The benefit of this model lies in the fact that it is natural to model a temporal event with a model that begins in a fixed starting state and invariably ends in the goal state. The Hidden Markov Model has wide applicability to many classification and modeling problems; perhaps its most widespread application is speech recognition.
Index Terms— HMM, Emotion Detection, Games, Gameplay, AI
I. INTRODUCTION
The gaming industry is one business that constantly endeavors to bridge the gap between technology and humans to produce a new kind of experience. The industry has gone to great lengths to provide amusement while also helping to polish motor skills and develop tactical thinking, thus contributing to the overall wellbeing of a player. On the physical side, the gaming world has firmly embraced games that make the player use motor skills in order to clear a level. On the emotional side, it has made some progress with emotion-aware gaming, which recognizes and interprets the player's emotions in real time. Emotion recognition and affective involvement are nowadays well-recognized traits of intelligent gaming systems [7]. Facial expression detection determines the basic individual emotions such as rage, panic, abhorrence, sorrow, joy and shock. It has been argued that to truly achieve efficient and intelligent human-computer interaction, computers must be able to communicate with the user naturally, the way normal human-to-human communication takes place. Humans interact with one another not only through speech, but also through body gestures, to accentuate particular parts of the speech and to display emotions.
978-1-5386-4273-3/18/$31.00 ©2018 IEEE
II. ROLE OF EMOTION DETECTION IN GAMES
Artificial intelligence has permeated our lives; from Apple's Siri to Affectiva, the latest advances have augmented games' capacity to assimilate emotion recognition into their software. Such software uses information from the player's camera to trace reactions of emotional distress and simultaneously amends what happens in the game. Emotion-aware games are about to usher in a new era. These games observe the player's fear, stress or anxiety and change the gameplay according to the player's emotion, making the game harder to finish. The game arouses negative feelings in the player and lets the gamer make intelligent decisions while experiencing them; this mechanism prepares the player to confront real-world situations in a composed way. The main motive of this technology is to have the gamer deal with challenges in games first, which eventually helps him surpass stressful situations in real life. The mechanism works as follows: if the player is stressed and responds pessimistically to some circumstance in the game, the game becomes harder; on the other hand, if the player is calm and placid, the game becomes more comfortable. The game responds to the player on the basis of the emotion he displays. Accordingly, the player may find himself abruptly ambushed in a precarious stage, and is thus forced to stay tranquil.
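As an illustration only, the stress-adaptive mechanism described above might be sketched as a small difficulty controller. The emotion labels reuse the paper's six categories, but the thresholds and step sizes below are our own assumptions, not values from the paper:

```python
# Hypothetical sketch of emotion-driven difficulty adaptation.
# The 0.1 step and the grouping of labels are illustrative assumptions.

def adapt_difficulty(current: float, emotion: str) -> float:
    """Raise difficulty when the player shows negative emotion, lower it
    when the player is calm. `current` is a level in [0.0, 1.0]; `emotion`
    is a per-frame label from an emotion recognizer."""
    if emotion in ("scared", "rage", "sorrow"):    # stressed response
        current = min(1.0, current + 0.1)          # game becomes harder
    elif emotion == "joyful":                      # calm / positive response
        current = max(0.0, current - 0.1)          # game becomes more comfortable
    return current                                 # other labels leave it unchanged
```

A game loop would call this once per recognized emotion label and feed the result into level generation.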
Gamers will have the unique ability of driving gameplay with their emotions. However, the gaming world is only the onset: a future is not far away in which the technology will become even better at scrutinizing, and even foreseeing, emotional reactions. Emotionally intelligent technology may aid improved interaction between humans and computers, who increasingly communicate via digital means. Conceivably, hyper-customized products like video games will not only make players feel more prized, but also give them the possibility of relating to each other.

III. HMM USEFUL IN EMOTION DETECTION

Many researchers have employed five diverse classifiers, including Support Vector Machine (SVM), Hidden Markov Model (HMM), and Hidden-state Conditional Random Field (HCRF) approaches. HMMs are ideal for drawing inferences about emotional state, as emotions change over time and are moderately well resolved by forming observations of a particular trait such as a player's vocal gestures and facial expressions. The next step in exploiting an HMM is deciding which observable traits to use in the model. HMMs follow a typical machine learning outline: after acquiring the data and extracting the essential features, a training set is split into features and given as input to the model so that it can be trained. After training, sample data is fed directly to the classification portion of the system, which returns a label for the emotion. The display of a facial expression in video is represented by a temporal sequence of facial gestures. This paper discusses six such HMMs, one for each of the six emotional states: joyful, rage, astonish, antipathy, scared and sorrow.
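To make the per-emotion arrangement concrete, the sketch below scores an observation sequence against one discrete HMM per emotion using the forward algorithm and returns the best-scoring label. The parameter values in the test are toy numbers; the paper publishes no trained matrices:

```python
import math

def forward_log_likelihood(obs, pi, A, B):
    """log P(obs | model) via the forward algorithm.
    pi[i]   - initial probability of state i
    A[i][j] - transition probability from state i to state j
    B[i][k] - probability of emitting symbol k in state i"""
    alpha = [pi[i] * B[i][obs[0]] for i in range(len(pi))]
    for t in range(1, len(obs)):
        alpha = [sum(alpha[i] * A[i][j] for i in range(len(pi))) * B[j][obs[t]]
                 for j in range(len(pi))]
    return math.log(sum(alpha))

def classify(obs, models):
    """Label `obs` with the emotion whose HMM scores it highest.
    `models` maps emotion name -> (pi, A, B)."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))
```

In a full system, `obs` would be a quantized feature sequence extracted from the face-tracking front end, and one such model would be trained per emotion.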
Fig.1. State diagram of HMM
The training procedure of the system is as follows:
1. Feed all six HMMs with the continuous (labeled) facial expression sequence. Each expression sequence contains several instances of each facial expression, with neutral instances separating the emotions.
2. Obtain the state sequence of each HMM to form the six-dimensional observation vector of the higher-level HMM, i.e.
$O^h_t = [q^{(1)}_t, \ldots, q^{(6)}_t]$  (1)
where $q^{(i)}_t$ is the state of the $i$-th emotion-specific HMM.
3. Learn the probability observation matrix for each state of the high-level HMM using
$P(q^{(i)}_t = j \mid S_t = k)$ = expected frequency of model $i$ being in state $j$, given that the true emotion label is $k$, so that
$B^{(h)} = \big[\prod_{i=1}^{6} P(q^{(i)}_t = j \mid S_t = k)\big]$  (2)
where $j = 1, \ldots, N$, with $N$ the number of states.

IV. PROPOSED SYSTEM
The fundamental operation of the emotion detection system follows [8], in which the first step is face detection. The machine captures an image from a camera or from video stills; by segmenting skin color it detects the person's skin, followed by detection of the human face. The background is then separated from the image to obtain the region of interest of the input image. The second step is normalization, to discard noise and standardize the face to balance luminosity and pixel position. In the third step features are extracted and all inappropriate features are discarded. In the final stage the vital gestures are classified into six emotions: joyful, rage, astonish, antipathy, scared and sorrow. The model here consists of six main states with one extra bland state, as shown in Figure 1. The bland state is essential because for the largest fraction of time instances there is no emotion on a person's face. Every transition between the six emotions is forced to pass through the bland state, as it is assumed that the human face returns to a bland position before it shows a new emotion.
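The forced pass through the bland state amounts to a structural constraint on the high-level transition matrix: direct emotion-to-emotion entries are zero. The sketch below builds such a matrix; the self-transition probability `stay` is an illustrative assumption, not a trained value:

```python
EMOTIONS = ["joyful", "rage", "astonish", "antipathy", "scared", "sorrow"]
STATES = ["bland"] + EMOTIONS   # seven high-level states, bland first

def build_transition_matrix(stay=0.8):
    """Transition matrix in which every change of emotion must pass
    through the bland (neutral) state. `stay` is the assumed probability
    of remaining in the current state; the remaining mass is spread over
    the allowed targets."""
    n = len(STATES)
    A = [[0.0] * n for _ in range(n)]
    for i, s in enumerate(STATES):
        A[i][i] = stay
        if s == "bland":
            # bland may move to any of the six emotions
            for j in range(1, n):
                A[i][j] = (1.0 - stay) / 6
        else:
            # an emotion may only return to bland, never jump directly
            A[i][0] = 1.0 - stay
    return A
```

In training, these non-zero entries would be re-estimated from the transition frequencies described in steps 4 and 5, while the zero entries stay zero.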
4. Calculate the transition probabilities of the upper-level HMM using the transition frequencies, in the training sequences, from each of the six emotional states to the bland state and from the bland state to all other emotion states.
5. Set the initial probability of the upper-level HMM to one for the bland state and zero for all other emotional states. This forces the model always to start from the bland state and presumes that a person displays a bland expression at the beginning of any video sequence; this assumption is made purely to simplify testing.
The system model for recognizing emotions from a video is illustrated by Figure 2 and Figure 3 (the system block diagram).
1) Pre-processing block: In the initial phase the image is converted to gray scale, the requisite image is balanced, noise is removed, and brightness and color-effect normalization is accomplished by histogram. Higher decompositions of the Discrete Wavelet Transform and Histogram Equalization are used for expression/gesture and lighting-effect normalization.
2) Feature Extraction block: In this phase facial traits are extracted using an edge detection technique and Principal Component Analysis (PCA), which is used for color reduction and for targeting the global details of a face.
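The histogram-based brightness normalization of the pre-processing block can be sketched as classic global histogram equalization of an 8-bit grayscale image (the DWT-based normalization the paper also mentions is omitted here):

```python
import numpy as np

def equalize_histogram(gray: np.ndarray) -> np.ndarray:
    """Equalize an 8-bit grayscale image through its cumulative histogram."""
    hist = np.bincount(gray.ravel(), minlength=256)  # per-level pixel counts
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # CDF value of the darkest used level
    if cdf[-1] == cdf_min:             # constant image: nothing to spread
        return gray.copy()
    # remap each gray level through the normalized CDF, back to 0..255
    lut = np.clip(np.round((cdf - cdf_min) * 255.0 / (cdf[-1] - cdf_min)),
                  0, 255).astype(np.uint8)
    return lut[gray]
```

This spreads the used intensity range over the full 0–255 scale, which reduces the influence of overall scene brightness before feature extraction.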
By the stated approach, the six emotions (joyful, rage, astonish, antipathy, scared, sorrow) can be recognized. The first intention was to quantify the performance of the system and to identify the strengths and limitations of the approach, in order to improve the overall recognition rate of the system.

V. CONCLUSION

With the advancements in research and development of upcoming interactive games, developers have the potential to shape not just the future of interactive video games but also the fields of interactive training methods, education, and more. We recognize emotions from facial expressions without any effort, but dependable facial recognition by computer middleware remains a challenge. An ideal emotion detection model should recognize expressions regardless of individual variation. In this paper, an overview of an implementation of the defined emotion extraction model using HMMs for real-time emotion detection from video has been presented. For a better recognition rate, care should be taken with the position and orientation of the face with respect to the camera, as these can cause wide variability in image appearance. One of the primary limitations of all the work reported so far on emotion recognition from real-time video is the lack of a benchmark database for testing different algorithms.
3) Training and testing block: In this block SVM, a neural network or Random Forest can be used.
REFERENCES
[1] K. Mase, "Recognition of facial expression from optical flow", IEICE Transactions, E74(10):3474–3483, October 1991.
[2] T. Otsuka and J. Ohya, "Recognizing multiple persons' facial expressions using HMM based on automatic extraction of significant frames from image sequences", Proc. of International Conference on Image Processing (ICIP-97), pages 546–549, Santa Barbara, CA, USA, Oct. 26-29, 1997.
[3] Y. Yacoob and L.S. Davis, "Recognizing human facial expressions from long image sequences using optical flow", IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(6):636–642, June 1996.
[4] M. Rosenblum, Y. Yacoob, and L.S. Davis, "Human expression recognition from motion using a radial basis function network architecture", IEEE Transactions on Neural Networks, 7(5):1121–1138, September 1996.
[5] L. S. Chen, "Joint processing of audio-visual information for the recognition of emotional expressions in human-computer interaction", PhD thesis, University of Illinois at Urbana-Champaign, Dept. of Electrical Engineering, 2000.
[6] L. S. Chen et al., "Emotion recognition from audiovisual information", Proceedings of IEEE Workshop on Multimedia Signal Processing, pages 83–88, Los Angeles, CA, USA, Dec. 7-9, 1998.
[7] L. C. De Silva, T. Miyasato, and R. Natatsu, "Facial emotion recognition using multimodal information", Proceedings of IEEE Int. Conf. on Information, Communications and Signal Processing (ICICS'97), pages 397–401, Singapore, Sept. 1997.
[8] K. Hone, "Empathic agents to reduce user frustration: The effects of varying agent characteristics", Journal of Interacting with Computers, Vol. 18, No. 2, pp. 227–245, 2006.
[9] Ruth L. Diaz, "Violent video game players and non-players differ on facial emotion recognition", Wiley Journal of Aggressive Behavior, 24 August 2015.
[10] Rode Snehal, Manjare Chandraprabha, "Emotion Detection of Speech Signals with Analysis of Salient Aspect Pitch Contour", International Research Journal of Engineering and Technology, E-ISSN: 2395-0056, Vol. 03, Issue 10, Oct. 2016.
Fig. 2. The Framework for Emotion Detection
From the above figure, we can see that the HMM follows a learning pattern. After obtaining the data and extracting the required features, a training set is partitioned into features and given as input to the model so that it can learn by adjusting its probability matrices. After the models are trained, facial expressions are recognized according to the captured images.
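The frequency-based adjustment of the upper-level probability matrices (training steps 4 and 5 above) can be sketched as simple relative-frequency counting over a labeled emotion sequence. The toy sequence in the test is invented for illustration:

```python
from collections import Counter

def estimate_transitions(labels):
    """Estimate {(a, b): P(b | a)} from consecutive pairs in a labeled
    training sequence, i.e. transition counts normalized per source state."""
    pair_counts = Counter(zip(labels, labels[1:]))   # count each a -> b step
    from_counts = Counter(labels[:-1])               # count departures from a
    return {(a, b): c / from_counts[a] for (a, b), c in pair_counts.items()}
```

Applied to a sequence that alternates through the bland state, this yields exactly the bland-to-emotion and emotion-to-bland frequencies the training procedure calls for; transitions never observed simply get no entry (probability zero).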
Fig. 3. Block Diagram of System Model
[11] Minakshee Sarma, Kaustubh Bhattacharyya, "Facial expression based emotion detection: A Review", ADBU-Journal of Engineering Technology, ISSN: 2348-7305, Volume 4(1), 2016.
[12] Bogdan Vlasenko et al., "Tendencies regarding the effect of emotional intensity in inter corpus phoneme-level speech emotion modeling", IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP), 13-16 Sept. 2016.
[13] Rahul Singh Negi, Rekha Garg, "Face Recognition Using Hausdroff Distance as a Matching Algorithm", International Journal of Scientific Research, Vol. 4, No. 9, 2015.
[14] Antonio Fernández-Caballero et al., "Smart environment architecture for emotion detection and regulation", Journal of Biomedical Informatics, Volume 64, December 2016, pp. 55–73.
[15] Muhammad Hameed Siddiqi et al., "A Novel Maximum Entropy Markov Model for Human Facial Expression Recognition", PLOS ONE, http://dx.doi.org/10.1371/journal.pone.0162702, September 16, 2016.
[16] Ciprian Adrian Corneanu et al., "Survey on RGB, 3D, Thermal, and Multimodal Approaches for Facial Expression Recognition: History, Trends, and Affect-Related Applications", IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 38, Issue 8, Aug. 2016.
[17] Amol Patwardhan, Gerald Knapp, "Augmenting Supervised Emotion Recognition with Rule-Based Decision Model", https://arxiv.org/abs/1607.02660, 9 Jul 2016.
[18] Xiaoyu Ding, "Cascade of Tasks for facial expression analysis", Journal of Image and Vision Computing, Volume 51, July 2016, pages 36–48.