Discrimination of Eating Habits with a Wearable Bone Conduction Sound Recorder System
Masaki Shuzo, Guillaume Lopez, Tomoko Takashima, Shintaro Komori, Jean-Jacques Delaunay, Ichiro Yamada (The University of Tokyo, Tokyo, Japan); Shintaro Yanagimoto (The University of Tokyo Hospital, Tokyo, Japan); Seiji Tatsuta (Olympus Corporation, Tokyo, Japan)
[email protected]

Abstract— Continuous monitoring of eating habits could be useful in preventing lifestyle diseases such as metabolic syndrome. Conventional methods consist of self-reporting and of computing the mastication frequency from the myoelectric potential of the masseter muscle, both of which place a significant burden on the user. We developed a non-invasive, wearable sensing system that can record eating habits over a long period of time in daily life. Our sensing system is composed of a bone conduction microphone placed in the ear, from which sound data are collected to a portable IC recorder. Applying frequency spectrum analysis to the collected sound data, we could not only count mastications during eating, but also accurately differentiate eating, drinking, and speaking activities, which can be used to evaluate the regularity of meals. Moreover, by clustering sound spectra, we found it is possible to classify the types of foods eaten according to their texture.
I. INTRODUCTION
In recent years, with the increasing number of patients with lifestyle diseases, the Japanese government has been promoting health preservation in everyday life, focusing on exercise, rest, meals, smoking, and alcohol consumption. Nowadays, it is possible to monitor calorie consumption or sleep quality using wearable devices such as a digital camera or a wristwatch. On the other hand, although handwritten check sheets are available, no objective method has been established for monitoring eating habits. Healthcare specialists such as dentists and nutritionists identify the time of meals (regularity of meals), the mastication count during meals, and the types of foods eaten as medically meaningful factors of eating habits.
Our objectives are to develop an objective and non-invasive sensing system, as well as robust analysis methods, that can accurately monitor eating habits in daily life.

II. EATING HABITS MONITORING SYSTEM
We developed a prototype of a wearable eating-habits sensor using a bone conduction microphone (EM20 from TEMCO Corp.) for inside-body sound and a condenser microphone for environmental sound. The sensing element was placed in the user's ear, which reduces environmental noise and allowed us to record inside-body sound over a long period of time in daily life. In this study, for a feasibility study of eating-habits analysis using inside-body sound data, the microphones' output signals were recorded by an IC recorder (LS-10 from Olympus Corp.) as 16-bit signals sampled at 48 kHz (Fig. 1). In practice, we suppose that sound data recorded over a few days is sent to a data center and the analysis results are presented to the user or to the user's doctor, so that dietary modifications can be expected in accordance with regular and objective feedback about one's eating habits (Fig. 2). Using the above-described prototype, we collected inside-body sounds, such as mastication and conversation during meals, and analyzed these data to detect meal-related activity, count mastications, and classify food categories. Figure 3 shows the flow chart of our proposed analysis methods.
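As a concrete starting point, the recording described above can be loaded and cut into fixed-length segments for later analysis. The sketch below is a minimal illustration, not the authors' code: the function name, the mono-channel handling, and the default one-second segment length are assumptions; only the 16-bit PCM format and 48 kHz sampling rate come from the paper.

```python
# Hypothetical loader for the IC recorder's output: read a 16-bit PCM WAV
# file and split it into one-second sample arrays for analysis.
import wave
import numpy as np

def load_segments(path, seg_seconds=1.0):
    """Read a 16-bit PCM WAV file and return a list of fixed-length segments."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()          # expected: 48000 Hz
        n_channels = wf.getnchannels()
        raw = wf.readframes(wf.getnframes())
    samples = np.frombuffer(raw, dtype=np.int16).astype(np.float64)
    if n_channels > 1:                    # keep only the first (bone-conduction) channel
        samples = samples[::n_channels]
    samples /= 32768.0                    # normalize 16-bit integers to [-1, 1)
    seg_len = int(rate * seg_seconds)
    n_segs = len(samples) // seg_len
    return [samples[i * seg_len:(i + 1) * seg_len] for i in range(n_segs)]
```

Each returned segment can then be fed to the spectrum analysis steps described in the following section.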
Current techniques based on the myoelectric potential of the masseter muscle can count bites [1], but wearing the device places a significant burden on the user. Recently, analysis of inside-body sound spectra has attracted attention as a promising way to reduce this burden, both for differentiating biting and speaking activity [2] and for automatically classifying several foods [3]. However, the activity discrimination performance is strongly influenced by individual differences, and the classification methods are limited to specific foods.
978-1-4244-5335-1/09/$26.00 ©2009 IEEE
Figure 1. Overview of the eating habits sensing system: a condenser microphone and a bone conduction microphone connected to an IC recorder.
IEEE SENSORS 2009 Conference
Figure 2. Schematic of the eating habits monitoring system: inside-body sound in daily life is captured by the user's wearable sensor and sent to a data center, and analysis results (eating activity, mastication number, food types) are returned as guidance to the user and a healthcare specialist.

Figure 4. A sample of recorded sound data (top; amplitude [a.u.] over 60 s) and the corresponding discrimination result (bottom) for one participant: a) eating, b) drinking, c) speaking, d) nothing.
Figure 3. Proposed analysis flow of eating habits: bone-conducted sound data (eating, drinking, speaking) → 1. discrimination of meal time (distinguishing meal sounds from other sounds) → 2. counting mastication → 3. classification of food types (e.g., mastication number = 30, food type = hard).
First, each sound segment is classified as originating from mastication or not. Next, for each segment classified as a mastication sound, we count mastications and classify the food type. The following section describes each operation in detail.

III. EXPERIMENTS AND RESULTS

A. Extraction of the time of meals

For a feasibility study of activity discrimination from inside-body sound, we used the above-mentioned wearable sensor to collect, in a laboratory environment, the sounds caused by four types of actions: eating a hard food, eating a soft food, drinking water, and reading a book aloud. The participants were five males in their twenties. First, the inside-body sound is divided into one-second segments, and the power spectrum of each segment is calculated by FFT. Then, frequency-related features are extracted from these data. Following previous work [4-6], the selected features are the barycentric (spectral centroid) frequency, the maximum-peak frequency, the roll-off frequency, the total power (the integral of the spectrum from 200 to 3000 Hz), the ratios of the 1/3-octave band powers to the total power, and the linear predictive coding coefficients. After normalizing the features over three-second windows, we performed pattern recognition using the nearest neighbor algorithm.
Figure 5. Discrimination results for the five participants.
Figure 4 shows a sample of recorded sound data (top) and the corresponding discrimination result (bottom) for one participant. When we classified a participant's activities against his own database, we could differentiate all four activities: the average classification score over the five participants was above 90% accuracy for each activity (Fig. 5). However, when we classified one participant's activities against the other four participants' databases, to check the effect of individual differences, the average classification score dropped to about 70%. Although the classification scores for eating hard and soft foods remained high, the score for drinking water was very low (40%). This is probably because how people eat and drink differs among individuals. It is therefore necessary to select features that are less affected by individual differences, or to improve the classification algorithm.

B. Counting the number of mastications

We extracted the number of mastications from the sound data identified as eating. We counted mastications by applying a low-pass filter based on the person's masticatory cycle and then detecting peaks (Fig. 6). To validate this algorithm, we collected sound data during mastication: a participant masticated a food sample 30 times with a chewing cycle of 0.4-1.6 seconds. Gum was used as the target food because its texture does not change during mastication. First, the cut-off frequency of the filter was set to a constant value of 1.5 Hz, on the grounds that the usual chewing cycle is 0.6-0.8 seconds. As an example, the extracted mastication count for gum is shown in Fig. 7. The problem with this setting is that the difference between the actual and calculated counts grows for shorter or longer chewing cycles. Next, the cut-off frequency was set to a variable value determined from the power spectrum of the sound data at that moment: we chose the frequency of the maximum peak in the range of 0.5-5 Hz. As a result, we were able to count mastications stably regardless of the chewing rate.
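The variable cut-off counting procedure can be sketched as below. The paper does not specify its filter implementation, so this sketch makes several assumptions: the signal envelope is taken as a 100 Hz block-averaged rectified signal, the low-pass filter is realized by zeroing FFT bins, and the cut-off is placed at 1.5x the detected chewing frequency. Only the 0.5-5 Hz search band for the chewing peak comes from the paper.

```python
# Hypothetical mastication counter: estimate the chewing frequency from the
# envelope spectrum (0.5-5 Hz), low-pass filter the envelope, count peaks.
import numpy as np

def count_mastications(seg, rate=48000, env_rate=100):
    # 1. Envelope: rectify and block-average the sound down to env_rate samples/s.
    block = rate // env_rate
    n = (len(seg) // block) * block
    env = np.abs(seg[:n]).reshape(-1, block).mean(axis=1)
    env -= env.mean()
    # 2. Variable cut-off: frequency of the strongest envelope peak in 0.5-5 Hz.
    spec = np.abs(np.fft.rfft(env)) ** 2
    freqs = np.fft.rfftfreq(len(env), d=1.0 / env_rate)
    band = (freqs >= 0.5) & (freqs <= 5.0)
    cutoff = freqs[band][np.argmax(spec[band])]
    # 3. FFT low-pass slightly above the chewing frequency, then count maxima.
    f = np.fft.rfft(env)
    f[freqs > 1.5 * cutoff] = 0
    smooth = np.fft.irfft(f, n=len(env))
    peaks = (smooth[1:-1] > smooth[:-2]) & (smooth[1:-1] > smooth[2:]) & (smooth[1:-1] > 0)
    return int(np.sum(peaks))
```

Because the cut-off tracks the envelope's own dominant frequency, the same code handles both slow (1.6 s) and fast (0.4 s) chewing cycles, which is the point of the variable cut-off over the fixed 1.5 Hz setting.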
C. Classification of food types

For food type classification, we focused on food texture, which is defined by mechanical properties such as hardness, cohesiveness (brittleness, chewiness, and gumminess), viscosity, springiness, and adhesiveness [7]. In the early stage of mastication, hardness and cohesiveness are the properties most likely to change. For a feasibility study of food type classification by texture, we carried out principal component analysis (PCA) on the mastication sounds. As food samples, we used three groups of food texture: hard-crumbly (rice cracker and peanut), highly springy (fish sausage and konjac food), and soft (rice and bread roll). Figure 8 shows the food textures of these samples [8]. A participant masticated 3 g of each sample at a mastication frequency of 1 Hz. Figure 9 shows the results of clustering the food types by texture. We found that the first principal component reflected the effect of the sound, so we could separate the hard-crumbly group from the others in the feature space. Through clustering analysis of the power distribution in multiple frequency bands, we were able to identify parameters related to the hardness of the food being chewed, and could also detect eating transition states in which hard food becomes softer and is then swallowed (Fig. 10). On the other hand, although the data points of the other two groups formed clusters, we could not clearly separate these groups in the feature space. This may be because the predefined food groups contained other types of textures.
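The PCA projection step can be sketched in a few lines. This is a generic PCA via SVD, not the authors' exact pipeline: the input is assumed to be a matrix with one row of band-power features per chew, and the choice of two components is illustrative.

```python
# Generic PCA sketch: project per-chew band-power feature vectors onto the
# top principal components, so clusters can be inspected in a 2-D feature space.
import numpy as np

def pca_project(X, n_components=2):
    """Return (scores, axes): rows of X projected onto the top components."""
    Xc = X - X.mean(axis=0)                        # center the feature matrix
    # SVD of the centered data yields the principal axes in Vt's rows.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T, Vt[:n_components]
```

If, as in the paper, the first principal component is dominated by a sound-related effect, hard-crumbly samples and the rest separate along the first score axis.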
Figure 6. Mastication number counting flow: low-pass filtering with a constant or a variable cut-off frequency, followed by calculation of the mastication number.

Figure 7. Calculated mastication number using a constant or a variable cut-off frequency in the low-pass filter, plotted against the chewing cycle (0.4-1.6 s; actual number = 30).

Figure 8. Food textures of each sample (hardness, brittleness, chewiness, gumminess, springiness, adhesiveness, viscosity), grouped into hard-crumbly, high springiness, soft, and others: sample 1: rice cracker; sample 2: peanut; sample 3: fish sausage; sample 4: konjac food; sample 5: rice; sample 6: bread roll.
IV. CONCLUSION

We developed a non-invasive wearable sensing system that can record eating habits in daily life. Applying frequency spectrum analysis to the collected sound data, we could extract meal times, count mastications during eating, and classify the types of foods eaten according to their texture.

ACKNOWLEDGMENT

This research was supported by the research fund for Core Research for Evolutional Science and Technology (CREST), granted by the Japan Science and Technology Agency (JST).

REFERENCES
Figure 9. Food type clustering result regarding food texture.

[1] K. Kohyama, L. Mioche, and P. Bourdio, “Influence of age and dental status on chewing behaviour studied by EMG recordings during consumption of various food samples,” Gerodontology, vol. 20(1), pp. 15-23, 2003.
[2] H. Mizuno, H. Nagai, K. Sasaki, H. Hosaka, C. Sugimoto, K. Khalil, and S. Tatsuta, “Wearable sensor system for human behavior recognition,” Proceedings of Transducers & Eurosensors ’07, pp. 435-438, 2007.
[3] O. Amft, M. Stager, P. Lukowicz, and G. Troster, “Analysis of chewing sounds for dietary monitoring,” Proceedings of the 7th International Conference on Ubiquitous Computing, pp. 56-72, 2005.
[4] T. Kitahara, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, “Instrument identification in polyphonic music: feature weighting to minimize influence of sound overlaps,” EURASIP Journal on Advances in Signal Processing, vol. 2007, article 51979, 2007.
[5] D. Li, I. K. Sethi, N. Dimitrova, and T. McGee, “Classification of general audio data for content-based retrieval,” Pattern Recognition Letters, vol. 22(5), pp. 533-544, 2001.
[6] S. Z. Li, “Content-based audio classification and retrieval using the nearest feature line method,” IEEE Transactions on Speech and Audio Processing, vol. 8(5), pp. 619-625, 2000.
[7] A. S. Szczesniak, “Classification of textural characteristics,” Journal of Food Science, vol. 28(4), pp. 385-389, 1963.
[8] S. Yoshikawa and M. Okabe, “Texture profile pattern of foods by profile terms and texturometer,” Report of National Food Research Institute, vol. 33, pp. 123-129, 1978. (in Japanese)
Figure 10. Detection of eating transition states: hard food becoming softer and then being swallowed.