
Improved estimation of energy expenditure by artificial neural network modeling

Dean Charles Hay, Akinobu Wakayama, Ken Sakamura, and Senshi Fukashiro

Abstract: Estimation of energy expenditure in daily living conditions can be a tool for clinical assessment of health status, as well as a self-measure of lifestyle and general activity levels. Criterion measures are either prohibitively expensive or restricted to laboratory settings. Portable devices (heart rate monitors, pedometers) have gained recent popularity, but the accuracy of their prediction equations remains questionable. This study applied an artificial neural network modeling approach to the problem of estimating energy expenditure with different dynamic inputs (accelerometry, heart rate above resting (HRar), and electromyography (EMG)). Nine feed-forward back-propagation models were trained, with the goal of minimizing the mean squared error (MSE) of the training datasets. Model 1 (accelerometry only) and model 2 (HRar only) performed poorly and had significantly greater MSE than all other models (p < 0.001). Model 3 (combined accelerometry and HRar) had overall performance similar to that of the EMG models. Validation of all models was performed by simulating untrained datasets. The MSE of all models increased when tested with validation data. While models 1 and 2 again performed poorly, the model 3 MSE was lower than that of all but 2 EMG models. Squared correlation coefficients of measured and predicted energy expenditure for models 3 to 9 ranged from 0.745 to 0.817. Analysis of mean error within specific movement categories indicates that EMG models may be better at predicting higher-intensity energy expenditure, but combined accelerometry and HRar provides an economical solution with sufficient accuracy.

Key words: ANN, exercise, EMG, accelerometry, heart rate, aerobic.

Received 12 May 2008. Accepted 23 September 2008. Published on the NRC Research Press Web site at apnm.nrc.ca on 5 December 2008.

D.C. Hay.¹ Faculty of Education, Nipissing University, North Bay, ON P1B 8L7, Canada.
A. Wakayama. Tokyo Women’s College of Physical Education, Kunitachi-shi, Tokyo, Japan.
K. Sakamura. Ubiquitous Networking Laboratory, University of Tokyo, Tokyo, Japan.
S. Fukashiro. Department of Life Sciences, University of Tokyo, Tokyo, Japan.

¹ Corresponding author (e-mail: [email protected]).

Appl. Physiol. Nutr. Metab. 33: 1213–1222 (2008)

doi:10.1139/H08-117

© 2008 NRC Canada

Introduction

Accurate estimation of energy expenditure has been a goal of researchers and practitioners in disciplines including epidemiology, physical education, and exercise science. Gold standards, such as direct calorimetry (Levine 2005), gas analysis (Kalman 2004; Wasserman et al. 2005), and doubly labeled water (Schoeller 2002; Speakman 1997), are prohibitively expensive and (or) impractical for daily living situations. Prediction equations based on anthropometric parameters (Goran 2005; Henry 2005), activity logs (Riley et al. 2005; Trivel et al. 2006), heart rate (Londeree and Ames 1976; Keytel et al. 2005), accelerometry (Crouter et al. 2006a; Karabulut et al. 2005), and electromyography (EMG) (deVries et al. 1976; Seliger et al. 1980) have all been used, with varying degrees of accuracy, as alternative means to estimate activity levels and daily energy expenditure.

The correlation between heart rate and energy expenditure has been well researched, but the accuracy of this methodology is not optimal, especially in populations of obese individuals, those with heart disease, and the very young (Miller et al. 1993; Whaley et al. 1992). Using heart rate above resting (HRar) (Whaley 2006) has been shown to reduce some of the error in predicting oxygen consumption during exercise, especially at lower intensities. Accelerometry has also been used as a convenient and inexpensive estimate of human activity (Melanson and Freedson 1995; Pambianco et al. 1990), but the squared correlation between actual and predicted energy expenditure has been reported to be as low as 0.17 during experimental conditions (Crouter et al. 2006a). Brage et al. (2004) demonstrated that combined heart rate and accelerometry can improve energy expenditure estimation, but their use of flex-point modeling to calculate the contribution of accelerometer data during walking and running may not sufficiently account for nonlinear interactions between variables. Rothney et al. (2007) demonstrated the utility of artificial neural network (ANN) modeling to estimate energy expenditure from nonintegrated acceleration signals. However, it remains to be seen whether the use of other biosignal inputs (EMG, for example) can improve energy expenditure estimation through ANN modeling. The current study was conducted to explore whether the addition of EMG data to accelerometer and heart rate data can improve ANN modeling of energy expenditure in a range of movements and exercise intensities.

Materials and methods

Subjects

A total of 26 subjects (24 ethnic Japanese, 2 ethnic Western European), with no contraindications to performing physical exercise, gave written consent to participate in the study, following a verbal explanation of the experimental protocol and completion of the Physical Activity Health Questionnaire. The study design was approved by the University of Tokyo Ethics Committee. The raw data from 7 trials had unacceptable noise in one or more signals, necessitating exclusion of those subjects from further analyses. Anthropometric data for the remaining 19 subjects (6 female, 13 male) were used as common variables for all subsequent ANNs (Table 1). Anthropometric data for all 26 subjects are listed as supplementary data, with asterisks denoting the subjects whose data were only used for the accelerometer and heart rate ANN that was developed for public access (see Fig. A1 and Table A1 in supplementary data²).

Table 1. Subjects’ physical characteristics.

Cohort                              Men, n = 13            Women, n = 6
Age, y (range)                      34.5±16 (19–68)        34.3±9 (20–46)
Height, cm (range)                  172.5±6.8 (162–184)    158.8±5.1 (153–167)
Mass, kg (range)                    68.1±11.9 (52–88)      53.5±7.8 (42–63)
Body fat, % (range)                 17.3±3.5 (12–22)       23.8±6.6 (17–32)
Body mass index, kg·m⁻² (range)     22.7±2.5 (18–26)       21.2±3.3 (18–25)

Instrumentation

Breath-by-breath oxygen consumption and carbon dioxide expiration were monitored with an Aeromonitor AE300S (Minato Medical, Japan). Facial mask fitting and machine calibration were performed according to the manufacturer’s instructions. A wireless 3-lead electrocardiogram (ECG) Life Scope 6 (Nihon Koden, Japan) was electronically synchronized with the gas analyzer to provide real-time heart rate data. As the recording rate for the Aeromonitor and ECG was dependent upon the rate of respiration, the data sampling rate varied with exercise intensity, between 1 Hz (maximal effort) and 0.1 Hz (resting, fit individual). An Actiheart sensor (Mini Mitter) was attached to the sternum to also record heart rate (redundant check) and activity counts per minute (vertical axis changes in acceleration), with data recorded to internal memory once every 15 s. Two separate accelerometers (WZ130, Seiko S-Yard, Japan; and FB-720, Tanita, Japan) were attached to the back of the neck and left upper arm; these were used to confirm the activity count data from the Actiheart sensor. After each of the activities, the activity counts from the backup accelerometers were recorded. The data from the 3 accelerometers were compared postexperiment to ensure that there were no corrupt recording periods. Because the calibrations of the accelerometers were not equal, activity counts for each of the continuous activities varied slightly. However, as there were no significant differences between the accelerometers in the relative changes in activity counts between activities, the Actiheart accelerometer counts were assumed to be reliable and valid. Bipolar EMG electrodes (Ag–AgCl, Nihon Koden) were placed on the right arm biceps and triceps muscles and right leg soleus and vastus medialis muscles, and a ground electrode was situated on the right tibia.
One subject affixed with all recording sensors is shown in Fig. 1. Skin preparation was conducted according to standard protocols. EMG data were recorded at 1000 Hz (PowerLab, ADInstruments, Australia) and stored on computer for postprocessing. Body fat was evaluated by bioelectrical impedance (InBody 3.2, Biospace, Korea).

² Supplementary data for this article are available on the journal Web site (http://apmn.nrc.ca) or may be purchased from the Depository of Unpublished Data, Document Delivery, CISTI, National Research Council Canada, Building M-55, 1200 Montreal Road, Ottawa, ON K1A 0R6, Canada. DUD 3843. For more information on obtaining material refer to http://cisti-icist.nrc-cnrc.gc.ca/cms/unpub_e.html.

Data processing

Respiratory data, ECG, and activity counts were resampled to 1 Hz through spline interpolation. A sampling rate of 1 Hz was selected on the basis that, during higher-intensity exercise, subject ventilation frequency approached or equaled that value. Furthermore, developing ANNs with 1 Hz input data sampling was hypothesized to better account for transient changes in oxygen kinetics (Cooper and Storer 2001; Wasserman et al. 2005). After interpolation, respiratory data were smoothed with a 3-point moving average to attenuate the mismatch of inspired and expired volumes between breaths (Weisman and Zeballos 2002). Resting heart rate was calculated as the lowest 15 s mean heart rate recorded during 10 min of supine resting. Respiratory exchange ratio and oxygen consumption were used to calculate energy expenditure in kilocalories per hour (kcal·h⁻¹), according to Wasserman et al. (2005). Energy expenditure was used as the training target (dependent variable) for all subsequent ANNs (Table 2), while HRar and activity counts were used as model inputs (independent variables).

EMG data were filtered using a 0-lag fourth-order Butterworth bandpass filter (20–350 Hz), and then full-wave rectified. For the processed EMG signal, running mean and peak values were synchronized and recorded at 1 Hz with the other signal inputs. Model 4 (mean) and model 7 (mean and peak) used the processed signals for each of the muscles without normalization. For model 5 (mean) and model 8 (mean and peak), normalization was performed by dividing the processed signal by the average amplitude of submaximal contractions for each respective muscle group. Model 6 (mean) and model 9 (mean and peak) were normalized by predicted maximal muscle activity. During postprocessing and after filtering, each subject’s EMG data were sorted in order of magnitude. A 5 s average of the signal within the maximal amplitude range (sampled data, 5 to 10 s) for each muscle was used as the assumed maximal voluntary contraction EMG signal intensity.

Table 2. Input parameters used for artificial neural networks.

Column headings: Model (1 to 9); Anthropometric data; Activity counts; Heart rate above resting; Raw EMG data; Submax normalization; Maximal normalization.

EMG entries: Mean (models 4, 5, and 6); Peak (models 7, 8, and 9).

Note: All models included personalized anthropometric data (Table 1) and one or more dynamic signal inputs. Mean and peak electromyography (EMG) data were calculated from 1 s intervals and time synchronized with accelerometer and heart rate above resting data. Models 4 and 7 used raw EMG values; models 5 and 8 used EMG normalized by averaged submaximal muscle contractions; and models 6 and 9 used EMG normalized by predicted maximal EMG activity.

Test activities

Subjects refrained from drinking caffeinated beverages and eating food for 2 and 3 h, respectively, prior to arriving at the laboratory.
Subjects wore light exercise clothing and sneakers or running shoes. Each experiment was divided into 2 parts, separated by a rest period of 5 min, during which the mask was removed and the gas analyzer was recalibrated. To begin the experiment, each subject lay supine on a massage table and was covered with a blanket for 10 min. The lights of the laboratory were turned off and the subject was instructed to remain still and not to try to stay awake if feeling drowsy. Following this, the subject was asked to sit up and was given a few moments to return to an alert state. Next, the subject was asked to assume a series of static postures and perform a number of static and dynamic muscular contractions of the upper and lower extremities, lasting approximately 10 min. The last task before recalibration entailed standing up and sitting down, from floor level, every 30 s for 5 min.

After recalibration of the gas analyzer, the subject performed 3 submaximal muscular endurance-type exercises: pushups with arms at shoulder width (women were allowed to use a modified knee posture), upright barbell row with a load of 25% body mass, and barbell squats with a load of 50% body mass. Each task was performed for a period of between 30 and 60 s, at a cadence of 15 repetitions·min⁻¹. The final task of the experiment was a graded aerobic treadmill test. Walking speed began at 2 km·h⁻¹ and increased in 2 km·h⁻¹ increments until a speed of 14 km·h⁻¹ was attained, the subject terminated the test, or the researcher terminated the test. Each stage lasted 3 min, and stages were interspersed with 30 s rest periods. A 2 min cooldown at 4 km·h⁻¹, followed by 3 min of quiet sitting, completed the data collection phase. The entire data collection period was recorded on videotape, and a movement log, consisting of 14 discrete classifications, was developed. The classifications were used to quantitatively evaluate model accuracy for each of the movements performed in the experiment (Fig. 2).

Artificial neural networks

Nine models were developed with differing input parameters (Table 2). The ANN structure was constant for each model, with a 20-neuron hidden layer (tan-sigmoid function) and a single output neuron (linear transfer function).
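The network structure described above can be illustrated with a minimal sketch of the forward pass: a 20-neuron hidden layer with a tan-sigmoid activation (mathematically the hyperbolic tangent) feeding a single linear output neuron. The study itself used the Matlab neural networks toolbox; the Python function names, the random initial weights, and the example input vector below are assumptions made only for illustration, and no training is performed.

```python
import math
import random


def make_network(n_inputs, n_hidden=20, seed=0):
    """Create untrained weights for one hidden layer and one output neuron.

    The uniform(-0.5, 0.5) initialisation is an assumption for illustration.
    """
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                     for _ in range(n_hidden)],
        "b_hidden": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b_out": 0.0,
    }


def predict(net, inputs):
    """Forward pass: tanh hidden layer, then a linear output neuron."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(net["w_hidden"], net["b_hidden"])
    ]
    return sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"]


# Hypothetical, already-scaled inputs standing in for anthropometric values,
# activity counts, and heart rate above resting (the model 3 input types).
example_inputs = [0.3, 0.6, 0.5, 0.2, 0.1, 0.4]
net = make_network(n_inputs=len(example_inputs))
estimate = predict(net, example_inputs)  # not meaningful until the net is trained
```

Because the output neuron is linear, the estimate is not bounded by the activation function, which suits a target such as energy expenditure that spans rest to maximal effort.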
Twenty-five and 30-neuron networks were simulated during pretesting, without noticeable improvement in model accuracy; they were therefore discarded in favour of the 20-neuron network. For a more detailed explanation of ANN architecture, see Hagan et al. (1996) and Rothney et al. (2007).

For each model, the entire dataset was divided into 19 training sets, where the data from 18 subjects were used to train the ANN, and the data from the remaining subject were used to validate the model. This method of leave-one-out-and-repeat ensures that each subject’s data are used at least once as a validation sample for each model. A limitation of nonlinear ANNs is that the solution cannot be proven to be the global optimal solution. The reason for this, in short, is that the solution space has local minima, upon which the model may converge, rather than the global minimum (Hagan et al. 1996). To increase the model’s ability to approximate the optimal solution, multiple training sessions with the same dataset should be performed. In the current study, each training set (n = 19) was trained 12 times for each of the 9 model conditions (total, 2052 training sessions). The simulation with the lowest MSE for each of the training sets in each model condition was used for subsequent data analyses (n = 171).

The Levenberg–Marquardt optimization algorithm was used to minimize MSE. This algorithm, which adjusts the model’s neuron weights and biases by feed-forward back-propagation, has been demonstrated to converge on a solution in fewer iterations than standard descent algorithms, thereby reducing training times (Hagan and Menhaj 1994). In this experiment, each of the training sessions was continued until a maximum of 500 iterations was completed or the gradient descent value at a specific iteration fell below 1.5–5. Gradient descent is a method that reduces the time required to approach an optimal solution (in this case, the minimum MSE of the difference between the measured and the estimated energy expenditure). In brief, the gradient value is an indication of the rate of improvement in the model estimation between training steps. If the gradient is large, the model error is being reduced at a (relatively) rapid rate with each iteration. Conversely, when the gradient approaches 0, it is an indication that an optimal solution has been approximated. All model development was performed using Matlab and the Matlab neural networks toolbox (MathWorks).

Statistical analysis

The MSEs of the training data models were compared with a 1-way analysis of variance (ANOVA) with post hoc Tukey’s HSD comparison between models. Model validation was performed with data that had not been used for training the models. The error between predicted and actual energy expenditure was calculated for each model, and these matrices were compared with an ANOVA with post hoc Tukey’s HSD comparison between models. A 2-way ANOVA (model type × movement type) was used to evaluate the relative performance of each model within each of the classified movement patterns. Linear regression equations were also calculated for each model, and the resulting relations between predicted and actual energy expenditure were displayed in scatter plots.

Fig. 1. One subject with all recording sensors in place. EMG electrodes were placed on the biceps and triceps brachii, soleus, and vastus medialis of the right extremities. A ground electrode was positioned on the right tibia. A 2-lead Actiheart sensor (accelerometer and heart rate) and a 3-lead electrocardiogram were placed on the sternum and ribcage. Two additional accelerometers were located on the subject’s left upper arm and back of neck.

Results

Training dataset

The MSEs of the training data model estimates are presented in Fig. 3. Each column is composed of the mean MSE and 95% confidence interval (CI) of 19 simulations (the best simulation for each of the subject datasets). Models 1 and 2 performed poorly and had a significantly higher MSE than all other models (p < 0.001). The model 3 MSE was higher than that of all EMG models, but only significantly different from that of model 9 (p < 0.01). The absolute best single simulation of all dataset trials produced an MSE of 1514 kcal²·h⁻² for model 9 (not shown).

Validation dataset

The best simulated ANNs for each dataset and each model (19 × 9) were validated using data that had not been used to train the models. Figure 4 shows the predicted energy expenditure of 4 models against the actual energy expenditure of 1 validation dataset (1 male subject). It is clear that models 1 and 2 do a poor job of predicting energy expenditure. Conversely, model 3 closely approximates the actual energy expenditure throughout the range of exercise intensities. Model 8 has a better approximation at the highest exercise intensity, but does not appear noticeably better than model 3 throughout most of the lower-intensity activities. An ANOVA of the prediction error found significant differences between all models (p < 0.01), except between models 6 and 8, and between models 5 and 9. Model error bias ranged from 32 kcal·h⁻¹ (model 1) to –24 kcal·h⁻¹ (model 2). Positive and negative values indicate model underestimation and overestimation, respectively. Model 3 (–1.8 kcal·h⁻¹), model 6 (0.1 kcal·h⁻¹), and model 8 (–0.1 kcal·h⁻¹) showed the lowest mean bias (data not shown).

Scatter plots of the relation between predicted and actual energy expenditure for all models are displayed in Fig. 5. The model with the highest correlation coefficient was model 8 (r² = 0.817). All models with EMG inputs had squared correlation coefficients greater than 0.774, with the exception of model 9 (r² = 0.745). The performance of model 3 was comparable to that of the best EMG input models (r² = 0.806). To assess each model’s accuracy during different movements, a 2-way ANOVA was performed (model type × movement type), using the video classification log to differentiate the various movements performed during the experiment (Fig. 2). Models 1 and 2 had larger error fluctuations between movement types, with especially large bias evident at 14 km·h⁻¹ running. Model 8 had the lowest mean error across activity types, and was significantly better than model 3 for estimating the energy expenditure of 14 km·h⁻¹ running, pushup, and squat movements (p < 0.01). The other 5 models performed slightly less well overall than model 8, but better than models 1 and 2 (not shown).

Fig. 2. Mean error of artificial neural network (ANN) models based on classified movements (+95% confidence interval (CI)). All movement patterns were logged in synchrony with the dynamic inputs (i.e., activity counts, heart rate above resting (HRar), and electromyography (EMG)) and used to sort the validation results into 14 discrete movement categories. Transient refers to noncontinuous movements (e.g., standing up and sitting down). Model 8 had significantly reduced error, compared with model 3, for the 14 km·h⁻¹ running, pushup, and squat classifications (p < 0.01).

Fig. 3. Mean squared error (MSE) is a measure of the model’s overall ability to predict energy expenditure from independent variables. Means of 19 training datasets and 95% CI are shown. a, Significantly different from all other models (p < 0.001); b, significant difference between models 3 and 9 (p < 0.01).

Discussion

ANN training performance

ANNs have been shown to be very good at minimizing prediction error with a sufficiently large training sample (Hagan et al. 1996). In this experiment, 9 models were developed in which the input data differed (independent variables), but where neuron structures and target data (experimentally measured energy expenditure) remained the same. More than 17 h of recorded data were used to train the networks, with 62 025 pairs of input and target samples.
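The training protocol behind these figures (leave one subject out, repeat each training set 12 times, keep the run with the lowest MSE) can be sketched as follows. This is an illustration only: `train_once` is a hypothetical stand-in that fits a line with a randomly chosen slope, not the Levenberg–Marquardt training used in the study, and selecting the best run by training MSE is an assumption about how "lowest MSE" was applied.

```python
import random


def mse(predicted, measured):
    """Mean squared error between two equal-length sequences."""
    return sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(measured)


def train_once(xs, ys, seed):
    """Hypothetical stand-in for one training session.

    A random slope mimics a random weight initialisation; the intercept is
    then fitted in closed form. Different seeds give different solutions.
    """
    rng = random.Random(seed)
    slope = rng.uniform(0.0, 2.0)
    intercept = sum(y - slope * x for x, y in zip(xs, ys)) / len(ys)
    return lambda x: slope * x + intercept


def leave_one_subject_out(data, n_restarts=12):
    """data maps subject id -> (inputs, targets).

    Returns subject id -> (best training MSE, validation MSE of that run).
    """
    results = {}
    for held_out in data:
        train_x = [x for s, (xs, _) in data.items() if s != held_out for x in xs]
        train_y = [y for s, (_, ys) in data.items() if s != held_out for y in ys]
        best_model, best_mse = None, float("inf")
        for restart in range(n_restarts):
            model = train_once(train_x, train_y, seed=restart)
            err = mse([model(x) for x in train_x], train_y)
            if err < best_mse:  # keep the restart with the lowest training MSE
                best_model, best_mse = model, err
        val_x, val_y = data[held_out]
        val_mse = mse([best_model(x) for x in val_x], val_y)
        results[held_out] = (best_mse, val_mse)
    return results
```

With 19 subjects, 12 restarts, and 9 input configurations, this loop structure accounts for the 19 × 12 × 9 = 2052 training sessions and the 19 × 9 = 171 retained simulations reported in the methods.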


Fig. 4. Time course of actual energy expenditure (gray line) and predicted energy expenditure (black line) for 4 different ANN models. The validation data are from 1 male subject (age, 31 years; height, 180 cm; mass, 76.1 kg; body fat, 17.3%).


Fig. 5. Regression plots of predicted and actual energy expenditure from validation data (n = 62 025). Top row (from left): models 1, 2, and 3; middle row (from left): models 4, 5, and 6; bottom row (from left): models 7, 8, and 9. Line of best linear fit (solid line) and line of identity (dashed line) are also shown.

As is shown in Fig. 3, the MSE was greatest for models 1 and 2, indicating that activity counts and HRar alone are poorer predictors of energy expenditure. However, combining activity counts and HRar (model 3) improves prediction accuracy significantly. The addition of EMG further enhances model accuracy, but the magnitude of the improvement is marginal. Of the 6 EMG input models, the model with mean and peak values normalized by maximal EMG activity had the best performance (model 9), and was significantly better than model 3 (p < 0.01).

ANN validation performance

The utility of a trained ANN lies in its ability to predict energy expenditure when presented with new input data. When each of the models was validated with independent datasets, performance was degraded. The scatter plots for models 1 and 2 (Fig. 5) show that model error was significantly greater than that in all other models. Model 3 performed well, despite a small but significant increase in model error, compared with model 8 (p < 0.01). Model 9, which had the best training performance, failed to suitably predict the entire validation data of 1 subject, as can be seen by the data cluster in the negative y-axis region. Model 8, on the other hand, performed well with both the training and validation datasets and, in fact, had the highest correlation coefficient (r = 0.904) and least total prediction bias over the range of exercise intensities and movement classifications (Figs. 2 and 5). These results indicate that using the running mean and peak EMG values is better than using only the running mean, and that normalization of the EMG signal by submaximal average values is more robust than either the non-normalized (models 4 and 7) or maximal-normalized EMG signals.

It was found that, during testing, several subjects, especially the nonactive and elderly, had difficulty eliciting true maximal voluntary contractions. The estimated maximal EMG normalization method used in models 6 and 9 divided the raw signal by the largest 5 s sample of all EMG data within each trial. However, in the case of 1 male subject, the range of movements was limited and, therefore, the maximum recorded EMG activity was low. That subject’s data were responsible for the outlier cluster in model 9 (Fig. 5), and highlight the weakness of using estimated maximal EMG activity as an input parameter. Model 8, which used mean and peak EMG data that were normalized by submaximal muscle activity, produced a more robust solution. As it is easier for untrained individuals to perform submaximal muscle contractions, this normalization method may prove to be more practical in a variety of applications.

Accelerometry model performance

Several papers have been published and an increasing number of commercial products have been developed to estimate energy expenditure from accelerometry and (or) heart rate. Crouter et al. (2006a) found that single prediction equations of accelerometer data underestimated the energy expenditure of vigorous activity, and Brage et al. (2003) demonstrated that prediction errors of energy expenditure from a pedometer-type accelerometer increased with running speed (up to 48% at 16 km·h⁻¹). Crouter et al. (2006b) developed a 2-regression-equation model to improve energy expenditure estimation from a uniaxial accelerometer, using a lifestyle and leisure equation for data above a coefficient of variation cutoff, while data below the threshold were evaluated by a walk and run equation. This method was reported to be within 0.75 metabolic equivalents (1 metabolic equivalent = 3.5 mL O2·kg⁻¹·min⁻¹ consumed) for each of the measured activities, but an examination of their study’s Bland–Altman plots reveals that the high-intensity exercise (>9 metabolic equivalents) prediction error was consistently near or above the +95% CI (Crouter et al. 2006b). In the current study, despite using a nonlinear modeling approach, high-intensity activity was also consistently underpredicted (Fig. 5). While able to provide a reasonably good overall estimation of total energy expenditure, models based on uniaxial accelerometry appear to consistently underpredict high-intensity exercise, and do not track transient changes in energy expenditure well. It must be noted, however, that the native sampling rate of the Actiheart used in this experiment was low (4 samples·min⁻¹), and these data were interpolated to 1 Hz for ANN input. It is possible that collecting accelerometry data at a greater frequency could improve model performance.
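The resampling limitation noted above can be made concrete with a small sketch that stretches 15 s activity-count epochs onto a 1 Hz grid. The study used spline interpolation; linear interpolation is swapped in here purely to keep the example short, and the function name and the assumption of integer, ascending timestamps in seconds are illustrative.

```python
def resample_to_1hz(times_s, values):
    """Linearly interpolate samples onto a 1 s grid.

    Assumes at least two samples with integer, strictly ascending timestamps
    (seconds). Linear interpolation is a simplification of the spline
    interpolation described in the methods.
    """
    start, stop = int(times_s[0]), int(times_s[-1])
    resampled = []
    k = 0  # index of the left-hand sample of the current interval
    for t in range(start, stop + 1):
        # Advance to the interval that contains t (stay on the last one at the end).
        while k < len(times_s) - 2 and times_s[k + 1] <= t:
            k += 1
        t0, t1 = times_s[k], times_s[k + 1]
        frac = (t - t0) / (t1 - t0)
        resampled.append(values[k] + frac * (values[k + 1] - values[k]))
    return resampled


# Three 15 s epochs become 31 one-second samples.
counts_1hz = resample_to_1hz([0, 15, 30], [0.0, 30.0, 0.0])
```

Of the 31 output samples, only 3 are measured values; the rest are inferred, which is why a burst of movement shorter than one epoch cannot be recovered regardless of the interpolation method.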

Heart rate model performance

Commercial heart rate monitors have become a commonly used tool for predicting physical activity in a variety of settings (Dauncey and James 1979; Ekelund et al. 2001; Rennie et al. 2001). However, the accuracy of generalized models is limited because of the nonlinear relationship between heart rate and energy expenditure and high interindividual variability (Levine 2005). Keytel et al. (2005) developed a linear regression equation, using anthropometric and heart rate inputs, to predict energy expenditure during cycle ergometry and treadmill testing. While the model had a good fit to the experimental data (r² = 0.734), the squared correlation coefficient generated from a validation group was lower (r² = 0.593). In the current study, which used a wider range of movement activities and exercise intensities, ANN modeling of HRar (model 2) resulted in a very high correlation for the training datasets (r² = 0.874; not shown). However, the correlation obtained when the trained model was generalized to a validation set was much lower (r² = 0.533; Fig. 5). Model underestimation at low exercise intensity and overestimation at high exercise intensity indicate that using HRar as the only dynamic independent variable is less than optimal.

Combined heart rate and accelerometry model performance

Several studies have recently examined the benefits of combining accelerometry and heart rate data to predict energy expenditure (Brage et al. 2004, 2005; Plasqui and Westerterp 2005). A model to predict energy expenditure that included a ratio between heart rate and activity counts was found to be significantly correlated (r² = 0.70) with energy expenditure in a non-cross-validated sample of 25 individuals (Plasqui and Westerterp 2005). Prior to that, Strath et al. (2001) showed that combining heart rate and accelerometry significantly improved estimation of energy expenditure during arm and leg work (r² = 0.81), compared with heart rate or accelerometry alone. Brage et al. (2004) developed a branched equation model, using heart rate and accelerometer counts·min⁻¹ to predict energy expenditure. Their best fit model of 12 male subjects produced a squared correlation coefficient of 0.78. However, as their reported findings are of total energy expenditure during a 22 h calorimetry session, the accuracy of their model in predicting transient changes in energy expenditure is unknown. During low- to moderate-intensity physical activity, combined heart rate and accelerometry was found to have a mean bias of 6% (Thompson et al. 2006). In the current study, model 3, which combined heart rate and activity counts in much the same way as the preceding studies, was an excellent predictor of energy expenditure (r² = 0.910; not shown). However, in this study, unlike many previous studies, cross-validation was conducted to determine the ability of each ANN model to predict energy expenditure from new data. With the validation sample, model 3 performed well (r² = 0.806), but was significantly less accurate than with the training sample. These results show that the initial prediction accuracy of the ANN modeling approach is as high as or higher than that of existing linear models, and that subsequent validation results are also relatively good.

Benefits of EMG modeling

EMG, perhaps because of its inherent sensitivity to noise artifacts, cost, and cumbersome equipment requirements, has largely gone unused as a tool for predicting energy expenditure in free-living conditions. Studies that have sought to correlate EMG to energy expenditure have mostly been limited to fixed movements in laboratory conditions (deVries et al. 1976; Seliger et al. 1980). However, as technology improves, true portability will become a reality, and so it is important to understand whether EMG can significantly improve the prediction of energy expenditure and exercise intensity. From the current study, the answer is equivocal. From the training data, model 9 MSE was significantly lower than model 3 (Fig. 3), and total model error of the validation dataset for model 8 was significantly lower than model 3 (Fig. 5). Furthermore, significant reductions in error for specific movement patterns were possible through the addition of EMG (Fig. 2). Yet, the magnitude of the improvements is perhaps not large enough to warrant the additional cost of EMG. In the current experiment, only 4 unilateral muscles were monitored, and the input parameters were limited to the mean and peak values of 1 s intervals. It is possible that using different signal components at higher frequencies and bilateral electrodes would provide a better model solution. The practical barriers to widespread use of surface EMG, apart from cost, include inconsistent electrode conductivity and placement over target muscles, and electrode and amplifier bulk (with associated discomfort to the wearer). It is not within the scope of this paper to propose a compact, durable, and accurate EMG system for practical living conditions, but continuing advancements in algorithm development, sensor miniaturization, and increased battery life should make a wearable system a reality in the near future.
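The model comparisons in this discussion rest on two summary statistics, the mean squared error (MSE) and the squared correlation coefficient (r²) between measured and predicted energy expenditure, each computed separately for training and validation data. The following is a minimal Python sketch of those two calculations, not the analysis code used in the study. The function names and all numerical values are hypothetical and serve only to illustrate how a model can fit its training data more closely than unseen validation data.

```python
def mean_squared_error(measured, predicted):
    """Mean of squared differences between measured and predicted values."""
    return sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured)


def squared_correlation(measured, predicted):
    """Squared Pearson correlation coefficient (r^2) of two equal-length series."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_m * var_p)


# Hypothetical energy expenditure values in kcal/h (illustrative only, not study data).
train_measured = [100.0, 200.0, 300.0, 400.0]
train_predicted = [110.0, 190.0, 310.0, 390.0]
valid_measured = [120.0, 220.0, 280.0, 420.0]
valid_predicted = [170.0, 200.0, 330.0, 360.0]

train_mse = mean_squared_error(train_measured, train_predicted)
valid_mse = mean_squared_error(valid_measured, valid_predicted)
train_r2 = squared_correlation(train_measured, train_predicted)
valid_r2 = squared_correlation(valid_measured, valid_predicted)

print(f"training:   MSE = {train_mse:.1f}, r2 = {train_r2:.3f}")
print(f"validation: MSE = {valid_mse:.1f}, r2 = {valid_r2:.3f}")
```

With these made-up numbers the validation MSE is larger and the validation r² smaller than the training values, which is the same direction of change reported for the ANN models in this study.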

© 2008 NRC Canada


Study limitations

Two limitations of this study, both related to sampling rates, should be considered when interpreting the results. To prepare data for input to the model used in this study, all variables had to be time normalized. A common sampling rate of 1 Hz was chosen as a compromise between the various native sampling frequencies of the biosignal data. Because the rate at which the target variable (energy expenditure) could be calculated depended on the breathing frequency of the subjects, splines were used to interpolate datapoints when breathing frequency was lower than 1 Hz. However, because ventilation rate is correlated with exercise intensity, the number of interpolated energy expenditure values was greater at lower exercise intensities, which may have increased model error. Cooper and Storer (2001) caution that the breath-by-breath method increases physiological variability in the data, but they also point out that, with careful smoothing, patterns and thresholds may be more detectable. For this reason, a 3-point moving average was applied to the time-normalized respiration data.

The second limitation concerns the native sampling frequency of the Actiheart accelerometer used to collect activity counts. The activity counts were recorded in 15 s intervals, which were then interpolated to 1 Hz. During steady-state activities, such as resting, walking, and jogging, activity counts were likely not subject to transient fluctuations; therefore, interpolation errors may not have been large. However, during motions that were completed within 15 s (e.g., sitting down and standing up), interpolation would have attenuated the activity counts, thereby decreasing the accuracy of the input data.
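The two resampling steps described in this section can be illustrated with a short Python sketch. It is not the processing code used in the study: linear interpolation is swapped in for the splines to keep the example free of dependencies, the even spreading of 15 s epoch totals is an assumed scheme (the method used to bring the activity counts to 1 Hz is not specified here), and all function names and sample values are hypothetical.

```python
from bisect import bisect_right


def interpolate_to_1hz(times, values, duration_s):
    """Linearly interpolate irregularly timed samples onto t = 0, 1, ..., duration_s.

    Assumption: linear interpolation stands in for the splines used in the study.
    """
    resampled = []
    for t in range(duration_s + 1):
        if t <= times[0]:
            resampled.append(values[0])
        elif t >= times[-1]:
            resampled.append(values[-1])
        else:
            i = bisect_right(times, t)  # times[i - 1] <= t < times[i]
            weight = (t - times[i - 1]) / (times[i] - times[i - 1])
            resampled.append(values[i - 1] + weight * (values[i] - values[i - 1]))
    return resampled


def moving_average_3(values):
    """Centred 3-point moving average; the two end points average 2 samples."""
    n = len(values)
    smoothed = []
    for i in range(n):
        window = values[max(0, i - 1):min(n, i + 2)]
        smoothed.append(sum(window) / len(window))
    return smoothed


def to_epoch_totals(per_second, epoch_s=15):
    """Sum per-second counts into consecutive epochs of epoch_s seconds."""
    return [sum(per_second[i:i + epoch_s]) for i in range(0, len(per_second), epoch_s)]


def spread_epochs(epoch_totals, epoch_s=15):
    """Assumed scheme: spread each epoch total evenly over its seconds."""
    return [total / epoch_s for total in epoch_totals for _ in range(epoch_s)]


# Hypothetical breath-by-breath energy expenditure (kcal/h), one breath every 2 s.
breath_times = [0.0, 2.0, 4.0, 6.0]
breath_ee = [100.0, 120.0, 140.0, 120.0]
ee_1hz = interpolate_to_1hz(breath_times, breath_ee, duration_s=6)
ee_smoothed = moving_average_3(ee_1hz)

# Hypothetical 30 s record with a 3 s burst of movement (60 counts per second).
true_counts = [0] * 5 + [60] * 3 + [0] * 22
counts_1hz = spread_epochs(to_epoch_totals(true_counts))

print(ee_1hz)  # [100.0, 110.0, 120.0, 130.0, 140.0, 130.0, 120.0]
print(max(true_counts), max(counts_1hz))  # 60 12.0
```

In this toy record the 3 s burst peaks at 60 counts per second, but after summation into a 15 s epoch and spreading back to 1 Hz the peak is only 12 counts per second, which illustrates how brief movements are attenuated in the input data.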

Conclusion

To improve the predictive ability of any ANN model, suitable training data must be available. In the current study, a range of low-intensity and vigorous physical activities was performed, which the various ANNs were able to model with a high degree of accuracy, despite a relatively small sample that included both sexes and a wide range of ages. However, several movement types (e.g., cycling, stair climbing) were not included in the experimental design, and the strength training components were relatively brief in duration and limited by load. To improve the applicability of the methodology proposed in this paper, a wider range of activities and a larger number of human participants will be required. Furthermore, a greater amount of data collected at higher exercise intensities (>500 kcal·h⁻¹) would provide more convincing evidence of the model’s total predictive strength. Nonetheless, these preliminary results demonstrate that overall energy expenditure and transient changes in exercise intensity can be estimated to a high degree of accuracy with ANNs. Furthermore, while the addition of EMG inputs can improve model accuracy, especially at high exercise intensities, this study has shown that the use of accelerometry and HRar as the only dynamic input parameters is an economical and sufficient solution.

References

Brage, S., Wedderkopp, N., Franks, P.W., and Froberg, K. 2003. Reexamination of validity and reliability of the CSA monitor in walking and running. Med. Sci. Sports Exerc. 35: 1447–1454. doi:10.1249/01.MSS.0000079078.62035.EC. PMID:12900703.
Brage, S., Brage, N., Franks, P.W., Ekelund, U., Wong, M., Andersen, L.B., et al. 2004. Branched equation modeling of simultaneous accelerometry and heart rate monitoring improves estimate of directly measured physical activity energy expenditure. J. Appl. Physiol. 96: 343–351. doi:10.1152/japplphysiol.00703.2003. PMID:12972441.
Brage, S., Brage, N., Franks, P.W., Ekelund, U., and Wareham, N.J. 2005. Reliability and validity of the combined heart rate and movement sensor Actiheart. Eur. J. Clin. Nutr. 59: 561–570. doi:10.1038/sj.ejcn.1602118. PMID:15714212.
Cooper, C.B., and Storer, T.W. 2001. Exercise testing and interpretation: a practical approach. Cambridge University Press, Cambridge.
Crouter, S.E., Churilla, J.R., and Bassett, D.R. 2006a. Estimating energy expenditure using accelerometers. Eur. J. Appl. Physiol. 98: 601–612. doi:10.1007/s00421-006-0307-5. PMID:17058102.
Crouter, S.E., Clowers, K.G., and Bassett, D.R. 2006b. A novel method for using accelerometer data to predict energy expenditure. J. Appl. Physiol. 100: 1324–1331. doi:10.1152/japplphysiol.00818.2005. PMID:16322367.
Dauncey, M.J., and James, W.P. 1979. Assessment of the heart-rate method for determining energy expenditure in man, using a whole-body calorimeter. Br. J. Nutr. 42: 1–13. doi:10.1079/BJN19790084. PMID:486384.
deVries, H.A., Burke, R.K., Hopper, R.T., and Sloan, J.H. 1976. Relationship of resting EMG level to total body metabolism with reference to the origin of ‘tissue noise’. Am. J. Phys. Med. 55: 139–147. PMID:937519.
Goran, M.I. 2005. Estimating energy requirements: regression based prediction equations or multiples of resting metabolic rate. Public Health Nutr. 8(7A): 1184–1186. PMID:16277827.
Hagan, M.T., and Menhaj, M. 1994. Training feedforward networks with the Marquardt algorithm. IEEE Trans. Neural Netw. 5: 989–993. doi:10.1109/72.329697. PMID:18267874.
Hagan, M.T., Demuth, H.B., and Beale, M.H. 1996. Neural network design. PWS Publishing, Boston, Mass.
Henry, C.J.K. 2005. Basal metabolic rate studies in humans: measurement and development of new equations. Public Health Nutr. 8(7A): 1133–1152. doi:10.1079/PHN2005801. PMID:16277825.
Kalman, E. 2004. Monitoring energy metabolism with indirect calorimetry: instruments, interpretation, and clinical application. Nutr. Clin. Pract. 19: 447–454. doi:10.1177/0115426504019005447. PMID:16215138.
Karabulut, M., Crouter, S.E., and Bassett, D.R. 2005. Comparison of two waist-mounted and two ankle-mounted electronic pedometers. Eur. J. Appl. Physiol. 95: 335–343. doi:10.1007/s00421-005-0018-3. PMID:16132120.
Keytel, L.R., Goedecke, J.H., Noakes, T.D., Hiiloskorpi, H., Laukkanen, R., and Lambert, E.V. 2005. Prediction of energy expenditure from heart rate monitoring during submaximal exercise. J. Sports Sci. 23: 289–297. doi:10.1080/02640410470001730089. PMID:15966347.
Levine, J.A. 2005. Measurement of energy expenditure. Public Health Nutr. 8(7A): 1123–1132. doi:10.1079/PHN2005800. PMID:16277824.
Londeree, B.R., and Ames, S.A. 1976. Trend analysis of the % VO2 max-HR regression. Med. Sci. Sports Exerc. 8: 123–125. PMID:957932.
Melanson, E.L., and Freedson, P.S. 1995. Validity of the Computer Science and Applications, Inc. (CSA) activity monitor. Med. Sci. Sports Exerc. 27: 934–940. PMID:7658958.
Miller, W.C., Wallace, J.P., and Eggert, K.E. 1993. Predicting max HR and the HR-VO2 relationship for exercise prescription in obesity. Med. Sci. Sports Exerc. 25: 1077–1081. PMID:8231778.
Pambianco, G., Wing, R.R., and Robertson, R. 1990. Accuracy and reliability of the Caltrac accelerometer for estimating energy expenditure. Med. Sci. Sports Exerc. 22: 858–862. PMID:2287266.
Plasqui, G., and Westerterp, K.R. 2005. Accelerometry and heart rate as a measure of physical fitness: proof of concept. Med. Sci. Sports Exerc. 37: 872–876. doi:10.1249/01.MSS.0000161805.61893.C0. PMID:15870644.
Rennie, K.L., Hennings, S.J., Mitchell, J., and Wareham, N.J. 2001. Estimating energy expenditure by heart-rate monitoring without individual calibration. Med. Sci. Sports Exerc. 33: 939–945. doi:10.1097/00005768-200106000-00013. PMID:11404659.
Riley, D.J., Wingard, D., Morton, D., Nichols, J.F., Ji, M., Shaffer, R., and Macera, C.A. 2005. Use of self-assessed fitness and exercise parameters to predict objective fitness. Med. Sci. Sports Exerc. 37: 827–831. doi:10.1249/01.MSS.0000162618.69807.0E. PMID:15870637.
Rothney, M.P., Neumann, M., Beziat, A., and Chen, K.Y. 2007. An artificial neural network model of energy expenditure using nonintegrated acceleration signals. J. Appl. Physiol. 103: 1419–1427. doi:10.1152/japplphysiol.00429.2007. PMID:17641221.
Schoeller, D.A. 2002. Validation of habitual energy intake. Public Health Nutr. 5: 883–888. doi:10.1079/PHN2002378. PMID:12633511.
Seliger, V., Dolejs, L., and Karas, V. 1980. A dynamometric comparison of maximum eccentric, concentric, and isometric contractions using EMG and energy expenditure measurements. Eur. J. Appl. Physiol. Occup. Physiol. 45: 235–244. doi:10.1007/BF00421331. PMID:7193132.
Speakman, J.R. 1997. Doubly labelled water: theory and practice. Chapman & Hall, London.
Strath, S.J., Bassett, D.R., Swartz, A.M., and Thompson, D.L. 2001. Simultaneous heart rate motion sensor technique to estimate energy expenditure. Med. Sci. Sports Exerc. 33: 2118–2123. doi:10.1097/00005768-200105001-01423. PMID:11740308.
Thompson, D., Batterham, A.M., Bock, S., Robson, C., and Stokes, K. 2006. Assessment of low-to-moderate intensity physical activity thermogenesis in young adults using synchronized heart rate and accelerometry with branched-equation modeling. J. Nutr. 136: 1037–1042. PMID:16549471.
Trivel, D., Leger, L., and Calmels, P. 2006. Fitness assessment by questionnaire. Sci. Sports 21: 121–130. doi:10.1016/j.scispo.2005.12.004.
Wasserman, K., Hansen, J.E., Sue, D.Y., Stringer, W.W., and Whipp, B.J. 2005. Principles of exercise testing and interpretation: including pathophysiology and clinical applications (4th ed.). Lippincott Williams & Wilkins, Philadelphia.
Weisman, I.M., and Zeballos, R.J. (Editors). 2002. Clinical exercise testing. Karger, New York.
Whaley, M.H. (Editor). 2006. ACSM’s guidelines for exercise testing and prescription (7th ed.). Lippincott Williams & Wilkins, Philadelphia.
Whaley, M.H., Kaminsky, L.A., Dwyer, G.B., Getchell, L.H., and Norton, J.A. 1992. Predictors of over- and underachievement of age-predicted maximal heart rate. Med. Sci. Sports Exerc. 24: 1173–1179. PMID:1435167.

Appl. Physiol. Nutr. Metab. Vol. 33, 2008
