
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 15, NO. 5, OCTOBER 2014

Yaw Estimation Using Cylindrical and Ellipsoidal Face Models Athi Narayanan, Ramachandra Mathava Kaimal, and Kamal Bijlani

Abstract—Accurate head yaw estimation is necessary for detecting driver inattention in forward collision warning systems. In this paper, we propose three geometric models under the ellipsoidal framework for accurate head yaw estimation. We present a theoretical analysis of the cylindrical and ellipsoidal face models used for yaw angle estimation of head rotation. The relationship between the cylindrical, ellipsoidal, and proposed models is derived. We provide error functions for all models. Furthermore, for each model, over/underestimation of the angle, zero crossings of the error, bounds on the yaw angle estimate, and bounds on the error are presented. Experimental results of the proposed models on four standard head pose data sets yielded a mean absolute error between 4° and 8°, demonstrating the efficacy of the proposed models over the state-of-the-art methods. Index Terms—Driver monitoring system, gaze estimation, head-orientation estimation.

I. INTRODUCTION

DRIVER inattention is a leading cause of accidents on highways. Automatic detection of inattention can prevent accidents by providing an alert to the driver. During driving, the driver may lose attention either due to distraction or due to fatigue [1]. Using the knowledge of the orientation of the driver's head, both the distraction and the fatigue forms of driver inattention can be detected before an accident occurs. Hence, head-orientation estimation plays an important role in the driver inattention monitoring systems that are incorporated in intelligent vehicles. In a computer vision context, head pose estimation is the process of inferring the orientation of a human head from digital imagery [2]. Head pose is described using three angles, i.e., pitch angle for vertical orientation of the head (looking up or down), yaw angle for horizontal orientation of the head (looking left or right), and roll angle for lateral orientation of the head (tilting left or right).

Manuscript received August 14, 2013; revised December 27, 2013 and March 11, 2014; accepted March 20, 2014. Date of publication April 29, 2014; date of current version September 26, 2014. The Associate Editor for this paper was R. I. Hammoud. This work was supported in part by the National Mission on Education through Information and Communication Technology, by the Ministry of Human Resource Development, and by the Amrita Vishwa Vidyapeetham. A. Narayanan is with the Amrita E-Learning Research Lab, Department of Computer Science, Amrita Vishwa Vidyapeetham, Kollam 690 525, India (e-mail: [email protected]). R. M. Kaimal is with the Department of Computer Science, Amrita Vishwa Vidyapeetham, Kollam 690 525, India. K. Bijlani is with the Amrita E-Learning Research Lab, Amrita Vishwa Vidyapeetham, Kollam 690 525, India. Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TITS.2014.2313371

Head pose estimation has many real-life applications, such as human–computer interfaces for physically challenged people, gaming, virtual reality, detection of driver inattention, dyslexia detection in children, human–robot interaction, gaze estimation, etc. Head pose estimation is also considered a preprocessing step for pose-invariant face recognition, emotion recognition, etc. Among the three head pose angles (yaw, pitch, and roll), yaw angle estimation has more important applications than pitch and roll angle estimation [3]. Due to these applications, research has focused more on the estimation of the yaw angle [4]–[8], [13]. The roll angle can be easily estimated from the relative position of feature points, but the estimation of the yaw and pitch angles is difficult. Considerable effort has been put in by the computer vision community to solve the head pose estimation problem. A survey of the computer vision algorithms used for head pose estimation is given in [2].

The yaw angle estimation methods can be broadly classified into appearance- and model-based methods. Appearance-based methods extract texture features from the 2-D face image and relate the features with the 3-D head pose. Appearance-based methods are considered nonparametric methods, since they do not assume the head to follow any specific model. They can be further classified into appearance template methods, detector array methods, nonlinear regression methods, and manifold embedding methods. Nonlinear regression methods use support vector regressors or neural networks to learn a nonlinear functional mapping from the image space to the head pose space. Dimensionality reduction techniques such as principal component analysis (PCA), linear discriminant analysis (LDA), and their kernelized versions are used in manifold embedding-based head pose estimation methods. Watta et al. [50] developed nonparametric approaches for driver head pose estimation using eigenfaces and fisherfaces. Lu and Tan [9] utilized the discriminative power of ordinary preserving information to improve the head pose estimate through manifold embedding. Yan et al. [10] used a synchronized submanifold embedding technique to render the missing pose manifolds between subjects during the training phase. Fu et al. [11] approximated the global head pose manifold by a set of localized linear manifolds. Ma et al. [5] and Hu et al. [7] utilized the local Gabor binary pattern for head pose estimation. Ma et al. [3] estimated the head yaw angle using the Fourier transform, which extracts the asymmetry present in the facial appearance due to pose changes. By properly eliminating the background introduced by the face detection module, the yaw estimation performance of appearance-based methods can be improved [6]. Ranganathan et al. [12] introduced an online sparse Gaussian process regression for head pose estimation. A block-based sparse representation classifier



Fig. 1. Face images at varying yaw angles from CAS-PEAL data set. (Left to right: −45◦ , −30◦ , −15◦ , 0◦ , 15◦ , 30◦ , 45◦ ). The violet, green, and yellow lines are the left boundary, center, and right boundary of face, respectively.

is used by Ma and Wang [13] for head yaw estimation. Chen et al. [14] exploited the spectral regression discriminant analysis for head pose estimation. Recently, tensor representations [15], [55] have been used for head yaw estimation. Model-based methods model the 3-D structure of human head, and the yaw angle is estimated by mapping the face feature points with the model. The model-based methods are considered as parametric methods, since these methods assume the head to follow a particular model, which, in turn, has parameters. Model-based methods can be further classified as flexible models and geometric methods. Flexible models such as elastic bunch graph, active appearance model, and active shape model are employed to estimate the head pose. Estimating the head pose using a flexible model involves two stages, i.e., initialization and tracking. In the initialization stage, the flexible model will be fit to the face region, and the initial absolute head pose is obtained. In the tracking stage, the face region will be tracked, and the relative change in head pose is obtained. Krinidis et al. [16] used a 3-D deformable surface along with radial basis functions to track head orientation. Dornaika and Raducanu [17] initialized a wireframe model using the eigenface, and tracking is performed using an online appearance model. Mbouna et al. [18] mapped a 3-D head model with the face region using POSIT and tracked using optical flow information. Reale et al. [20] fitted a generic 3-D head model and tracked using feature-based scene flow information instead of optical flow information. Chutorian and Trivedi [25] fitted a texture mapped 3-D model and tracked using particle filtering. Three-dimensional cylinder [28], elliptic cylinder [19], [21], and 3-D ellipsoidal models [24] are also utilized to fit the face region and further track the head orientation. Jimenez et al. [22], [23] constructed a 3-D model automatically using stereo correspondence, and the head pose is estimated by tracking feature points. Geometric methods make use of the head shape and face feature point locations to estimate head pose. Based on the usage/nonusage of eye position, the geometric methods can be divided further. Nikolaidis and Pitas [51] computed the head yaw from the distortion of the isosceles triangle formed by the two eyes and the mouth. Batista [53] and Ji and Hu [54] used the eye locations to fit an ellipse on the face region. Continuous yaw values are estimated from the fitted ellipse. The face images at varying yaw angles are shown in Fig. 1. In the front view image (0◦ yaw), the left and right boundaries of the face are equidistant from the face center. For head turns with negative (positive) yaw angles, the face center is close to the left (right) boundary and far from the right (left) boundary. With the face varying from front to profile view, the closeness or farness of the face center from the face boundary increases. Therefore, the relative distance between the face center and the face boundaries can be exploited to compute the head yaw.
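As a toy illustration of this cue (our own sketch, not code from this paper; the pixel coordinates below are hypothetical), the signed offset of the face center from the midpoint of the two boundaries, normalized by the half face width, already behaves as a monotone proxy for yaw. The geometric models reviewed in Section II turn exactly this quantity into an angle.

```python
# Toy illustration of the relative-distance cue (hypothetical pixel values).
def yaw_cue(x_left, x_right, x_center):
    """Signed, normalized offset of the face center from the boundary midpoint.

    Roughly -1 near a left-profile view, 0 for a frontal view,
    and +1 near a right-profile view.
    """
    midpoint = 0.5 * (x_left + x_right)
    half_width = 0.5 * (x_right - x_left)
    return (x_center - midpoint) / half_width

print(yaw_cue(100.0, 300.0, 200.0))  # frontal face -> 0.0
print(yaw_cue(100.0, 300.0, 260.0))  # head turned right -> positive cue
```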


Cylindrical and ellipsoidal face models are generally used for the estimation of the yaw angle. Ohue et al. [26] and Lee et al. [27] did not use the eye locations; instead, they used the face center and the face boundaries for yaw estimation. Recently, Fu et al. [52] proposed a driver gaze zone estimation method that employs the face center and the face boundaries for the preparation of training patches. An ellipsoidal face model was introduced by Lee et al. [27] to estimate the yaw angle. The ellipsoidal face model [27] outperforms the cylindrical face model [26], as the human head is ellipsoidal in shape and not cylindrical.

Model-based methods run very fast and are suitable for real-time applications. These methods are tolerant to the appearance variation between humans and do not require training with a large number of images. However, these methods have some disadvantages. First, they are sensitive to the misalignment of face feature points. Second, a single head model fails to represent all heads exactly. Third, they require high image resolution and image quality. Fourth, they are very sensitive to partial occlusions, as some of the feature points will be lost and the pose estimate may not be computable under occluded scenarios.

Appearance-based methods support near-field and far-field imagery. As these methods extract features from the entire face region, they are less sensitive to partial occlusions. However, they have some common disadvantages. First, they are sensitive to the head localization error induced by the face detection module; the presence of a small background region in the face detection output impacts their performance. Second, they require training with a large number of images, and in real-world applications, the training data may be a disjoint set of poses for each person. Third, the texture features used to train these methods contain not only pose information but also information on identity, lighting, and facial expression. Fourth, in these methods, the extracted texture features are represented as a 1-D vector; thus, the face structure information about the pose is lost. Fifth, in nonlinear regression methods, it is not clear how good the mapping function is. The appearance-based methods (nonparametric methods) become less accurate than the model-based methods (parametric methods) when the assumptions of the model-based methods are met.

The current approaches can automatically and reliably detect the locations of facial features [2]. Thus, in this paper, we further explore geometric methods, which depend on such feature detection and are yet to reach their full potential [2]. The simplicity of geometric methods makes them reliable candidates for the initialization and tracking-failure stages of tracking-based pose estimation techniques. Geometric methods that use only the location of the center of the face and the face boundaries (without using the eye locations) for head yaw estimation are very attractive, as these methods: 1) are robust against wearing glasses; 2) are invariant to facial expressions; and 3) can support large head rotations. Hence, we propose geometric models that use only the location of the center of the face and the face boundaries for head yaw estimation. In life-critical applications such as driver inattention detection, accurate yaw angle estimation is required to provide forward collision warning.


Fig. 2. Cylindrical face model.

In this paper, we propose three ellipsoidal framework-based geometric models for accurate yaw angle estimation. We derive the relations between the yaw angles estimated by different models. The error function for each of the models is also derived. Furthermore, we present over/under estimation of yaw angle, zero crossings of error, bounds on yaw angle estimate, and bounds on error. Thereby, we identify two rules that are to be satisfied by a geometric model for accurate yaw estimation.

This paper is organized as follows: In Section II, we review the cylindrical and ellipsoidal face models. In Section III, we derive the proposed face models. In Section IV, the relationship between the models is presented. The error functions and insights on cylindrical and ellipsoidal models are given in Section V. The details of the experimental results are presented in Section VI. Conclusions are given in Section VII.

II. REVIEW OF CYLINDRICAL AND ELLIPSOIDAL FACE MODELS

A. Cylindrical Face Model

In a cylindrical model [26], head rotation is considered as the rotation of a cylinder with respect to (wrt) its center point. For yaw computation, the circle of the cylindrical model is given by

\frac{x^2}{r^2} + \frac{y^2}{r^2} = 1.    (1)

Here, r is the radius of the circle. The cylindrical model for yaw computation is shown in Fig. 2. Using the cylindrical model, the yaw angle θc can be estimated by

\theta_c = \sin^{-1}\left(\frac{x_c - x_{cRc}}{r}\right)    (2)

x_{cRc} = \frac{x_r + x_l}{2}    (3)

r = \frac{x_r - x_l}{2}.    (4)

Here, x_{cRc} is the x-coordinate of the center of rotation c_{Rc}, and r is the radius of rotation. The left border, the right border, and the center of the face, i.e., x_l, x_r, and x_c, respectively, are obtained from the face image. The face boundary x_l (x_r) can be identified as the intersection point between the x-axis and the tangent (that is parallel to the y-axis) to the circle.

The cylindrical model is easily implemented, but it has the following drawbacks: a) an ellipsoid models the head better than a cylinder; and b) it assumes that the face boundaries x_l and x_r are independent of head rotation, as the center of rotation of the head is assumed to be the axis of the cylinder. The second drawback is a consequence of the first and also of the fact that the center of rotation of the head is not the center of the head but the center of the neck. Hence, as the yaw angle varies, the radius of head rotation varies in the cylindrical model. The radius of head rotation has to be fixed irrespective of the yaw angle.

Fig. 3. Ellipsoidal face model. (a) At yaw angle 0°. (b) At yaw angle θe.

B. Ellipsoidal Face Model

In an ellipsoidal model [27], head rotation is considered as the rotation of an ellipsoid wrt a point that is shifted from the center of the ellipsoid (along the depth). Thus, the model considers the center of rotation as the center of the neck. For yaw computation, the ellipse of the ellipsoidal model is given by

\frac{x^2}{r^2} + \frac{(y + \alpha r)^2}{(\beta r)^2} = 1.    (5)

Here, r and βr are the semiminor and semimajor axes, respectively, of the ellipse; α and β are anthropometric constants with values 0.25 and 1.25, respectively. The ellipsoidal model for yaw computation is shown in Fig. 3. In the ellipsoidal model, the center of rotation is not the center of the ellipsoid but the center of the neck. Due to this, a translation of αr is introduced along the y-direction of the ellipse, and the corresponding radius of rotation is R. Using the ellipsoidal model, the yaw angle θe can be estimated by

\theta_e = \sin^{-1}\left(\frac{x_c - x_{cRe}}{R}\right)    (6)

R = (\alpha + \beta) r    (7)

\alpha = \frac{\|c_{Re} - c_E\|}{r}    (8)

\beta = \frac{\text{major axis}}{\text{minor axis}}    (9)

x_{cRe} = \frac{x_r + x_l}{2} - \alpha r \sin\theta_c.    (10)

Here, x_{cRe} is the x-coordinate of the center of rotation c_{Re}, c_E is the center of ellipse (CE), and θc is the yaw angle estimated by the cylindrical model. In (10), θc is used instead of θe; as θe is unknown, θc is used as an approximation to θe. The face boundaries x_l and x_r can be found as the two intersection points between the x-axis and the two lines that are parallel to the y-axis and are tangent to the ellipse.

The ellipsoidal model mimics the human head, but it has the following drawbacks. First, the ellipsoidal model makes use of the cylindrical model's estimated yaw angle for the calculation of the center of rotation. Hence, the modeling error propagates from the cylindrical model to the ellipsoidal model. Second, the ellipsoidal model employs r (4) of the cylindrical model for computing the radius of rotation R (7). Because r (4) varies with the yaw angle, R, which should be a fixed value, also varies with the yaw angle.
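For illustration, the two review estimators can be computed directly from the three horizontal measurements. The following minimal Python sketch is ours (not code from this paper); the sample pixel coordinates are hypothetical, and the clipping of the arcsine argument is our own safeguard against noisy measurements.

```python
import math

ALPHA, BETA = 0.25, 1.25  # anthropometric constants from [27]

def yaw_cylindrical(x_l, x_r, x_c):
    """Cylindrical model, (2)-(4): theta_c = asin((x_c - x_cRc) / r)."""
    x_cRc = 0.5 * (x_r + x_l)          # (3) center of rotation
    r = 0.5 * (x_r - x_l)              # (4) radius of rotation
    s = (x_c - x_cRc) / r
    return math.degrees(math.asin(max(-1.0, min(1.0, s))))

def yaw_ellipsoidal(x_l, x_r, x_c):
    """Ellipsoidal model, (6)-(10); uses the cylindrical estimate in (10)."""
    r = 0.5 * (x_r - x_l)
    theta_c = math.radians(yaw_cylindrical(x_l, x_r, x_c))
    x_cRe = 0.5 * (x_r + x_l) - ALPHA * r * math.sin(theta_c)   # (10)
    R = (ALPHA + BETA) * r                                      # (7)
    s = (x_c - x_cRe) / R
    return math.degrees(math.asin(max(-1.0, min(1.0, s))))

# Hypothetical measurements (pixels): left boundary, right boundary, center.
print(yaw_cylindrical(170.0, 470.0, 395.0))
print(yaw_ellipsoidal(170.0, 470.0, 395.0))
```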


III. PROPOSED FACE MODEL

We use the same ellipsoidal framework [27], as shown in Fig. 3. We propose three geometric models for accurate yaw estimation. Compared with the cylindrical [26] and ellipsoidal [27] models, the proposed models have reduced variation in the radius of head rotation over head turns; thus, they are more accurate.

The parametric form of the ellipse in Fig. 3(a), with the center of rotation c_{Re} as origin, is given as

x = x_{cRe} + r\cos t, \quad y = y_{cRe} + \beta r \sin t - \alpha r, \quad \text{for } t \in [0, 2\pi].    (11)

The ellipse is rotated in the counterclockwise direction by a yaw angle θ (in order to avoid confusion with the yaw estimate θe (6) in Section II, the subscript e of the yaw angle is omitted in all contexts of the proposed models)

\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x_{cRe} \\ y_{cRe} \end{bmatrix} + \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} r\cos t \\ \beta r \sin t - \alpha r \end{bmatrix}    (12)

x = x_{cRe} + r\cos t\cos\theta - \beta r\sin t\sin\theta + \alpha r\sin\theta    (13)

y = y_{cRe} + r\cos t\sin\theta + \beta r\sin t\cos\theta - \alpha r\cos\theta.    (14)

The slope at each point on the rotated ellipse is given as

\frac{dy}{dx} = \frac{-r\sin\theta\sin t + \beta r\cos\theta\cos t}{-r\cos\theta\sin t - \beta r\sin\theta\cos t}.    (15)

The left and right boundaries of the face are the tangents of the ellipse that are parallel to the y-axis. For tangents parallel to the y-axis

\frac{dy}{dx} = \infty.    (16)

The solutions are

t = -\tan^{-1}(\beta\tan\theta)    (17)

t = -\pi - \tan^{-1}(\beta\tan\theta).    (18)

Substituting t (17), (18) in x (13), we get the face boundaries x_l and x_r. Thus

x_l = x_{cRe} - r\cos\theta\cos\left(\tan^{-1}(\beta\tan\theta)\right) - \beta r\sin\theta\sin\left(\tan^{-1}(\beta\tan\theta)\right) + \alpha r\sin\theta    (19)

x_r = x_{cRe} + r\cos\theta\cos\left(\tan^{-1}(\beta\tan\theta)\right) + \beta r\sin\theta\sin\left(\tan^{-1}(\beta\tan\theta)\right) + \alpha r\sin\theta.    (20)

Equations (19) and (20) can be compactly represented (derivation given in the Appendix) as

x_l = x_{cRe} - r\sqrt{1 + (\beta^2 - 1)\sin^2\theta} + \alpha r\sin\theta    (21)

x_r = x_{cRe} + r\sqrt{1 + (\beta^2 - 1)\sin^2\theta} + \alpha r\sin\theta.    (22)

The CE is

x_{cE} = \frac{x_l + x_r}{2} = x_{cRe} + \alpha r\sin\theta.    (23)

From Fig. 3(b), we have

x_c = x_{cRe} + (\alpha + \beta) r\sin\theta    (24)

x_c - \frac{x_l + x_r}{2} = \beta r\sin\theta    (25)

\sin\theta = \frac{x_c - \frac{x_l + x_r}{2}}{\beta r}.    (26)

As the yaw angle is estimated using the CE, this expression is named the CE model

\theta_{CE} = \sin^{-1}\left(\frac{x_c - \frac{x_l + x_r}{2}}{\beta r}\right).    (27)

Using the expressions of x_l (21) and x_r (22)

x_r - x_l = 2r\sqrt{1 + (\beta^2 - 1)\sin^2\theta}.    (28)

On solving (28), we have

\sin\theta = \pm\sqrt{\frac{1}{(\beta^2 - 1)}\left[\left(\frac{x_r - x_l}{2r}\right)^2 - 1\right]}.    (29)

As the yaw angle is estimated using the difference between x_r and x_l, this expression is named the head range (HR) model. Thus

\theta_{HR} = \pm\sin^{-1}\left\{\sqrt{\frac{1}{(\beta^2 - 1)}\left[\left(\frac{x_r - x_l}{2r}\right)^2 - 1\right]}\right\}.    (30)

The ambiguity in the sign of θHR can be solved as

\mathrm{sgn}(\theta_{HR}) = \mathrm{sgn}\left(x_c - \frac{x_l + x_r}{2}\right).    (31)
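For illustration, a minimal sketch (ours, not the authors' code) of the CE estimate (27) and the sign-resolved HR estimate (30), (31). The half frontal face width r is assumed to be known here (it is fixed in (33) below), and the sample boundaries are generated with the ellipsoidal framework itself.

```python
import math

ALPHA, BETA = 0.25, 1.25

def yaw_ce(x_l, x_r, x_c, r):
    """CE model, (27): yaw from the offset of the face center from the boundary midpoint."""
    s = (x_c - 0.5 * (x_l + x_r)) / (BETA * r)
    return math.degrees(math.asin(max(-1.0, min(1.0, s))))

def yaw_hr(x_l, x_r, x_c, r):
    """HR model, (30): yaw from the projected face width; sign resolved with (31)."""
    ratio = (x_r - x_l) / (2.0 * r)
    s = math.sqrt(max(0.0, min(1.0, (ratio**2 - 1.0) / (BETA**2 - 1.0))))
    sign = 1.0 if x_c >= 0.5 * (x_l + x_r) else -1.0
    return sign * math.degrees(math.asin(s))

# Boundaries synthesized with (21), (22), (24) for a true yaw of 25 degrees,
# center of rotation at x = 320 px, half frontal face width r = 150 px.
r_true, s_t = 150.0, math.sin(math.radians(25.0))
w = r_true * math.sqrt(1.0 + (BETA**2 - 1.0) * s_t**2)
x_l = 320.0 - w + ALPHA * r_true * s_t
x_r = 320.0 + w + ALPHA * r_true * s_t
x_c = 320.0 + (ALPHA + BETA) * r_true * s_t
print(yaw_ce(x_l, x_r, x_c, r_true), yaw_hr(x_l, x_r, x_c, r_true))  # both ~25.0
```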


θCE and θHR are different expressions of the same yaw estimate; the CE model and the HR model are equivalent. On equating θCE (27) with θHR (30) and solving for r, i.e.,

r = \frac{1}{2}\sqrt{(x_r - x_l)^2 - \frac{(\beta^2 - 1)}{\beta^2}\{2x_c - (x_l + x_r)\}^2}.    (32)

Unlike the cylindrical and ellipsoidal models, which use only x_l and x_r for estimating r, the proposed model also makes use of x_c. The first term in the above expression increases with the yaw angle [refer to (28)], whereas the second term compensates for the increase in the first term. Thus, the proposed model's r is fixed irrespective of the variation in yaw angle. From Fig. 3(a), the proposed model's r can be visualized as half the width of the frontal face, at which the yaw angle is zero. For a fixed scaling, r is constant. Thus

r = \frac{FW}{2}.    (33)

Here, FW is the frontal face width. The results of the proposed models are summarized in Table I.

TABLE I: Results of the Proposed Models

A. Modifications to the CE Model

Under the ellipsoidal framework, the face boundaries and the face center meet at the yaw angles ±90°. However, in practice, the face boundaries and the face center meet at the yaw angles ±60°. For head turns above +60° (−60°), the right (left) face boundary is not visible to the camera. The CE model can be modified to meet this requirement by introducing a correction term to either the face center or the face boundaries.

The face center for the Center-corrected CE model (Center CE) is given as

x_c = x_{cRe} + (\alpha + \beta + \gamma_c) r\sin\theta.    (34)

A correction term γc is added to the face center of the CE model. The value of γc is obtained as 0.1269 by equating x_c (34) to x_r (22) when θ is 60°. Using the expressions of x_l (21), x_r (22), and x_c (34), the yaw angle estimate and the half-face width r for this model can be derived as

\theta_{Center\_CE} = \sin^{-1}\left(\frac{x_c - \frac{x_l + x_r}{2}}{(\beta + \gamma_c) r}\right)    (35)

r = \frac{1}{2}\sqrt{(x_r - x_l)^2 - \frac{(\beta^2 - 1)}{(\beta + \gamma_c)^2}\{2x_c - (x_l + x_r)\}^2}.    (36)

The face boundaries for the Boundary-corrected CE model (Boundary CE) are given as

x_l = x_{cRe} - r\sqrt{1 + (\beta^2 - \gamma_b^2 - 1)\sin^2\theta} + \alpha r\sin\theta    (37)

x_r = x_{cRe} + r\sqrt{1 + (\beta^2 - \gamma_b^2 - 1)\sin^2\theta} + \alpha r\sin\theta.    (38)

A correction term γb² is introduced to the face boundaries of the CE model. The value of γb is obtained as 0.5774 by equating x_c (24) to x_r (38) when θ is 60°. Using the expressions of x_l (37), x_r (38), and x_c (24), the yaw angle estimate and the half-face width r for this model can be derived as

\theta_{Boundary\_CE} = \sin^{-1}\left(\frac{x_c - \frac{x_l + x_r}{2}}{\beta r}\right)    (39)

r = \frac{1}{2}\sqrt{(x_r - x_l)^2 - \frac{(\beta^2 - \gamma_b^2 - 1)}{\beta^2}\{2x_c - (x_l + x_r)\}^2}.    (40)
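A minimal sketch of the three proposed estimators follows (our illustration, not released code from this work). Each estimator recovers its own half-face width r from the same three measurements using (32), (36), or (40), and then the yaw using (27), (35), or (39), with γc = 0.1269 and γb = 0.5774 as above; the max/min guards are our own protection against noisy measurements.

```python
import math

BETA = 1.25
GAMMA_C, GAMMA_B = 0.1269, 0.5774

def _clip(v, lo=-1.0, hi=1.0):
    return max(lo, min(hi, v))

def yaw_ce(x_l, x_r, x_c):
    """CE model: r from (32), yaw from (27)."""
    d = 2.0 * x_c - (x_l + x_r)
    r = 0.5 * math.sqrt(max(0.0, (x_r - x_l)**2 - (BETA**2 - 1) / BETA**2 * d**2))        # (32)
    return math.degrees(math.asin(_clip(0.5 * d / (BETA * r))))                            # (27)

def yaw_center_ce(x_l, x_r, x_c):
    """Center CE model: r from (36), yaw from (35)."""
    d = 2.0 * x_c - (x_l + x_r)
    r = 0.5 * math.sqrt(max(0.0, (x_r - x_l)**2
                            - (BETA**2 - 1) / (BETA + GAMMA_C)**2 * d**2))                 # (36)
    return math.degrees(math.asin(_clip(0.5 * d / ((BETA + GAMMA_C) * r))))                # (35)

def yaw_boundary_ce(x_l, x_r, x_c):
    """Boundary CE model: r from (40), yaw from (39)."""
    d = 2.0 * x_c - (x_l + x_r)
    r = 0.5 * math.sqrt(max(0.0, (x_r - x_l)**2
                            - (BETA**2 - GAMMA_B**2 - 1) / BETA**2 * d**2))                # (40)
    return math.degrees(math.asin(_clip(0.5 * d / (BETA * r))))                            # (39)

# Hypothetical face measurements in pixels (left boundary, right boundary, center).
x_l, x_r, x_c = 185.0, 480.0, 400.0
for f in (yaw_ce, yaw_center_ce, yaw_boundary_ce):
    print(f.__name__, round(f(x_l, x_r, x_c), 2))
```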


IV. RELATIONSHIP BETWEEN FACE MODELS

The relationship between the yaw angles estimated by the different face models is analyzed here. The derivations for the relationship expressions have been given in the Appendix.

A. Cylindrical and Ellipsoidal Face Models

The relationship between the yaw angle estimated by the cylindrical model and the ellipsoidal model is given as

\sin\theta_e = \frac{(1 + \alpha)}{(\alpha + \beta)}\sin\theta_c.    (41)

For the anthropometric data [27], α = 0.25 and β = 1.25

\sin\theta_e = 0.833\,\sin\theta_c.    (42)

From the above relationship equation, the estimated yaw angle by the ellipsoidal model will always be less than the estimated yaw angle by the cylindrical model. In the ellipsoidal model's yaw estimation equation (41), if the α term alone is removed from the denominator (radius of rotation), the ellipsoidal model will be equivalent to the cylindrical model. Thus

\sin\theta_e = \frac{(1 + \alpha)}{\beta}\sin\theta_c.    (43)

For the anthropometric data, α = 0.25 and β = 1.25, i.e.,

\sin\theta_e = \sin\theta_c.    (44)

In the cylindrical model, due to improper modeling, the estimated yaw angle is always greater than the true yaw angle [27]. This error in the cylindrical model is reduced by introducing the α term in the denominator of the yaw estimate expression in the ellipsoidal model.

B. Cylindrical and CE Face Models

The relationship between the yaw angle estimated by the cylindrical model and the CE model is given as

\sin\theta_c = \frac{\beta}{\sqrt{1 + (\beta^2 - 1)\sin^2\theta_{CE}}}\sin\theta_{CE}.    (45)

C. Ellipsoidal and CE Face Models

The relationship between the yaw angle estimated by the ellipsoidal model and the CE model is given as

\sin\theta_e = \frac{\beta(1 + \alpha)}{(\alpha + \beta)\sqrt{1 + (\beta^2 - 1)\sin^2\theta_{CE}}}\sin\theta_{CE}.    (46)

D. Center CE and CE Face Models

The relationship between the yaw angle estimated by the Center CE model and the CE model is given as

\sin\theta_{Center\_CE} = \frac{\beta}{(\beta + \gamma_c)\sqrt{1 + (\beta^2 - 1)\sin^2\theta_{CE}\left[1 - \left(\frac{\beta}{\beta + \gamma_c}\right)^2\right]}}\sin\theta_{CE}.    (47)

E. Boundary CE and CE Face Models

The relationship between the yaw angle estimated by the Boundary CE model and the CE model is given as

\sin\theta_{Boundary\_CE} = \frac{1}{\sqrt{1 + \gamma_b^2\sin^2\theta_{CE}}}\sin\theta_{CE}.    (48)

V. ERROR FUNCTIONS

Using the above expressions of the relationship between the models, error functions can be derived wrt the yaw angle estimated by the CE model. (It is also possible to arrive at error functions relative to other models.) As the yaw angle estimated by the CE model is theoretically correct under the ellipsoidal framework, we can replace θCE by the true yaw angle θt in the above expressions. The error function of the cylindrical model is given as

\theta_c - \theta_t = \sin^{-1}\left(\frac{\beta}{\sqrt{1 + (\beta^2 - 1)\sin^2\theta_t}}\sin\theta_t\right) - \theta_t.    (49)

The error function of the ellipsoidal model is given as

\theta_e - \theta_t = \sin^{-1}\left(\frac{\beta(1 + \alpha)}{(\alpha + \beta)\sqrt{1 + (\beta^2 - 1)\sin^2\theta_t}}\sin\theta_t\right) - \theta_t.    (50)

The error function of the Center CE model is given as

\theta_{Center\_CE} - \theta_t = \sin^{-1}\left(\frac{\beta}{(\beta + \gamma_c)\sqrt{1 + (\beta^2 - 1)\sin^2\theta_t\left[1 - \left(\frac{\beta}{\beta + \gamma_c}\right)^2\right]}}\sin\theta_t\right) - \theta_t.    (51)

The error function of the Boundary CE model is given as

\theta_{Boundary\_CE} - \theta_t = \sin^{-1}\left(\frac{1}{\sqrt{1 + \gamma_b^2\sin^2\theta_t}}\sin\theta_t\right) - \theta_t.    (52)

TABLE II: Insights on Models


Fig. 4. Experiment 1 results. (a) Variation of radius of rotation with yaw. (b) Estimated yaw angle wrt true yaw angle. (c) Error in estimated yaw angle wrt true yaw angle.

Using the above error functions and the relationships between the models, we obtain the zero crossings of the estimation error, bounds on the estimated yaw angle, the yaw angle ranges for under/overestimation, estimation error bounds, and error maxima for all models wrt the CE model. The observations from these models are summarized in Table II. Table II shows that error maxima are present for the cylindrical and ellipsoidal models, whereas they are absent for the proposed CE-based models. Thus, in the ellipsoidal model, the modeling error from the cylindrical model is propagated, whereas in the proposed CE-based models, such error is absent.
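For reference, a short sketch (ours, not part of the original implementation) that tabulates the signed error predicted by (49)–(52) at a few true yaw angles, using the same α, β, γc, and γb values as above; the arcsine guard is only a floating-point safeguard.

```python
import math

ALPHA, BETA, GAMMA_C, GAMMA_B = 0.25, 1.25, 0.1269, 0.5774

def _asin(x):                       # guard against tiny floating-point overshoot
    return math.asin(max(-1.0, min(1.0, x)))

def err_cyl(t):                     # (49)
    return _asin(BETA * math.sin(t) / math.sqrt(1 + (BETA**2 - 1) * math.sin(t)**2)) - t

def err_ell(t):                     # (50)
    return _asin(BETA * (1 + ALPHA) * math.sin(t)
                 / ((ALPHA + BETA) * math.sqrt(1 + (BETA**2 - 1) * math.sin(t)**2))) - t

def err_center_ce(t):               # (51)
    den = (BETA + GAMMA_C) * math.sqrt(
        1 + (BETA**2 - 1) * math.sin(t)**2 * (1 - (BETA / (BETA + GAMMA_C))**2))
    return _asin(BETA * math.sin(t) / den) - t

def err_boundary_ce(t):             # (52)
    return _asin(math.sin(t) / math.sqrt(1 + GAMMA_B**2 * math.sin(t)**2)) - t

# Tabulate the signed estimation error (degrees) at a few true yaw angles.
for deg in range(0, 91, 15):
    t = math.radians(deg)
    print(deg, [round(math.degrees(f(t)), 2)
                for f in (err_cyl, err_ell, err_center_ce, err_boundary_ce)])
```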

VI. EXPERIMENTAL RESULTS

Here, in order to test the insights presented about the models, we have conducted two simulation experiments. In addition, we have evaluated the performance of the proposed models on four publicly available standard head pose data sets.

A. Experiment 1

In this simulation experiment, using the following equations (53)–(55), we obtained x_l, x_r, and x_c by varying θt (true yaw angle) from −90° to +90° in steps of 1°:

x_l = x_{cR} - r\sqrt{1 + (\beta^2 - 1)\sin^2\theta_t} + \alpha r\sin\theta_t    (53)

x_r = x_{cR} + r\sqrt{1 + (\beta^2 - 1)\sin^2\theta_t} + \alpha r\sin\theta_t    (54)

x_c = x_{cR} + (\alpha + \beta) r\sin\theta_t.    (55)

For each value of θt, the corresponding yaw estimates of the cylindrical, ellipsoidal, CE, Center CE, and Boundary CE models are computed. Here, x_{cR} is the x-coordinate of the center of rotation in an image. We considered the image dimension as 480 × 640; thus, x_{cR} is 320. In this experiment, we set the anthropometric parameters as α = 0.25 and β = 1.25. Half the frontal face width r (the semiminor axis of the ellipse) is set as 150 pixels.

Fig. 4(a) shows the variation of the radius of rotation with the yaw angle. It can be observed that the proposed CE model's radius of rotation is fixed throughout the span of yaw angles, whereas the radius of rotation of the other models varies with the yaw angle. In addition, the Boundary CE model's variation in the radius of rotation is larger than that of the Center CE model and smaller than that of the ellipsoidal model. Fig. 4(b) shows the estimated yaw angle wrt the true yaw angle. The yaw estimation bounds of the cylindrical, ellipsoidal, Center CE, and Boundary CE models (mentioned in Table II) can be verified to be correct from Fig. 4(b). In addition, for near-frontal faces (very small yaw angles), all the models perform well. Fig. 4(c) presents the estimation error plot of the different models. Three observations can be made from this plot. First, as the yaw angle increases, the estimation error of the cylindrical model is sinusoidal in nature. Second, for larger yaw angles, the cylindrical model outperforms the other models. Third, for smaller yaw angles, the error in the cylindrical model's yaw estimate is greater than that of the other models. In addition, the interpretations in Table II are verified to be correct with Fig. 4(c).

TABLE III: Error in Yaw Estimation (in Degrees)

Table III provides the root-mean-square error (RMSE) and maximum error of the models for different ranges of true yaw angle. From Table III, it is clear that the cylindrical model performs better than the other models for larger head turns. In addition, the proposed models and the ellipsoidal model outperform the cylindrical model for smaller head turns. For the entire yaw angle range, the RMSE of the Center CE model is less than that of the ellipsoidal and Boundary CE models.
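A compact re-creation of this simulation is sketched below under the stated settings (x_cR = 320, r = 150, α = 0.25, β = 1.25); this is our own sketch, and only the cylindrical and CE estimators are re-evaluated here for brevity. Exact agreement of the CE model is expected because the measurements are generated by the same framework.

```python
import math

ALPHA, BETA, X_CR, R = 0.25, 1.25, 320.0, 150.0

def measurements(theta_deg):
    """Synthetic face boundaries and center from (53)-(55)."""
    s = math.sin(math.radians(theta_deg))
    w = R * math.sqrt(1 + (BETA**2 - 1) * s**2)
    x_l = X_CR - w + ALPHA * R * s          # (53)
    x_r = X_CR + w + ALPHA * R * s          # (54)
    x_c = X_CR + (ALPHA + BETA) * R * s     # (55)
    return x_l, x_r, x_c

def yaw_cylindrical(x_l, x_r, x_c):         # (2)-(4)
    s = (x_c - 0.5 * (x_l + x_r)) / (0.5 * (x_r - x_l))
    return math.degrees(math.asin(max(-1.0, min(1.0, s))))

def yaw_ce(x_l, x_r, x_c):                  # (27) with r recovered from (32)
    d = 2.0 * x_c - (x_l + x_r)
    r = 0.5 * math.sqrt((x_r - x_l)**2 - (BETA**2 - 1) / BETA**2 * d**2)
    return math.degrees(math.asin(max(-1.0, min(1.0, 0.5 * d / (BETA * r)))))

angles = range(-90, 91)
for name, f in (("cylindrical", yaw_cylindrical), ("CE", yaw_ce)):
    errs = [f(*measurements(t)) - t for t in angles]
    rmse = math.sqrt(sum(e * e for e in errs) / len(errs))
    print(name, "RMSE over -90..90 deg:", round(rmse, 2))
```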

B. Experiment 2

In this simulation experiment, a 480 × 640 image is made with a vertical ellipse at the center of the image. The semiminor axis of the ellipse is 150 pixels (r), and the semimajor axis is 187.5 pixels (βr). Another 480 × 640 image is made with a vertical line at the center of the image. Both images are rotated around the anchor point (320, 202.5).


Fig. 5. Experiment 2 results. (a) Variation of radius of rotation with yaw. (b) Estimated yaw angle wrt true yaw angle. (c) Error in estimated yaw angle wrt true yaw angle.

TABLE IV: Error in Yaw Estimation (in Degrees)

The x-coordinate of the anchor point is the center of the image, whereas the y-coordinate is 37.5 pixels (αr) above the center of the image. Thus, the yaw angle of the head orientation is simulated as the image rotation angle. The face boundaries x_l and x_r are obtained from the bounding box information of the rotated ellipse. The face center x_c is obtained as the intersection (AND operation) point of the rotated line image and the rotated ellipse image. Thus, the ellipsoidal face model is simulated through image operations. Experiments are conducted by varying the image rotation angle θt (true yaw angle) from −90° to +90° in steps of 1°. For each value of θt, the corresponding yaw estimates of the cylindrical, ellipsoidal, CE, Center CE, and Boundary CE models are computed. In Experiment 2, for true yaw angles above ±60°, rounding and truncation errors are induced by drawing the ellipse in the image, which affects the estimation error of the models. Fig. 5 presents the results of Experiment 2. Observations similar to those in Fig. 4 can be made from Fig. 5. Table IV provides the RMSE and maximum error of the models for different ranges of true yaw angle. Observations similar to those in Table III can be made from Table IV.

C. Experiment 3

Here, we have conducted experiments with four standard head pose data sets, which are publicly available and widely used by the research community. The data sets are as follows: BU [28]: The Boston University (BU) head pose data set contains 45 head motion videos along with their ground truth 3-D head pose. In this data set, nine different head motions were performed by five subjects. The video frame

Fig. 6. Example images from (top) FacePix database, (middle) Pointing ’04 database, and (bottom) CAS-PEAL database.

resolution in this data set is 320 × 240. In our experiments, all the videos in this data set are used. CAS-PEAL [29]: This data set contains face images for 21 poses combining seven yaw angles (−45◦ , −30◦ , −15◦ , 0◦ , 15◦ , 30◦ , and 45◦ ) and three pitch angles (30◦ , 0◦ , and −30◦ ) for each of 940 subjects. The image resolution in this data set is 480 × 360. In our experiments, all the images in this data set are used. FacePix [30]: This data set contains 181 images for each of 30 subjects spanning −90◦ to 90◦ in yaw at 1◦ intervals. The image resolution in this data set is 128 × 128. A subset of this data set containing yaw spanning −60◦ to +60◦ is used in our experiments. Pointing ’04 [31]: The Pointing ’04 database contains 15 subjects, each of which has images at different poses, including 13 yaw poses and 7 pitch poses. The image resolution in this data set is 384 × 288. A subset of this data set containing yaw spanning −60◦ to +60◦ is used in our experiments. Sample images from three data sets are shown in Fig. 6. Here, the effectiveness of the proposed models is demonstrated by comparing them with the state-of-the-art head yaw estimation methods. A comparative study of the state-of-the-art face landmarking techniques is given in [32]. The face landmark localization method in [33] outperforms the other state-of-the-art landmark localization methods. In our experiments, for all the images,


TABLE V: Comparison of the MAE (in Degrees) on the BU Data Set

TABLE VI: Comparison of the RMSE (in Degrees) on the BU Data Set

TABLE VII: Comparison of the MAE (in Degrees) on the FacePix Data Set

the face landmark localization method [33] is applied to locate the face boundaries and the face center. For the cylindrical, ellipsoidal and proposed models, we report the results for yaw angles spanning −60◦ to +60◦ of the head turn. The proposed models’ yaw estimate is independent of the pitch angle induced due to head orientation. As the pitch angle is due to vertical head orientation, it will not affect the horizontal coordinates of the face center and face boundaries, which are used in the proposed model’s yaw computation. In the BU data set, each frame in the video is roll angle compensated (as the ground truth head roll angle is known) by image rotation. From the roll-compensated frame, the face center and boundary coordinates are extracted for yaw angle computation. The accuracy of the proposed models is analyzed using the mean absolute error (MAE). The MAE of the proposed models in comparison with the state-of-the-art methods published using the same BU data set is presented in Table V. It is shown that the proposed models provide comparable or better results wrt the compared methods. Four observations can be made from Table V. First, the proposed Center CE and Boundary CE models outperform the 3-D model-based approaches [18] and [35]. Second, the cylindrical and ellipsoidal model’s MAE values are greater than that of the Center CE and Boundary CE models. Third, both the CE model and the ellipsoidal model provide similar results. Fourth, the Center CE model outperforms the CE model and Boundary CE model. Furthermore, the accuracy of the proposed models is analyzed using the RMSE. The RMSE of the proposed models in comparison with the state-of-the-art methods published using the same BU data set is presented in Table VI. It is shown that the proposed models provide comparable or better results wrt the compared methods. Four observations can be made from Table VI. First, the proposed Center CE model outperforms the method in [21], which models the head as a 3-D elliptic cylinder and utilizes the eye location information for fine tuning the estimated head pose. Second, the Center CE model is more accurate than the cylindrical, ellipsoidal, CE, and Boundary CE models. Third, the Boundary CE outperforms the ellipsoidal and CE models. Fourth, the CE model’s RMSE lies between the RMSEs of the cylindrical and ellipsoidal models.
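The roll compensation used above for the BU sequences is a plain image rotation by the known roll angle. A minimal sketch follows (our own choice of OpenCV for illustration; the paper does not specify the implementation, and the sign convention of the rotation depends on how roll is defined).

```python
import cv2
import numpy as np

def compensate_roll(frame: np.ndarray, roll_deg: float) -> np.ndarray:
    """Rotate the frame by -roll so the head roll is (approximately) removed
    before the face center and boundary x-coordinates are measured."""
    h, w = frame.shape[:2]
    # Negative angle assumes positive roll means counterclockwise head tilt;
    # flip the sign if the ground truth uses the opposite convention.
    M = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), -roll_deg, 1.0)
    return cv2.warpAffine(frame, M, (w, h))

# Hypothetical usage: a stand-in frame and a ground-truth roll value.
frame = np.zeros((240, 320, 3), dtype=np.uint8)
upright = compensate_roll(frame, roll_deg=7.5)
```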

The effectiveness of the proposed models is compared against the state-of-the-art manifold analysis-based head yaw estimation methods using the FacePix data set. The MAE of the proposed models is compared in Table VII against the published results [9] of existing manifold analysis-based head yaw estimation methods, which use the same data set. Two observations can be made from Table VII. First, the proposed models outperform the manifold analysis-based methods, cylindrical model, and ellipsoidal model. Second, out of the three proposed models, the Center CE model outperforms the CE model and Boundary CE model. The effectiveness of the proposed models is compared against the state-of-the-art head yaw estimation methods using the Pointing ’04 data set. The MAE of the proposed models is compared in Table VIII against the published results (from [2] and [45]) of existing head yaw estimation methods which use the same data set. Three observations can be made from Table VIII. First, except for the kernel partial least squares method [45], the proposed models outperform the existing methods, cylindrical model, and ellipsoidal model. Second, out of the three proposed models, the Center CE model outperforms the CE model and Boundary CE model. Third, the accuracy of the CE model lies between that of the cylindrical and ellipsoidal models. The effectiveness of


TABLE VIII: Comparison of the MAE (in Degrees) on the Pointing '04 Data Set

TABLE IX: Comparison of the MAE (in Degrees) on the CAS-PEAL Data Set


TABLE X: Comparison of the Mean Percentage Increase in Radius of Rotation

the proposed models is compared against the existing methods using the CAS-PEAL data set. The MAE of the proposed models is compared in Table IX against the published results (from [7]) of existing head yaw estimation methods that use the same data set. Three observations can be made from Table IX, similar to those from Table VIII. First, the proposed Center CE and Boundary CE models outperform the existing methods; except for the ellipsoidal model, the CE model also outperforms all the other existing methods in terms of MAE. Second, out of the three proposed models, the Center CE model outperforms the CE model and the Boundary CE model. Third, the accuracy of the CE model lies between that of the cylindrical and ellipsoidal models.

For each of the models, the increase in the radius of rotation is analyzed over the yaw span for the FacePix and Pointing '04 data sets. The ground truth radius of rotation is obtained by measuring the frontal face width of each subject manually. For each subject and each yaw angle, the percentage increase in the radius of rotation against the ground truth is computed. The mean of the percentage increase in the radius of rotation over all the subjects and over the entire yaw span is computed and presented in Table X. From Table X, it is clear that the percentage increase in the radius of rotation is larger for the cylindrical and ellipsoidal models than for the proposed models.

From the experiments on the four real head pose data sets, we can draw conclusions about the yaw estimation accuracy of each model based on two rules.

1) Angle rule: The face boundaries and the face center should meet nearly at the yaw angles ±60°.

2) Radius rule: The variation in the radius of rotation should be minimal over the entire span of yaw angles.

The cylindrical and the CE models do not satisfy the angle rule, as the face boundaries and the face center of these models meet at ±90°. The cylindrical and the ellipsoidal models violate the radius rule to a larger extent than the proposed models. As the cylindrical model satisfies neither rule, it gives the poorest results among the models. The ellipsoidal model satisfies only the angle rule, whereas the CE model satisfies only the radius rule. From the experimental results, it can be noticed that the ellipsoidal model outperforms the CE model on all four data sets. Hence, satisfying the angle rule is more important than satisfying the radius rule. The Center CE, Boundary CE, and ellipsoidal models satisfy the angle rule; the performance of these three models depends on how well they meet the radius rule.

VII. CONCLUSION

In this paper, we have proposed three models for accurate head yaw estimation under the ellipsoidal framework. A detailed theoretical analysis of the ellipsoidal framework is carried out. Experimental results show that the RMSE and MAE of the proposed models are lower than those of the previous models. The evaluation using the standard head pose data sets has proven the accuracy of the proposed models, achieving an MAE between 4° and 8°. The accuracy of the proposed models renders them good candidates for the initialization and reinitialization stages of tracking-based head pose estimation methods. The proposed models can either be used independently for head yaw estimation or in collaboration with tracking-based methods. In the future, we will explore corrections to the CE model to more accurately mimic the human head turn.


APPENDIX A: COMPACT REPRESENTATION OF FACE BOUNDARIES

The face right boundary is given as (20), i.e.,

x_r = x_{cRe} + r\cos\theta\cos\left(\tan^{-1}(\beta\tan\theta)\right) + \beta r\sin\theta\sin\left(\tan^{-1}(\beta\tan\theta)\right) + \alpha r\sin\theta.    (56)

We know that

K\cos\left(\tan^{-1}(\beta\tan\theta) - \phi\right) = K\cos\left(\tan^{-1}(\beta\tan\theta)\right)\cos\phi + K\sin\left(\tan^{-1}(\beta\tan\theta)\right)\sin\phi.    (57)

On comparing the RHS in (57) with the second and third terms in the RHS of (56), i.e.,

K\cos\phi = r\cos\theta    (58)

K\sin\phi = \beta r\sin\theta.    (59)

On solving (58) and (59), we have

K = r\sqrt{1 + (\beta^2 - 1)\sin^2\theta}    (60)

\phi = \tan^{-1}(\beta\tan\theta).    (61)

On equating the second and third terms in the RHS of (56) to the LHS of (57), i.e.,

r\cos\theta\cos\left(\tan^{-1}(\beta\tan\theta)\right) + \beta r\sin\theta\sin\left(\tan^{-1}(\beta\tan\theta)\right) = K\cos\left(\tan^{-1}(\beta\tan\theta) - \phi\right).    (62)

On substituting for K and φ from (60) and (61), we have

r\cos\theta\cos\left(\tan^{-1}(\beta\tan\theta)\right) + \beta r\sin\theta\sin\left(\tan^{-1}(\beta\tan\theta)\right) = r\sqrt{1 + (\beta^2 - 1)\sin^2\theta}.    (63)

On substituting (63) in (56), we have

x_r = x_{cRe} + r\sqrt{1 + (\beta^2 - 1)\sin^2\theta} + \alpha r\sin\theta.    (64)

Similarly, the face left boundary (19) can be represented as

x_l = x_{cRe} - r\sqrt{1 + (\beta^2 - 1)\sin^2\theta} + \alpha r\sin\theta.    (65)

APPENDIX B: RELATIONSHIP BETWEEN CYLINDRICAL AND ELLIPSOIDAL FACE MODELS

The yaw estimate of the ellipsoidal model is given as (6), i.e.,

\sin\theta_e = \frac{x_c - x_{cRe}}{R}.    (66)

On substituting for x_{cRe} from (10), (3) and R from (7), we have

\sin\theta_e = \frac{x_c - x_{cRc} + \alpha r\sin\theta_c}{(\alpha + \beta) r}    (67)

\sin\theta_e = \frac{\frac{x_c - x_{cRc}}{r} + \alpha\sin\theta_c}{(\alpha + \beta)}.    (68)

Using (2), we have

\sin\theta_e = \frac{\sin\theta_c + \alpha\sin\theta_c}{(\alpha + \beta)}    (69)

\sin\theta_e = \frac{(1 + \alpha)}{(\alpha + \beta)}\sin\theta_c.    (70)

APPENDIX C: RELATIONSHIP BETWEEN CYLINDRICAL AND CE FACE MODELS

(In order to avoid confusion between r of the cylindrical model and that of the CE model, we denote the cylindrical model's r as r_c in this context.) The yaw estimate of the cylindrical model is given as (2), i.e.,

\sin\theta_c = \frac{x_c - x_{cRc}}{r_c}.    (71)

From (26), (3), and (33), the yaw estimate of the proposed CE model is given as

\sin\theta_{CE} = \frac{x_c - x_{cRc}}{\beta\frac{FW}{2}}.    (72)

From (72), substituting for x_c − x_{cRc} in (71), i.e.,

\sin\theta_c = \frac{\beta FW}{2 r_c}\sin\theta_{CE}.    (73)

On substituting for r_c from (4), we have

\sin\theta_c = \frac{\beta FW}{(x_r - x_l)}\sin\theta_{CE}.    (74)

On substituting for (x_r − x_l) from (28) and FW from (33), we have

\sin\theta_c = \frac{\beta}{\sqrt{1 + (\beta^2 - 1)\sin^2\theta_{CE}}}\sin\theta_{CE}.    (75)

APPENDIX D: RELATIONSHIP BETWEEN ELLIPSOIDAL AND CE FACE MODELS

On substituting for sin θc from (70) in (75), we have

\sin\theta_e = \frac{\beta(1 + \alpha)}{(\alpha + \beta)\sqrt{1 + (\beta^2 - 1)\sin^2\theta_{CE}}}\sin\theta_{CE}.    (76)

APPENDIX E: RELATIONSHIP BETWEEN CENTER CE AND CE FACE MODELS

(In order to avoid confusion between r of the Center CE model and that of the CE model, we denote the Center CE model's r as r_{Center\_CE} in this context.) From (27) and (35), the yaw estimate of the Center CE model is given as

\sin\theta_{Center\_CE} = \frac{\beta r}{(\beta + \gamma_c)\, r_{Center\_CE}}\sin\theta_{CE}.    (77)

On substituting for r_{Center\_CE} from (36) and substituting for x_l (21), x_r (22), and x_c (24) in the r_{Center\_CE} expression

\sin\theta_{Center\_CE} = \frac{\beta}{(\beta + \gamma_c)\sqrt{1 + (\beta^2 - 1)\sin^2\theta_{CE}\left[1 - \left(\frac{\beta}{\beta + \gamma_c}\right)^2\right]}}\sin\theta_{CE}.    (78)

APPENDIX F: RELATIONSHIP BETWEEN BOUNDARY CE AND CE FACE MODELS

(In order to avoid confusion between r of the Boundary CE model and that of the CE model, we denote the Boundary CE model's r as r_{Boundary\_CE} in this context.) From (27) and (39), the yaw estimate of the Boundary CE model is given as

\sin\theta_{Boundary\_CE} = \frac{r}{r_{Boundary\_CE}}\sin\theta_{CE}.    (79)

On substituting for r_{Boundary\_CE} from (40) and substituting for x_l (21), x_r (22), and x_c (24) in the r_{Boundary\_CE} expression

\sin\theta_{Boundary\_CE} = \frac{1}{\sqrt{1 + \gamma_b^2\sin^2\theta_{CE}}}\sin\theta_{CE}.    (80)
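As a quick numerical check of these derivations (our own sanity test, not part of the paper), one can simulate boundaries with the ellipsoidal framework and confirm, for example, (70) and (75):

```python
import math

ALPHA, BETA, X_CR, R = 0.25, 1.25, 320.0, 150.0

def simulate(theta_deg):
    """Face boundaries/center under the ellipsoidal framework ((21), (22), (24))."""
    s = math.sin(math.radians(theta_deg))
    w = R * math.sqrt(1 + (BETA**2 - 1) * s**2)
    return X_CR - w + ALPHA * R * s, X_CR + w + ALPHA * R * s, X_CR + (ALPHA + BETA) * R * s

for theta in (10.0, 25.0, 40.0, 55.0):
    x_l, x_r, x_c = simulate(theta)
    # Cylindrical and ellipsoidal estimates, (2)-(4) and (6)-(10).
    r_c = 0.5 * (x_r - x_l)
    sin_c = (x_c - 0.5 * (x_l + x_r)) / r_c
    sin_e = (x_c - (0.5 * (x_l + x_r) - ALPHA * r_c * sin_c)) / ((ALPHA + BETA) * r_c)
    # CE estimate, (26), with the true half face width r = FW/2.
    sin_ce = (x_c - 0.5 * (x_l + x_r)) / (BETA * R)
    # Check (70): sin(theta_e) = (1 + alpha) / (alpha + beta) * sin(theta_c).
    assert abs(sin_e - (1 + ALPHA) / (ALPHA + BETA) * sin_c) < 1e-9
    # Check (75): sin(theta_c) = beta * sin(theta_CE) / sqrt(1 + (beta^2 - 1) * sin^2(theta_CE)).
    assert abs(sin_c - BETA * sin_ce / math.sqrt(1 + (BETA**2 - 1) * sin_ce**2)) < 1e-9
print("relationships (70) and (75) verified numerically")
```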

R EFERENCES [1] Y. Dong, Z. Hu, K. Uchimura, and N. Murayama, “Driver inattention monitoring system for intelligent vehicles: A review,” IEEE Trans. Intell. Transp. Syst., vol. 12, no. 2, pp. 596–614, Jun. 2011. [2] E. M. Chutorian and M. M. Trivedi, “Head pose estimation in computer vision: A survey,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 4, pp. 607–626, Apr. 2009. [3] B. Ma, S. Shan, X. Chen, and W. Gao, “Head yaw estimation from asymmetry of facial appearance,” IEEE Trans. Syst. Man Cybern. B, Cybern., vol. 38, no. 6, pp. 1501–1512, Dec. 2008. [4] V. Pathangay, S. Das, and T. Greiner, “Symmetry-based face pose estimation from a single uncalibrated view,” in Proc. IEEE Int. Conf. Autom. Face Gesture Recog., Sep. 2008, pp. 1–8. [5] B. Ma, X. Chai, and T. Wang, “A novel feature descriptor based on biologically inspired feature for head pose estimation,” Neurocomputing, vol. 115, pp. 1–10, Sep. 2013. [6] B. Ma, X. Yang, and S. Shan, “Head pose estimation via background removal,” Int. Sci. Intell. Data Eng., vol. 7751, pp. 140–147, 2013. [7] W. Hu, B. Ma, and X. Chai, “Head pose estimation using simple local Gabor binary pattern,” Biometric Recog., vol. 7098, pp. 74–81, 2011. [8] B. Ma, A. Li, X. Chai, and S. Shan, “Head yaw estimation via symmetry of regions,” in Proc. IEEE Int. Conf. Autom. Face Gesture Recog., 2013, pp. 1–6. [9] J. Lu and Y. P. Tan, “Ordinary preserving manifold analysis for human age and head pose estimation,” IEEE Trans. Human-Mach. Syst., vol. 43, no. 2, pp. 249–258, Mar. 2013. [10] S. Yan, H. Wang, Y. Fu, J. Yan, X. Tang, and T. S. Huang, “Synchronized submanifold embedding for person-independent pose estimation and beyond,” IEEE Trans. Image Process., vol. 18, no. 1, pp. 202–210, Jan. 2009. [11] Y. Fu, Z. Li, J. Yuan, Y. Wu, and T. S. Huang, “Locality versus globality: Query-driven localized linear models for facial image computing,” IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 12, pp. 1741–1752, Dec. 2008. [12] A. Ranganathan, M. H. Yang, and J. Ho, “Online sparse Gaussian process regression and its applications,” IEEE Trans. Image Process, vol. 20, no. 2, pp. 391–404, Feb. 2011. [13] B. Ma and T. Wang, “Head pose estimation using sparse representation,” in Proc. IEEE ICCEA, 2010, pp. 389–392. [14] W. Chen, C. Shan, and G. D. Haan, “Optimal regularization parameter estimation for spectral regression discriminant analysis,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 12, pp. 1921–1926, Dec. 2009. [15] J. Tu, Y. Fu, and T. S. Huang, “Locating nose-tips and estimating head poses in images by tensorposes,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 1, pp. 90–102, Jan. 2009. [16] M. Krinidis, N. Nikolaidis, and I. Pitas, “3-D Head pose estimation in monocular video sequences using deformable surfaces and radial basis functions,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 2, pp. 261–272, Feb. 2009. [17] F. Dornaika and B. Raducanu, “Three-dimensional face pose detection and tracking using monocular videos: Tool and application,” IEEE Trans. Syst. Man Cybern. B, Cybern., vol. 39, no. 4, pp. 935–944, Aug. 2009.


[18] R. O. Mbouna, S. G. Kong, and M. G. Chun, “Visual analysis of eye state and head pose for driver alertness monitoring,” IEEE Trans. Intell. Transp. Syst., vol. 14, no. 3, pp. 1462–1469, Sep. 2013. [19] Z. Yucel, A. A. Salah, C. Mericli, T. Mericli, R. Valenti, and T. Gevers, “Joint attention by gaze interpolation and saliency,” IEEE Trans. Cybern., vol. 43, no. 3, pp. 829–842, Jun. 2013. [20] M. J. Reale, P. Liu, L. Yin, and S. Canavan, “Art critic: Multisignal vision and speech interaction system in a gaming context,” IEEE Trans. Cybern., vol. 43, no. 6, pp. 1546–1559, Dec. 2013. [21] R. Valenti, N. Sebe, and T. Gevers, “Combining head pose and eye location information for gaze estimation,” IEEE Trans. Image Process., vol. 21, no. 2, pp. 802–815, Feb. 2012. [22] P. Jimenez, L. M. Bergasa, J. Nuevo, N. Hernandez, and I. G. Daza, “Gaze fixation system for the evaluation of driver distractions induced by IVIS,” IEEE Trans. Intell. Transp. Syst., vol. 13, no. 3, pp. 1167–1178, Sep. 2012. [23] P. Jiménez, J. Nuevo, L. Bergasa, and M. Sotelo, “Face tracking and pose estimation with automatic three-dimensional model construction,” IET Comput. Vis., vol. 3, no. 2, pp. 93–102, Jun. 2009. [24] S. U. Jung and M. S. Nixon, “On using gait to enhance frontal face extraction,” IEEE Trans. Inf. Forensics Security, vol. 7, no. 6, pp. 1802– 1811, Dec. 2012. [25] E. M. Chutorian and M. M. Trivedi, “Head pose estimation and augmented reality tracking: An integrated system and evaluation for monitoring driver awareness,” IEEE Trans. Intell. Transp. Syst., vol. 11, no. 2, pp. 300–311, Jun. 2010. [26] K. Ohue, Y. Yamada, S. Uozumi, S. Tokoro, A. Hattori, and T. Hayashi, “Development of a new pre-crash safety system,” presented at the Society Automotive Engineers World Congress, (SAE) Technical Paper Series, Warrendale, PA, USA, 2006, Paper 2006-01-1461. [27] S. J. Lee, J. Jo, H. G. Jung, K. R. Park, and J. Kim, “Real-time gaze estimator based on driver’s head orientation for forward collision warning system,” IEEE Trans. Intell. Transp. Syst., vol. 12, no. 1, pp. 254–267, Mar. 2011. [28] M. La Cascia, S. Sclaroff, and V. Athitsos, “Fast, reliable head tracking under varying illumination: An approach based on registration of texturemapped 3D models,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 4, pp. 322–336, Apr. 2000. [29] W. Gao, B. Cao, S. Shan, X. Chen, D. Zhou, X. Zhang, and D. Zhao, “The CAS-PEAL large-scale Chinese face database and baseline evaluations,” IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 38, no. 1, pp. 149– 161, Jan. 2008. [30] V. N. Balasubramanian, S. Krishna, and S. Panchanathan, “Personindependent head pose estimation using biased manifold embedding,” Eur. Assoc. Signal Speech Image Process. J. Adv. Signal Process., vol. 8, no. 1, pp. 1–15, 2008. [31] N. Gourier, D. Hall, and J. Crowley, “Estimating face orientation from robust detection of salient facial structures,” in Proc. ICPR Workshop Vis. Observ. Deictic Gestures, 2004, pp. 17–25. [32] O. Celiktutan, S. Ulukaya, and B. Sankur, “A comparative study of face landmarking techniques,” EURASIP J. Image Video Process., vol. 2013, pp. 1–27, Mar. 2013. [33] X. Zhu and D. Ramanan, “Face detection, pose estimation, and landmark localization in the wild,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recog., 2012, pp. 2879–2886. [34] H. Wang, F. Davoine, V. Lepetit, C. Chaillou, and C. Pan, “3-D head tracking via invariant keypoint learning,” IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 8, pp. 
1113–1126, Aug. 2012. [35] L. P. Morency, J. Whitehill, and J. Movellan, “Monocular head pose estimation using generalized adaptive view-based appearance model,” Image Vis. Comput., vol. 28, no. 5, pp. 754–761, May 2010. [36] S. Choi and D. Kim, “Robust head tracking using 3D ellipsoidal head model in particle filter,” Pattern Recog., vol. 41, no. 9, pp. 2901–2915, Sep. 2008. [37] K. H. An and M. Chung, “3D head tracking and pose-robust 2D texture map-based face recognition using a simple ellipsoid model,” in Proc. Intell. Robots Syst., Sep. 2008, pp. 307–312. [38] J. Sung, T. Kanade, and D. Kim, “Pose robust face tracking by combining active appearance models and cylinder head models,” Int. J. Comput. Vis., vol. 80, no. 2, pp. 260–274, Nov. 2008. [39] S. Yan, H. Wang, X. Tang, J. Liu, and T. S. Huang, “Regression from uncertain labels and its applications to soft biometrics,” IEEE Trans. Inf. Forensics Security, vol. 3, no. 4, pp. 698–708, Dec. 2008. [40] Y. Fu and T. S. Huang, “Human age estimation with regression on discriminative aging manifold,” IEEE Trans. Multimedia, vol. 10, no. 4, pp. 578– 584, Jun. 2008.


[41] X. He, S. Yan, Y. Hu, P. Niyogi, and H. J. Zhang, “Face recognition using Laplacianfaces,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 3, pp. 328–340, Mar. 2005. [42] S. Roweis and L. Saul, “Nonlinear dimensionality reduction by locally linear embedding,” Science, vol. 290, no. 5500, pp. 2323–2226, Dec. 2000. [43] P. Belhumeur, J. Hespanha, and D. J. Kriegman, “Eigenfaces versus fisherfaces: Recognition using class specific linear projection,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 7, pp. 711–720, Jul. 1997. [44] M. Turk and A. Pentland, “Eigenfaces for recognition,” J. Cognitive Neurosci., vol. 3, no. 1, pp. 71–86, 1991. [45] M. A. Haj, J. Gonzalez, and L. S. Davis, “On partial least squares in head pose estimation: How to simultaneously deal with misalignment,” in Proc. IEEE Int. Conf. Computer Vis. Pattern Recog., Jun. 2012, pp. 2602–2609. [46] Z. Li, Y. Fu, J. Yuan, T. S. Huang, and Y. Wu, “Query driven localized linear discriminant models for head pose estimation,” in Proc. IEEE Int. Conf. Multimedia Expo., Jul. 2007, pp. 1810–1813. [47] J. Tu, Y. Fu, Y. Hu, and T. S. Huang, “Evaluation of head pose estimation for studio data,” in Proc. 1st Int. Workshop Classification Events, Activities Relationships Multimodal Technol. Perception Humans, 2007, pp. 281–290. [48] M. Voit, K. Nickel, and R. Stiefelhagen, “Neural network based head pose estimation and multi-view fusion,” in Proc. 1st Int. Workshop Classification Events, Activities Relationships Multimodal Technol. Perception Humans, 2007, pp. 291–298. [49] N. Gourier, J. Maisonnasse, D. Hall, and J. Crowley, “Head pose estimation on low resolution images,” in Proc. 1st Int. Workshop Classification Events, Activities Relationships Multimodal Technol. Perception Humans, 2007, pp. 270–280. [50] P. Watta, S. Lakshmanan, and Y. Hou, “Nonparametric approaches for estimating driver pose,” IEEE Trans. Veh. Technol., vol. 56, no. 4, pp. 2028– 2041, Jul. 2007. [51] A. Nikolaidis and I. Pitas, “Facial feature extraction and pose determination,” Pattern Recog., vol. 33, no. 11, pp. 1783–1791, Nov. 2000. [52] X. Fu, X. Guan, E. Peli, H. Liu, and G. Luo, “Automatic calibration method for driver’s head orientation in natural driving environment,” IEEE Trans. Intell. Transp. Syst., vol. 14, no. 1, pp. 303–312, Mar. 2013. [53] J. P. Batista, “A real-time driver visual attention monitoring system,” Pattern Recog. Image Anal. Lecture Notes Comput. Sci., vol. 3522, pp. 200– 208, 2005. [54] Q. Ji and R. Hu, “3D face pose estimation and tracking from a monocular camera,” Image Vis. Comput., vol. 20, no. 7, pp. 499–511, May 2002. [55] W. Guo, I. Kotsia, and I. Patras, “Tensor learning for regression,” IEEE Trans. Image Process., vol. 21, no. 2, pp. 816–827, Feb. 2012.

Athi Narayanan received the B.E. degree in electronics and communication from Anna University, Chennai, India, in 2005 and M.E. degree in embedded system technologies from Anna University of Technology, Coimbatore, India, in 2010. He is currently with the A-VIEW project in Amrita E-Learning Research Lab, Department of Computer Science, Amrita Vishwa Vidyapeetham, where he also works as a Senior Research Associate and leads the Image Recognition Team of the A-VIEW project in Amrita E-Learning Research Lab. In Jasmin Infotech, Chennai, as a digital signal processing (DSP) Engineer, he has developed embedded DSP multimedia codecs. In Nihon Technology, Chennai, as a Project Leader, he has led and developed Association of Radio Industries and Businesses standard color quantization algorithms for digital broadcasting in Japan. In Manatec Electronics, Puducherry, India, as an R&D team leader, he has developed real-time computer vision and pattern recognition algorithms for the India’s first indigenous 3-D wheel alignment system. He has authored several papers in peer reviewed international journals. His research interests include biblical theology, image processing, computer vision, and its applications. Mr. Narayanan holds the 50th rank in the MATLAB central exchange. He was a recipient of the Best Paper Award from the International Journal of Systemics, Cybernetics and Informatics for his work on color quantization. His biography has been selected and published in the 28th Edition of Marquis Who’s Who in the World for his research work in digital broadcasting. He was a recipient of the Best Outgoing Student Award and the Research and Development Award during the B.E. course. He was selected for the Indian National Mathematics Olympiad 2000.

Ramachandra Mathava Kaimal received the Ph.D. degree from the Mehta Research Institute (now Harish Chandra Institute), University of Allahabad, Uttar Pradesh, India, in 1978. During 1977–1979, he was a Postdoctoral Fellow with the Indian Institute of Science, Bangalore, India, and during 1981–1984, he was a Visiting Fellow with the National Institutes of Health, Bethesda, MD, USA. In 1982, he visited International Conference on Pure and Applied Mathematics, Nice, France, on a UNESCO Fellowship and in August 2008, he also visited Council for Scientific and Industrial Research, Pretoria, South Africa. During 1979–1987, he was a Faculty with Cochin University of Science and Technology, Kochi, India. From 1987 to 2009, he was the Professor and Head with the Department of Computer Science in University of Kerala, Thiruvananthapuram, India. Since 2009, he has been a Professor and Chairman with the Department of Computer Science, Amrita Vishwa Vidyapeetham, Kollam, India. He has guided more than 60 dissertations at the M.Tech level, and guided 15 doctoral dissertations in computer science and is presently guiding five scholars. His present area of interest includes computer vision, computational intelligence, algorithms, machine learning, and high-dimensional data analysis. Dr. Kaimal has edited one book, published one book chapter (CRC Press), and over 70 publications in international and national journals, including Computer Journal (U.K.), IEEE Transactions on Fuzzy Systems, Journal of Mathematical Imaging, Defense Science Journal, Sadhana, etc. Major research projects undertaken includes "Development of Methodologies based on NeuroFuzzy and Genetic Algorithms for Modeling and Control of Systems (ISRO funded 1998-2003)". He is a member of Computer Society of India.

Kamal Bijlani received the B. E. degree in electronics from Birla Institute of Technology and Science, Pilani, India, in 1982 and the M.S. degree in computer science from Michigan Technological University, Houghton, MI, USA, in 1984. He is currently an Associate Professor and the Director with Amrita E-Learning Research Lab, Amrita Vishwa Vidyapeetham, Kollam, India. He is the Principal Investigator and Chief Architect with the A-VIEW project. The A-VIEW project is funded by the Ministry of Education, India (Ministry of Human Resource Development). On the A-VIEW project, he has been leading a multidimensional team of around 150 researchers, engineers, and technicians. A-VIEW won the Jury Award for the Best Innovation in Open and Distance Learning under Higher Education Category in World Education Summit, 2011. Computer World Magazine (USA) recognized Amrita E-Learning Research Lab with Computer World Honors Laureate 2012, under the Training and Education category. In addition, the laboratory wins the Educational Excellence Award at the Indo-Global Educational Summit and Expo. Prior to this, he was the Chief Executive Officer of a multimedia and gaming startup company in the USA called Into The Mystery, Inc. He has led consulting projects with Genysm, an Artificial Intelligence company (from Massachusetts Institute of Technology, Cambridge, MA, USA). He was Project Manager of modelbased software project with Cimflex Teknowledge (from Stanford University, Stanford, CA, USA). He was a Research Scientist with Honeywell Research Center, Minneapolis, MN, USA. As an advanced practitioner in complex software, multimedia, and video, he has hands-on experience in the architecture and development of several systems; He has published several papers in national and international journals. His research interests include e-learning, collaborative multimedia, video codecs, and their applications.