Recent Advances in Data Mining Techniques and Their Applications in Hyperspectral Image Processing for the Food Industry
Qiong Dai, Da-Wen Sun, Zhenjie Xiong, Jun-Hu Cheng, and Xin-An Zeng
Abstract: Hyperspectral imaging (HSI) facilitates better characterization of intrinsic and extrinsic properties of foods by integrating traditional spectral and image techniques, in which careful and sophisticated data processing plays an important role. In the past decade, much progress has been made in applying various algorithms to deal with hyperspectral images. This review first introduces the general procedure of hyperspectral data analysis and then illustrates the most commonly used algorithms for denoising, feature selection, model establishment, and evaluation, as well as their applications for assessing food quality, safety, and authenticity. Finally, brief summaries of regression and classification methods are presented. This article will provide a guideline for data mining in the future development of HSI in the food field.
Keywords: data mining, denoising, evaluation, feature selection, food, hyperspectral imaging, model establishment
MS 20140278 Submitted 2/20/2014, Accepted 4/18/2014. Authors Dai, Xiong, Cheng, and Zeng are with College of Light Industry and Food Sciences, South China Univ. of Technology, Guangzhou 510641, China. Author Sun is with College of Light Industry and Food Sciences, South China Univ. of Technology, Guangzhou 510641, China; and Food Refrigeration and Computerized Food Technology, Agriculture and Food Science Centre, Univ. College Dublin, Natl. Univ. of Ireland, Belfield, Dublin 4, Ireland. Direct inquiries to author Sun (E-mail: [email protected]).

© 2014 Institute of Food Technologists®
doi: 10.1111/1541-4337.12088

Introduction

Quality and safety are the most important issues in the food industry. The industry not only needs processing techniques such as cooling (Wang and Sun 2001), freezing (Delgado and others 2009), drying (Sun 1999; Sun and Woods 1993, 1994, 1997; Sun and Byrne 1998; Delgado and Sun 2002), and edible coating (Xu and others 2001) to enhance product quality and safety, but also evaluation and detection techniques for quality and safety assurance. Hyperspectral imaging (HSI), originally developed for remote sensing applications, has recently enjoyed rapid development and wide usage for nondestructive quality and safety detection in the food industry, especially for the comprehensive detection of both chemical and physical properties of a considerable variety of foodstuffs. This is because HSI integrates the advantages of conventional digital imaging or computer vision (Du and Sun 2005; Valous and others 2009) and spectroscopy, whose unique characteristics make it possible to obtain both spatial and spectral information from a sample simultaneously (Gowen and others 2009). Recent applications of HSI have been found in the speedy evaluation of physical properties (Cluff and others 2008; ElMasry and others 2012), chemical compositions (ElMasry and others 2007; Rajkumar and others 2012), microbiological attributes (Siripatrawan and others 2011; Barbin and others 2013a), and contaminations (Park and others 2006; Kang and others 2011) of food products, in the identification of authenticity issues (Wu and others 2013), and in some other applications (Karimi and others 2012; Kokawa and others 2013; Liu and Ngadi 2013; Lorente and others 2013b; Cen and others 2014; Shahin and others 2014; Wei and others 2014) in the food industry.

Generally, hyperspectral images consist of a substantial amount of information in 2 spatial dimensions and 1 spectral dimension of the object studied, resulting in a heavy computational load for analyzing the high-dimensional data. Therefore, data mining is a necessary task for extracting useful information for predicting the relevant quality attributes. Figure 1 summarizes the main procedure of data mining for hyperspectral images in the food industry. The spectrum of each pixel in a hyperspectral image can be understood as a signature, which is the fundamental basis for grading or assessing the chemical and physical properties of the material it represents. However, it should be noted that the spectrum of each pixel is easily influenced by its neighboring pixels as a result of various undesired effects such as inhomogeneous samples, environmental changes, and instrumental variations. Therefore, several denoising methods are available to remove or eliminate any irrelevant information that cannot be minimized or dealt with by regression or classification approaches (Othman 2006; Renard 2008). Furthermore, the large volume of data complicates the establishment of calibration sets for prediction purposes. This is also referred to as the curse of dimensionality, resulting in increases in computational load, instrument expense, and complexity of online application, as well as decreases in the accuracy and robustness of predictions (Burger and Gowen 2011). Thus, it is of significant value to conduct dimension reduction or feature selection for further analysis and application of HSI datasets. It should
Vol. 13, 2014 · Comprehensive Reviews in Food Science and Food Safety 891
Recent advances in data mining techniques . . .
Figure 1–The main procedure of data mining for hyperspectral images in the food industry (flowchart: sample; image acquisition and correction; reference analysis; denoising; feature selection; regression or classification; evaluation).
be noted that the key wavelengths determined by several feature selection methodologies, such as regression coefficient (RC) and successive projections algorithm (SPA), have been shown to yield fairly similar, or sometimes enhanced, prediction and classification precision compared with models based on the full wavelength range (Liu and others 2013a). Generally speaking, classification or quantification of certain quality indexes of foods is the final and main goal of HSI analysis; thus, an abundance of chemometric methods has been developed for this purpose. In addition, model evaluation is an important aspect of examining the performance of established models. Since the predicted values from a good model should be close to the reference measurements, several statistical parameters that compute the differences between predicted and reference values are used as criteria to evaluate a model's ability to predict or classify with acceptable errors. As for quantification, Figure 2 shows an example of using hyperspectral imaging to determine mechanical properties in prawns (Dai and others 2014b). Despite the importance of data mining in using HSI technology for accurate and reliable food quality and safety evaluation and control, and although many relevant data mining techniques and methods have been employed for analyzing hyperspectral images, there is as yet no review detailing data mining techniques and their applications in the food industry. Therefore, in this review article, the most commonly used algorithms for data mining, as well as their applications for assessing food quality, safety, and authenticity, are illustrated; the advantages and disadvantages of these data mining techniques are also presented. It is hoped that this review will provide a guideline for data mining in the future development of HSI in the food field.
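The statistical evaluation criteria mentioned above can be made concrete. The following minimal numpy sketch (the function name and toy numbers are illustrative, not taken from any cited study) computes the coefficient of determination (R2), root-mean-square error (RMSE), and ratio of prediction to deviation (RPD) used throughout this review:

```python
import numpy as np

def evaluate_predictions(y_ref, y_pred):
    """Common HSI model evaluation statistics: R2, RMSE, and RPD."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_ref - y_pred
    rmse = np.sqrt(np.mean(residuals ** 2))      # root-mean-square error
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_ref - y_ref.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot                   # coefficient of determination
    rpd = y_ref.std(ddof=1) / rmse               # ratio of prediction to deviation
    return r2, rmse, rpd

# Example: reference vs. predicted moisture contents (arbitrary toy numbers)
r2, rmse, rpd = evaluate_predictions([70.1, 72.4, 68.9, 75.0],
                                     [70.5, 71.9, 69.4, 74.2])
```

A model with predictions close to the reference values yields R2 near 1, a small RMSE in the units of the measured attribute, and a large RPD.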
Pretreatment Methodologies and Applications

Denoising methods

Once the spectral data are extracted from hyperspectral images, they are often treated with denoising methods to reduce or eliminate undesired effects, including systematic noise and random noise, which result from physical, environmental, or instrumental changes during data collection (Chen 2011). The denoising treatment can significantly improve the accuracy and robustness of calibration models in some cases. Several mathematical signal treatment methods such as derivative methods, standard normal variate (SNV), multiplicative scatter correction (MSC), minimum noise fraction (MNF), and Savitzky–Golay smoothing are commonly performed prior to multivariate or classification modeling. Since a universal criterion is unavailable for determining which denoising technique should be used, trial and error remains the common choice. In several early attempts (Xing and others 2005; Park and others 2006; Qiao and others 2007), data smoothing by boxcar-averaging over a certain bandwidth was adopted to remove random noise; however, some useful information could be lost along with the smoothed noise. Recently, Barbin and others (2012a) applied MSC, SNV, 1st derivative, and 2nd derivative to reduce the impact of noise in pork spectra for predicting color parameters, pH, and drip loss. In that study, these approaches offered only a slight improvement in the robustness of prediction models compared with the original spectra. In addition, Kamruzzaman and others (2013) reported that none of the spectral preprocessing methods (SNV, MSC, 1st derivative, and 2nd derivative) improved final predictive ability for the detection of lamb tenderness compared with raw spectra. Similar results were also reported by ElMasry and others (2007) and Kamruzzaman and others (2012a). Recently, some trials have shown promising results using denoising methods. For instance, Zhu and others (2012) investigated the potential of using visible and near-infrared HSI to differentiate fresh fillets from frozen-thawed ones, in which the noise was effectively removed using the MNF method. In another study, when SNV, MSC, detrend, derivatives, and their combinations were adopted for spectral pretreatment, Feng and Sun (2013a) obtained a better prediction capacity of models compared with the raw spectra. Furthermore, the SNV technique was found to play an important role in the satisfactory prediction of the content of N and P in oilseed rape leaves (Zhang and others 2013), which was
Figure 2–An example of using hyperspectral imaging to determine mechanical properties in prawns (Dai and others 2014b). (Flowchart steps: Step 1, image acquisition (380 to 1030 nm); Step 2, subsampling (second and fourth abdominal segments; position and shape information of cubed samples); Step 3, reference analysis of TPA values; Step 4, image correction with black and white reference images to obtain the corrected hyperspectral image (Rc); Step 5, identification of ROI and extraction of mean spectral data; Step 6, multivariate data analysis, including spectral variable selection, dimension reduction, and multivariate calibration against TPA values; Step 7, image post-processing at the selected effective wavelengths to yield the final optimal quantitative model.)
consistent with previous research by Sone and others (2012), who correctly classified fresh Atlantic salmon fillets according to different storage atmospheres when SNV-pretreated spectra were applied. Derivative methods combined with Savitzky–Golay smoothing were frequently used in some of the latest studies (Barbin and others 2012b; Feng and others 2013), often leading to satisfactory results for removing baseline shifts. As for image features, 2D Gabor filters were applied to correct and extract useful image texture features from the original hypercube in pork samples (Liu and others 2010). Further detailed descriptions of denoising methods can be found elsewhere (Martens and Næs 1998; Næs and others 2004).
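To make the scatter-correction step concrete, the following numpy sketch applies SNV and MSC to a small matrix of simulated spectra (rows are samples, columns are wavelengths). It is an illustrative implementation of the standard definitions, not code from any cited study; all names and the toy spectra are invented:

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum individually."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference (mean) spectrum.

    Each spectrum x is regressed as x = a + b * reference, then corrected
    to (x - a) / b, removing additive and multiplicative scatter effects.
    """
    spectra = np.asarray(spectra, dtype=float)
    if reference is None:
        reference = spectra.mean(axis=0)
    corrected = np.empty_like(spectra)
    for i, x in enumerate(spectra):
        b, a = np.polyfit(reference, x, deg=1)   # slope, intercept
        corrected[i] = (x - a) / b
    return corrected

# Toy spectra: one "true" signal distorted by a different offset and gain per sample
wavelengths = np.linspace(400, 1000, 50)
signal = np.exp(-((wavelengths - 700) / 80) ** 2)
raw = np.stack([1.5 * signal + 0.3, 0.8 * signal - 0.1, 1.1 * signal + 0.05])
snv_corrected = snv(raw)
msc_corrected = msc(raw)
```

Because the toy distortions are purely additive and multiplicative, both corrections collapse the three raw spectra onto essentially identical curves, which is exactly the scatter these methods are designed to remove.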
Feature selection methods

As the huge size of HSI data increases the computational burden and instrument expense, as well as decreasing prediction accuracy, it is desirable to conduct feature selection to identify several informative wavelengths for real-time multispectral imaging implementation (Lorente and others 2013a). Furthermore, extensive research (Qiao and others 2007; Wu and others 2012a; Feng and Sun 2013b) has demonstrated that the identified feature wavelengths perform equally well as, or more efficiently than, the full spectrum. A variety of algorithms are available for feature selection purposes, including RCs, stepwise regression (SR), artificial neural networks (ANNs), SPA, genetic algorithms (GAs), and competitive adaptive reweighted sampling (CARS). In particular, RC is one of the most commonly used methods for identifying key wavelengths that carry most of the information for food quality assessment and safety detection (Qiao and others 2007; Talens and others 2013). Recently, special attention has been paid to the usage of GA and CARS to determine feature wavelengths for prediction (Feng and Sun 2013a; Wu and Sun 2013a). However, it is rather difficult to say which feature selection algorithm is best for extracting the most important and informative wavelengths for prediction purposes. For example, Rajkumar and others (2012) applied the HSI technique to predict banana firmness with an R2P of 0.91 using 8 wavelengths (440, 525, 633, 672, 709, 760, 925, and 984 nm) selected via RCs from the PLSR model; however, using the same feature wavelengths, an inferior predictive ability (R2P of 0.85 and 0.87) was found for total soluble solids and moisture, respectively. Therefore, the choice of a suitable algorithm is closely related to the size of the dataset, the nature of the problem, and the accuracy requirement. Moreover, different combinations of important wavelengths have been obtained for predicting the same parameter with different feature selection methods. Take the water content of beef, for example: ElMasry and others (2013) found good prediction (R2P of 0.89) for water contents based on 8 wavelengths (934, 1048, 1108, 1155, 1185, 1212, 1265,
and 1379 nm) selected from weighted PLS RCs, whereas 4 feature wavelengths (1187, 1224, 1352, and 1615 nm) out of 225 wavelengths selected by SPA were used to establish a multiple linear regression (MLR) model with R2CV of 0.97 (Wu and others 2012c). This might be due to the fact that the fundamental principles of feature selection by RC and SPA are quite different. The reader can refer to Liu and others (2013b) and Dai and others (2014a) for further information.

Methodologies for Model Establishment and Applications

The final goal of developing a hyperspectral image system is to estimate or discriminate the characteristics of new samples accurately based on established quantitative or qualitative models. Quantitative assessment, that is, multivariate regression, aims to build a correlation between a desired physical, chemical, or biological attribute of an object and its spectral response. The most widely used multivariate regression methods in quantitative analysis, such as MLR, partial least-square regression (PLSR), ANN, support vector machine (SVM), and least-square support vector machine (LS-SVM), can be categorized into 2 groups, namely, linear regression and nonlinear regression (ElMasry and others 2009; Wang and others 2012a; Zhang and others 2013). On the other hand, qualitative models classify tested samples into certain groups based on their respective spectra without conducting chemical determination of these regions. A wide variety of algorithms is available for the purpose of classification, such as k nearest neighbor (KNN), k means cluster (K-means), linear discriminant analysis (LDA), partial least-square discriminant analysis (PLS-DA), ANN, and SVM, in which the first 2 algorithms are unsupervised classifications, while the last 4 approaches belong to supervised classifications. Recent applications of regression algorithms and classification algorithms for analyzing the hyperspectral images of foods are illustrated in Table 1 and Table 2, respectively.

Regression methods

Linear regression. MLR, based on least squares, is one of the simplest and most efficient multivariate regression methods for data analysis. However, MLR cannot deal with the problem of multicollinearity, as is the case with hyperspectral data in which the number of wavelengths exceeds the number of samples. PLSR, developed by Wold and others (1984), is a good solution to the multicollinearity problem, projecting the data onto a small number of components to reduce dimensionality and create an interpretable relationship between the spectral matrix and the response vector. PLSR projects both the predicted variables (data matrix Y) and the observable variables (data matrix X) into a new feature space to establish a linear regression model. Assume that X is an M × N matrix and Y is an M × K matrix.

Step 1: extract the first components from matrices X and Y, respectively:

t1 = w1T X (1)

u1 = v1T Y (2)

Step 2: maximize the degree of correlation between t1 and u1:

Cov(t1, u1) → max (3)

w1T w1 = ||w1||² = 1, v1T v1 = ||v1||² = 1 (4)

Step 3: build the regression model based on y and t1, and likewise for x and t1:

E0 = t̂1 α1T + E1 (5)

F0 = t̂1 β1T + F1 (6)

where E0 and F0 are the observed values for X and Y, respectively, t̂1 and û1 are the corresponding score vectors for t1 and u1, respectively, and E1 and F1 are the residual matrices.

Step 4: repeat the above steps with the residual matrices E1 and F1 in place of E0 and F0; after r iterations, this yields:

E0 = t̂1 α1T + · · · + t̂r αrT + Er (7)

F0 = t̂1 β1T + · · · + t̂r βrT + Fr (8)

Step 5: use cross-validation to minimize the error between the predicted and observed response values:

Minimize PRESSj(h) = Σ(i=1..n) (yij − ŷ(i)j(h))², j = 1, 2, . . . , p (9)
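Steps 1 to 5 above can be sketched with a compact NIPALS-style PLS1 implementation (single response variable, fixed number of components, no cross-validation step). This is a simplified illustration of the deflation scheme in Eq. (5) to (8) with invented toy data, not the exact code of any cited study:

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Basic NIPALS-style PLS1: iteratively extract score vectors t and
    deflate X and y (the E and F residual matrices of Eq. 5 to 8)."""
    E = X - X.mean(axis=0)            # E0: centered spectra
    f = y - y.mean()                  # F0: centered response
    W, P, Q = [], [], []
    for _ in range(n_components):
        w = E.T @ f                   # weight: direction of maximal covariance
        w /= np.linalg.norm(w)
        t = E @ w                     # score vector t
        p = E.T @ t / (t @ t)         # X loading (alpha in Eq. 5/7)
        q = f @ t / (t @ t)           # y loading (beta in Eq. 6/8)
        E = E - np.outer(t, p)        # deflate X residual
        f = f - q * t                 # deflate y residual
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q)
    B = W @ np.linalg.inv(P.T @ W) @ Q           # regression coefficients
    return B, X.mean(axis=0), y.mean()

def pls1_predict(X, B, x_mean, y_mean):
    return (X - x_mean) @ B + y_mean

# Toy data: 20 "spectra" with 10 bands; response depends on two bands plus noise
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 10))
y = 3.0 * X[:, 2] - 2.0 * X[:, 7] + 0.01 * rng.normal(size=20)
B, xm, ym = pls1_fit(X, y, n_components=4)
y_hat = pls1_predict(X, B, xm, ym)
```

The vector B plays the role of the PLS regression coefficients whose magnitudes are also inspected in the RC feature selection method discussed earlier.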
Currently, MLR and PLSR are 2 of the most widely used and most powerful methods for quantitative analysis in the food field. To overcome collinearity problems, and for the convenience of online application, MLR and PLSR are often performed on a small set of feature wavelengths rather than the full spectrum. A great deal of research has demonstrated that MLR using optimal wavelengths could correctly predict chemical qualities such as the total soluble solids and moisture content of banana (Rajkumar and others 2012), the adulteration of minced lamb (Kamruzzaman and others 2013), the physical parameters of salmon (Wu and others 2014), as well as the microbial contamination of pork (Tao and others 2014). Special attention has been paid to the usage of PLSR in recent studies (Barbin and others 2013a; Feng and others 2013). In particular, some major constituents in lamb have been predicted using a near-infrared (NIR) reflectance HSI system combined with linear regression methods (Kamruzzaman and others 2012a), in which accurate determination was obtained for lamb water and fat contents (both R2P = 0.88) with PLSR models, while a lower prediction accuracy (R2P = 0.63) was achieved for protein content. Similarly, ElMasry and others (2013) used PLSR methods to develop quantitative models to predict the major chemical components in beef. As a result, reasonable accuracies with determination coefficients (R2P) of 0.89, 0.84, and 0.86 were achieved for predicting water, fat, and protein contents, respectively. Recently, the good performance of established PLSR models demonstrated the excellent ability of the near-infrared HSI technique to determine the protein (R2P = 0.88), moisture (R2P = 0.91), and fat (R2P = 0.93) contents of pork (Barbin and others 2013a).
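The common workflow in these studies (select a handful of feature wavelengths, then refit a compact MLR model on them) can be sketched as follows. For simplicity, ordinary least-squares coefficients stand in here for the PLSR regression coefficients normally used in the RC method; all data and names are invented for illustration:

```python
import numpy as np

def select_by_coefficients(X, y, n_keep):
    """Rank wavelengths by absolute regression-coefficient magnitude and
    keep the n_keep largest (ordinary least squares stands in for the
    PLSR coefficients used in the cited studies)."""
    Xc = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    rc = np.abs(coef[1:])                      # drop the intercept term
    return np.sort(np.argsort(rc)[-n_keep:])   # indices of selected bands

def mlr_fit_predict(X_train, y_train, X_new):
    """Multiple linear regression on the reduced wavelength set."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_new)), X_new]) @ coef

# Toy data: 40 samples, 25 "wavelengths"; only bands 5 and 18 matter
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 25))
y = 2.0 * X[:, 5] - 1.5 * X[:, 18] + 0.05 * rng.normal(size=40)
bands = select_by_coefficients(X, y, n_keep=2)
y_hat = mlr_fit_predict(X[:, bands], y, X[:, bands])
```

On this toy problem the two informative bands are recovered, and the reduced MLR model fits nearly as well as the full-spectrum model, mirroring the reported finding that a few feature wavelengths can match full-wavelength performance.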
As for predicting food safety parameters, PLSR was used to fit the spectral values extracted from chicken fillets to reference Enterobacteriaceae loads, leading to a good prediction performance with an R2P of 0.87 and RMSEP of 0.45 log10 CFU/g (Feng and others 2013). In another study, Wu and Sun (2013b)
Table 1–Applications of regression algorithms based on selected feature wavelengths for HSI data analysis in foods. [Table summary: for each algorithm (PLSR, MLR, LS-SVM, and ANN), the table reports the application, the number of selected effective wavelengths (Nr of PEW), and the performances (R2C, R2CV, R2P, RPD, RMSEC, RMSECV, and RMSEP), with references to ElMasry and others (2007, 2009, 2011, 2013), Zhang and others (2013), Barbin and others (2012a, 2013a, 2013b), Kamruzzaman and others (2012a, 2012b, 2013), Talens and others (2013), Wu and others (2012a, 2012b, 2012c, 2014), Wu and Sun (2013a, 2013b), Feng and Sun (2013a, 2013b), Feng and others (2013), Pannagou and others (2014), He and others (2014), Liu and others (2013a, 2014), Cheng and others (2013, 2014), Rajkumar and others (2012), Tao and others (2014), Wang and others (2012a), Qiao and others (2007), Siripatrawan and others (2011), and Lu (2007).]
PLSR: partial least-square regression, MLR: multiple linear regression, ANN: artificial neural network, LS-SVM: least-square support vector machine, MC: moisture content, TSS: total soluble solids, WHC: water holding capacity, TVC: total viable count, PPC: psychrotrophic plate count.
Table 2–Applications of classification algorithms based on selected feature wavelengths for HSI data analysis in foods. [Table summary: for each algorithm (KNN, K-means, F-value, LDA, PLS-DA, SVM, LS-SVM, ANN, CDA, clustering, QDA, GDA, and linear logistic regression), the table reports the application, the number of selected wavelengths, whether a confusion matrix was reported, and the correct classification rate (CCR, %), with references to Nakariyakul and Casasent (2004), Sone and others (2012), Liu and others (2010), Lee and others (2014), ElMasry and others (2009, 2011), Kamruzzaman and others (2011, 2012a), Wang and others (2012a), Cluff and others (2013), Kaliramesh and others (2013), Barbin and others (2013c), Talens and others (2013), Khojastehnazhand and others (2014), Fletcher and Kong (2003), Du and others (2007), Zhu and others (2012), Mahesh and others (2008), Fu and others (2013), Naganathan and others (2008a, 2008b), Coelho and others (2013), Rodríguez-Pulido and others (2013), and Baranowski and others (2013); reported CCRs range from 75.4% to 100%.]
KNN: k nearest neighbor, K-means: K means cluster, LDA: linear discriminant analysis, PLS-DA: partial least square discriminant analysis, SVM: support vector machine, LS-SVM: least-square support vector machine, ANN: artificial neural network, CDA: canonical discriminant analysis, QDA: quadratic discriminant analysis, GDA: general discriminant analysis.
developed a time-series HSI system for identifying surface total viable bacterial counts in salmon flesh, and the result showed that the PLSR model gave the highest R2P of 0.958 and ratio of prediction to deviation (RPD) of 5.127. Additionally, quantitative PLSR models based on several feature wavelengths were also established (R2P of 0.86 and 0.89) for predicting total viable count (TVC) and psychrotrophic plate count (PPC) of chilled pork, respectively (Barbin and others 2013b), indicating that HSI methods combined with PLSR models were effective for predicting microbial counts in food samples.

Nonlinear regression. As a result of sample inhomogeneity and environmental or instrumental effects, the spectral variance and the target attributes are not linearly related in most cases. Two ways are available to solve this type of problem: the first (and preferred) way is to transform the nonlinear regression into a linear regression by variable substitution, after which linear regression methods (such as MLR and PLSR) are applied for prediction. The second way is to resort to nonlinear methods, in which observational data are fitted by way of successive approximations. In this case, the fitted function is a nonlinear combination of some important model parameters that are closely related to several independent variables (Seber and Wild 1989). As 2 of the most typical nonlinear methods, the basic principles of ANN and SVM, and their corresponding applications in hypercube analysis, are described below. The multilayer feed-forward neural network (FNN), the most frequently applied ANN for HSI analysis in food, is usually composed of 3 neuron layers: an input layer, 1 or several hidden layers, and an output layer. The spectral value at each wavelength is first imported into the input layer, and the output layer then yields the corresponding prediction values after some complicated transformations among the hidden layers.
The functions that connect different layers are based on nonlinear mapping. Besides, an FNN usually has 1 or more hidden layers, which gives it greater potential for dealing with nonlinear and complex correlation problems, despite the need for more training time (Bebis 1994). ANN has previously been used to develop prediction models for quality attributes, as well as safety parameters, by several authors (ElMasry and others 2009; Siripatrawan and others 2011). Lu (2007) combined visible HSI (400 to 1000 nm) with backpropagation FNN techniques to predict apple firmness and soluble solids content with R2 = 0.76 and 0.79, respectively, for "Golden Delicious" apples. Similarly, the firmness values of apples were accurately predicted (Rp = 0.91) using established ANN models based on several important wavelengths (ElMasry and others 2009). Qiao and others (2007) employed FNNs to build determination models for predicting pork quality attributes, in which the FNN models gave prediction correlation coefficients of 0.77, 0.55, and 0.86 for drip loss, pH, and color, respectively. In a recent study by Siripatrawan and others (2011), an ANN method with Bayesian regularization was successfully applied to determine the number of Escherichia coli bacteria in packaged fresh spinach.

One of the most important ideas in SVM is that the input x is first projected into a high-dimensional feature space by nonlinear mapping, and in that feature space a linear model is created. Furthermore, the capacity of SVM is controlled by the setting of parameters (metaparameters and kernel parameters) that are independent of the dimensionality of the feature space (Cortes and Vapnik 1995). The main steps of SVM are illustrated as follows:

Step 1: x is mapped into a high-dimensional feature space using a set of nonlinear transformations gi(x), i = 1, . . . , n, and a linear model f(x, υ) is established:

f(x, υ) = Σ(i=1..n) υi gi(x) + k (10)

where k is the "bias" term.
© 2014 Institute of Food Technologists®. Comprehensive Reviews in Food Science and Food Safety, Vol. 13, 2014.
Table 3–A comparison of regression algorithms.

Regression   Advantages                                                             Disadvantages
Linear       Simple; easy to fit models; easy to determine statistical properties   Too abstract; powerless for predicting complex problems
Nonlinear    More efficient; suitable for analyzing complex problems                High computational complexity
Step 2: The quality of estimation is measured by the ε-insensitive loss function:

L_ε(y, f(x, υ)) = |y − f(x, υ)| − ε, if |y − f(x, υ)| > ε; 0, otherwise    (11)

Step 3: SVM attempts to reduce the complexity of the model by minimizing the following function:

min (1/2)‖υ‖² + C Σ_{j=1}^{m} (ξ_j + ξ_j*)    (12)

s.t. y_j − f(x_j, υ) ≤ τ + ξ_j*, f(x_j, υ) − y_j ≤ τ + ξ_j, ξ_j, ξ_j* ≥ 0, j = 1, ..., m

where C and τ are metaparameters.

Step 4: The kernel function is given by

K(x, x_j) = Σ_{i=1}^{n} g_i(x) g_i(x_j)    (13)

Step 5: The discriminant function can be transformed as

f(x) = Σ_{j=1}^{n_sv} (α_j − α_j*) K(x_j, x), s.t. 0 ≤ α_j ≤ C, 0 ≤ α_j* ≤ C    (14)

where n_sv is the number of support vectors (SVs).

As an optimized version of the standard SVM, LS-SVM integrates the advantages of high predicting accuracy and simple structure. As a consequence, considerable interest has been focused on the usage of LS-SVM regression for the quantitative analysis of HSI data in recent years. As for the determination of quality parameters, Wu and others (2012c) investigated the potential of a time-series HSI system for detecting the moisture distribution in beef. In this study, a comparison of predicting performance between PLSR and LS-SVM was conducted, in which the performance of LS-SVM (R2P = 0.968) was much better than that of PLSR (R2P = 0.914) when 6 important wavelengths were used. A similar comparison was carried out in another research study (Wu and others 2012b), showing that the LS-SVM models had higher R2P than the PLSR models for predicting moisture content of dehydrated prawns using 12 wavelengths. In another research study, after the procedure of feature selection, Wu and Sun (2013a) showed that the built LS-SVM model was able to accurately predict the values of 4 water-holding capacity indices based on the information at selected wavelengths. Recently, Zhang and others (2013) utilized a HSI system combined with LS-SVM models to fit the related nutrient content of oilseed rape leaves with the corresponding spectral data. As a result, the RP values of the LS-SVM models for predicting N and P were 0.882 and 0.710, respectively. As for the assessment of safety attributes, LS-SVM models showed perfect performance for the prediction of TVC in salmon samples with R2P of 0.967 and RPD of 5.288 (Wu and Sun 2013b).

Advantages and disadvantages. In the field of regression, linear regression is the first type of method that has been rigorously researched and widely used in practical applications. Advantages of linear regression approaches include the easy fitting of linear models and the simple determination of statistical properties in the resulting estimations (Aalen 1989). Especially for PLSR, the greatest advantage is its excellent ability to solve colinearity problems. Besides, the easier identification of noise and the good interpretability of the RCs for each independent variable make it easy to extend its ability to predict several quality attributes simultaneously. However, it is powerless for predicting the distribution characteristics of unknown parameters. On the other hand, the obtained regression model between the dependent variables (quality attributes) and the independent variables (spectral or image features) is too abstract and very difficult to understand, which brings great inconvenience for its wide application (Geladi 1988).

Nonlinear regression analysis, another important aspect of traditional chemometric data mining, is an extension of linear regression analysis. In practical research on HSI in foods, the spectral (or image) features and quality attributes are nonlinearly correlated in most cases. Nonlinear regression approaches (such as ANN and SVM) are very effective and suitable for analyzing complex problems. Like the linear regression methods, all of the nonlinear regression approaches have their unique merits and drawbacks. For example, as a novel SVM method, LS-SVM has received special attention recently for its lower complexity, faster computing speed, and better performance than SVM. However, the computational complexity is still very high for cases of massive data regression, since its computational load is the square of the number of independent variables (Wu and others 2013). ANN has received widespread application for its low requirement of prior knowledge, as well as its good ability to handle nonnormally distributed and nonlinearly correlated data. However, underfitting, overfitting, the curse of dimensionality, and local minima are the biggest limitations for its usage with the hyperspectral data cube (Tu 1996). A brief comparison between linear regression and nonlinear regression methods is presented in Table 3.

Classification methods

Unsupervised classifications. KNN classification is a type of instance-based learning method for classification, in which a sample is classified according to a certain number (k) of closest training samples in a feature space. In other words, the classification is decided by the majority votes from the k nearest neighbors. Generally speaking, the first stage of KNN classification is to identify the nearest neighbors, and the second stage is to determine the class using those neighbors (Altman 1992). Several attempts have been made to use KNN for HSI data analysis in food safety and food quality identification. For example, in order to detect skin tumors rapidly and nondestructively, Nakariyakul and Casasent
(2004) applied a KNN classifier to analyze the spectral differences between normal and tumor-containing chicken skins. In a recent study (Sone and others 2012), HSI was used to inspect the effects of different packaging on the TVC and lipid oxidation in Atlantic salmon fillets. In this study (Sone and others 2012), a KNN classifier successfully classified the fillets into 3 groups according to the packaging type using 5 feature wavelengths.

Another widely used unsupervised classification method is K-means clustering, also based on distances between samples. It attempts to divide n samples into k clusters, in which a particular sample is categorized into the cluster containing the nearest mean (Hartigan and Wong 1979). The algorithm alternates between 2 steps: first, assigning each sample to the cluster whose mean yields the least within-cluster sum of squares (the squared Euclidean distance), that is, the intuitively "nearest" mean; second, recalculating the means as the centroids of the samples in the new clusters. Due to its simplicity and fast speed, K-means clustering is a common choice in image analysis and computer vision. However, only 1 study has reported the possibility of using K-means clustering based on HSI data for classification in foods. Liu and others (2010) applied K-means clustering for the categorization of pork quality with accuracies of 78% using 5 PCs and 83% using 10 PCs, respectively.

Supervised classifications. In supervised classifications, LDA is very effective for handling cases in which unequal within-class frequencies are found. This approach seeks the optimal class separability by minimizing the ratio of within-class variance to between-class variance in the particular dataset. The mathematical functions involved in LDA are summarized as follows (Balakrishnama and Ganapathiraju 1998): Assume X is the input vector.
(1) Compute the mean vector of dataset in each class.
a_i = (1/N_i) Σ_{X∈ω_i} X, i = 1, 2    (15)
where a_i is the average vector of dataset i, ω_i is the class label of dataset i, and N_i is the total number of samples in ω_i. (2) The criterion for class separability is formulated by
S_i = Σ_{X∈ω_i} (X − a_i)(X − a_i)^T, i = 1, 2    (16)

S_b = (a_1 − a_2)(a_1 − a_2)^T    (17)
where S_i and S_b are the within-class scatter matrix and the between-class scatter matrix, respectively. (3) Project the dataset onto the Fisher function space, and then compute the segmentation threshold y* in the projected space as below:
y* = (N_1 ã_1 + N_2 ã_2) / (N_1 + N_2)    (18)
where ã_1 and ã_2 are the mean values of dataset 1 and dataset 2 in the projected space, and N_1 and N_2 are the sample numbers of dataset 1 and dataset 2.
(4) For a given vector X*, project it onto the projected space as y, and perform classification by the following functions:

y > y* ⇒ X ∈ ω_1; y < y* ⇒ X ∈ ω_2    (19)

As a supervised classification method, the performance of LDA is examined on randomly generated test data. LDA has been widely used in food quality classification and safety evaluation. Liu and others (2010) employed a Gabor filter-based HSI for the determination of pork quality levels; using LDA, perfect classification accuracy of 100% was reached based on 5 hybrid PCs. Similarly, good performance with a correct accuracy of 100% for the LDA calibration model was obtained by ElMasry and others (2011a) for the rapid and nondestructive classification of cooked turkey hams with the adoption of a NIR HSI system. In another study, a NIR multispectral imaging system (based on only 6 wavelengths) combined with an LDA model correctly classified 100% of all lamb muscle types (Kamruzzaman and others 2011). Recently, Wang and others (2012b) proposed Fisher's discriminant analysis (a variety of LDA) for detecting sour skin-infected onions using shortwave infrared HSI, which discriminated 80% of normal and sour skin-infected onions.

PLS-DA, also referred to as discriminant PLS in the literature, is a common and typical chemometric method for supervised classification of spectroscopic data. It is a compromise between ordinary discriminant analysis and a discriminant analysis on the important PCs of the prediction variables (Pérez-Enciso and Tenenhaus 2003). The first step of PLS-DA is to create a PLS regression model based on spectral or image variables. The second step is to classify samples from the results of that PLS regression. This approach is especially suited to cases where the number of variables is much larger than the number of samples and where multicolinearity is present.
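The two-class Fisher procedure in steps (1) to (4) above can be sketched with NumPy. The explicit projection direction w = S_w^{-1}(a_1 − a_2), with S_w = S_1 + S_2 pooled from Eq. 16, is the standard Fisher choice filled in here as an assumption (the text leaves the projection vector implicit), and the function names are illustrative:

```python
import numpy as np

def fisher_lda_threshold(X1, X2):
    """Two-class Fisher LDA following Eq. 15 to 19: class means,
    scatter matrices, projection direction, and threshold y*."""
    a1, a2 = X1.mean(axis=0), X2.mean(axis=0)        # Eq. 15: class mean vectors
    S1 = (X1 - a1).T @ (X1 - a1)                      # Eq. 16: within-class scatter, class 1
    S2 = (X2 - a2).T @ (X2 - a2)                      # Eq. 16: within-class scatter, class 2
    Sw = S1 + S2                                      # pooled within-class scatter (assumption)
    w = np.linalg.solve(Sw, a1 - a2)                  # Fisher direction w = Sw^-1 (a1 - a2)
    n1, n2 = len(X1), len(X2)
    m1, m2 = (X1 @ w).mean(), (X2 @ w).mean()         # projected class means (a~1, a~2)
    y_star = (n1 * m1 + n2 * m2) / (n1 + n2)          # Eq. 18: segmentation threshold
    return w, y_star

def classify(x, w, y_star):
    """Eq. 19: assign class 1 above the threshold, class 2 below."""
    return 1 if x @ w > y_star else 2
```

With two well-separated point clouds, the threshold falls between the projected class means and new samples are assigned to the nearer class.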
Therefore, PLS-DA has also been explored for analyzing HSI data in foods by 2 recent studies. Kamruzzaman and others (2012) investigated the potential of HSI combined with PLS-DA for adulteration detection in red meat, and excellent classification accuracies of 98.67%, 97.33%, and 93.33% were obtained for beef, lamb, and pork, respectively. In other research, satisfactory PLS-DA models were established to classify samples into different quality categories by using the whole spectrum and the identified optimal wavelengths (Talens and others 2013; Shao and He 2009).

SVM can be applied not only to regression problems but also to classification problems. The fundamentals of SVM classification are basically the same as those of the regression approach: a hyperplane (linear or nonlinear) is first constructed in a high-dimensional feature space, which can then be used for classification. The difference is that, once the hyperplane has been established, a discrimination step is adopted to assign new samples to a category. In other words, the samples to be predicted are projected into the same feature space, and the category they belong to is determined according to which side of the hyperplane they fall on (Suykens and Vandewalle 1999; Shao and He 2009). SVM has been intensively implemented to detect skin tumors of poultry or the infection of vegetables. Fletcher and Kong (2003) first explored the possibility of using hyperspectral fluorescence imaging in combination with an SVM classifier to identify chicken tumors; using 2 important components, the system was able to distinguish over 96% of the diseased chickens. A similar system created by Du and others (2007) correctly identified about 90%
Table 4–A comparison of classification algorithms.

Classification   Advantages                                                          Disadvantages
Unsupervised     Easy to understand and implement; free from parameter estimations   Sensitive to noise; less effective
Supervised       Effective; suitable for analyzing complex problems                  Need a training set; heavy calculation load
of the unwholesome diseased chickens; the decreased correct classification rate (CCR) might have been caused by the larger number of test samples. Recently, Wang and others (2012a) input 3 parameters (max, contrast, and homogeneity) of hyperspectral images into SVM models, which successfully distinguished 87.14% of normal onions from sour skin-infected ones. On the other hand, LS-SVM, a reformulation of the standard SVM, combines the good generalization performance of SVM with the simpler structure and shorter optimization time of PLSR. For instance, LS-SVM classification models were developed to classify fresh and frozen-thawed fish fillets based on combined spectral and textural variables, and an average CCR of 97.22% was reached for the prediction sets (Zhu and others 2012).

In addition to these typical and frequently used classification methods, more effective algorithms have been developed for application in HSI analysis for classification. Among these newly developed methods, ANN shows good performance, as in the cases of differentiating wheat classes (Mahesh and others 2008) and detecting chilling injury in apples (ElMasry and others 2009). Recently, Fu and others (2013) applied back-propagation and learning vector quantization ANN models to distinguish marine fish from freshwater fish, leading to correct identification accuracies of 98% and 100%, respectively. In addition, according to Baranowski and others (2013), the linear logistic regression neural network model was considered the best classifier for detecting apple bruises. Furthermore, some variations of discriminant analysis for analyzing HSI datasets have been presented in several publications. For example, canonical discriminant analysis (CDA) was used by Naganathan and others (2008a) to classify beef tenderness into 3 categories (tender, intermediate, and tough) with an accuracy of 96.4%.
An average classification accuracy of 88% was obtained using quadratic discriminant analysis (QDA) for identifying uninfested mung bean kernels (Kaliramesh and others 2013). General discriminant analysis (GDA) was conducted on hyperspectral image data using 6 optimum wavelengths to distinguish among grape seed varieties, in which a high accuracy of 96% was achieved (Rodriguez-Pulido and others 2013). More recently, Coelho and others (2013) proposed a clustering model, based on 3 wavelengths, which was used to detect 85% of parasites in cooked clams.

Advantages and disadvantages. As shown in Table 4, compared to unsupervised classification methods, the supervised classification algorithms generally have better performance and are more adaptable to highly complex data types. However, supervised classification requires a training process to obtain more knowledge of the problem, which in turn extends the operation time. For the unsupervised classification approaches, advantages of KNN include ease of understanding and implementation, no parameter estimation, and, especially, suitability for multiclassification. On the other hand, as a traversal algorithm, a large volume of test samples increases the memory occupancy (Keller and others 1985). Additionally, the poor explanation of the obtained results limits its application in HSI. K-means clustering, a simple but useful classification method, is relatively scalable and efficient, especially for dealing with large volumes of data. It performs particularly well in cases where obvious differences are found in between-class distances. However, the number of clusters k should be given in advance, and different initial values may result in different clustering results. Moreover, a small amount of noise and isolated points can have a significant impact on the classification results (Selim and others 1991). When both spectral and image features are analyzed, a transformation is needed to unify them in the same range of metrics. As for LDA, it guarantees that the projected pattern samples have the smallest within-class distance and the largest between-class distance in the new space (Altman and others 1994). In other words, the samples can be well separated after projection into the new feature space. However, the calculation load increases rapidly as more dimensions are taken into account. Also, the discriminant function of LDA is too simple to handle nonlinear problems. In addition, PLS-DA not only possesses the simple and intuitive features of discriminant analysis, but also inherits the ability of partial least squares to remove irrelevant factors. However, it does not perform well when dealing with high-dimensional and nonlinear data (Pérez-Enciso and Tenenhaus 2003). Among all the classification methods, SVM is especially suited to small-sample-size problems and possesses good generalization performance for analyzing high-dimensional and nonlinear datasets. Meanwhile, it avoids the architecture selection and local minima problems of ANN. However, as the traditional SVM employs binary classification, the risk of prediction increases when more classes are involved (Suykens and Vandewalle 1999).
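To make the two-stage KNN scheme discussed in this section concrete (rank training samples by distance, then take a majority vote among the k closest), here is a minimal plain-Python sketch; the function name, data, and labels are invented for illustration:

```python
from collections import Counter
from math import dist

def knn_predict(train, labels, x_new, k=3):
    """Classify x_new by majority vote among its k nearest
    training samples, using Euclidean distance."""
    # stage 1: rank training samples by distance to the query sample
    order = sorted(range(len(train)), key=lambda i: dist(train[i], x_new))
    # stage 2: majority vote among the labels of the k closest neighbors
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]
```

As noted above, this traversal over all training samples is what makes KNN memory- and computation-hungry for large test sets.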
Evaluation Methodologies and Applications

Evaluation methods for regression models
To perform either quantitative or qualitative analyses using HSI, models must be trained or calibrated using multivariate analysis and then evaluated for prediction performance. Samples are generally divided into 3 datasets, namely, the calibration set, the validation set, and the prediction set. The calibration set is constructed by selecting representative calibration samples (which span the values of the attributes under investigation), and it is used to ascertain the model coefficients that relate spectral variation to the real properties of these training samples. The validation set aims to find a good and robust set of parameters that need to be given in advance for model establishment, for which cross-validation is a common choice. The prediction set is used to estimate the prediction ability of the established model on a new set of samples. Currently, the quality of regression models is primarily evaluated by 3 statistical criteria, namely, the determination coefficient (R2), root mean square error (RMSE), and RPD. The first 2 statistical parameters are defined as follows:

R² = [Σ(y_act − ȳ_act)(y_pred − ȳ_pred)]² / [Σ(y_act − ȳ_act)² Σ(y_pred − ȳ_pred)²]    (20)
RMSE = √( Σ(y_pred − y_act)² / n )    (21)
where n is the number of spectra (samples), y_act is the actual value, and y_pred is the predicted value. R2C, R2CV, and R2P are the determination coefficients for the calibration, cross-validation, and prediction sets, respectively, and RMSEC, RMSECV, and RMSEP are the RMSE for the calibration, cross-validation, and prediction sets, respectively. The R2 essentially represents the degree to which the independent variables (spectral values) explain the response variable (the attribute to be predicted), while RMSE values give the average uncertainty that can be expected when predicting future samples. It is always expected that the predicted values are the same as or close to the measured results. In other words, the obtained R2 should be as close to 1 as possible (indicating that the predicted line coincides with the measured line), while the RMSE should be as close to 0 as possible for all 3 sample sets. However, in actual research, the accuracy of a multivariate regression model is considered to be excellent when R2 is 0.90 or higher (Cuadrado and others 2005). As shown in Table 1, a good model should have similar and high R2C and R2CV, which are slightly larger than R2P. If a high R2C and a low R2P are obtained, then there might be a problem of overlearning in the calibration set. Therefore, rather than calculating the ordinary statistical criteria from the established calibration or validation models, it is more meaningful to evaluate the model's predicting ability on the prediction set (ElMasry and others 2013). The statistical criteria R2 and RMSE are widely employed to assess the performance of established regression models for predicting various attributes in foods. For quality attributes, ElMasry and others (2012) used HSI to predict the L*, b*, pH, and tenderness values and achieved R2CV of 0.88, 0.81, 0.73, and 0.83 and RMSECV of 1.21, 0.57, 0.06, and 40.75, respectively.
Additionally, Barbin and others (2013b) used NIR HSI to predict pork composition, and the established cross-validated PLS models had good predicting ability for pork protein (R2CV = 0.88), moisture (R2CV = 0.87), and fat (R2CV = 0.95). For safety parameters, Feng and others (2013) showed that 3 optimal wavelengths identified from weighted coefficients were competent for determining Enterobacteriaceae loads in chickens with R2 of 0.89, 0.86, and 0.87 and RMSEs of 0.33, 0.40, and 0.45 log10 CFU/g for the calibration, cross-validation, and prediction sets, respectively. Some other research studies (Feng and Sun 2013a; Wu and Sun 2013b) also referred to the correlation coefficient (R) for calibration, cross-validation, or prediction (RC, RCV, or RP, respectively). R is the square root of R2, ranging from −1 to 1. In essence, R measures the degree of linear dependence between 2 variables and is free of causality, while R2 describes the degree to which the explanatory variables interpret the corresponding dependent variables. Instead of using RMSE, several publications adopted the standard error (SE) coupled with R2 as the criteria to evaluate model performance. For instance, the PLSR model for predicting water holding capacity (WHC) (expressed as drip loss) in beef samples using 6 important wavelengths gave R2CV of 0.87 and SECV of 0.38% (ElMasry and others 2011). Kamruzzaman and others (2012a) adopted SE as the criterion to estimate the performance of the PLSR model based on selected feature wavelengths, in which R2P of 0.87 and 0.84 and SEP of 0.35% and 0.57% were achieved for determining lamb fat and water, respectively. In a recent study by Barbin and others (2013b), the best PLS regressions were obtained with R2P of 0.81 and 0.81 and SEP of 1.0 and 1.5 for log(TVC) and log(PPC) content in pork, respectively.
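The two criteria of Eq. 20 and 21 are straightforward to compute; a short NumPy sketch (function and variable names are illustrative):

```python
import numpy as np

def r_squared(y_act, y_pred):
    """Determination coefficient as in Eq. 20: the squared correlation
    between measured and predicted values."""
    num = np.sum((y_act - y_act.mean()) * (y_pred - y_pred.mean())) ** 2
    den = np.sum((y_act - y_act.mean()) ** 2) * np.sum((y_pred - y_pred.mean()) ** 2)
    return num / den

def rmse(y_act, y_pred):
    """Root mean square error as in Eq. 21."""
    return np.sqrt(np.mean((y_pred - y_act) ** 2))
```

Note that a constant offset between prediction and measurement leaves this R2 at 1 while inflating the RMSE, which is why the two criteria are reported together.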
RPD, another effective statistical criterion, has recently received increasing application for model evaluation. RPD is defined as

RPD = STD/SEP    (22)

SEP = √( Σ_{i=1}^{n} (ŷ_i − y_i − bias)² / (n − 1) )    (23)

bias = (1/n) Σ_{i=1}^{n} (ŷ_i − y_i)    (24)

where n is the sample number in the prediction set, ŷ_i and y_i are the predicted and measured reference values in the prediction set, ȳ_i is the average value in the prediction set, and STD is the standard deviation (SD) of the prediction set. By taking the ratio of the SD to the prediction error, RPD presents the relative predictive performance of the established model more directly and efficiently than when either R2 or RMSE is used separately. Generally, the higher the RPD value is, the better and more robust the model is, and vice versa. According to Barlocco and others (2006) and Guy and others (2011), an RPD value above 2 indicates that a good calibration performance has been obtained, while an RPD value greater than 3 is considered sufficient for particular analytical purposes. However, Nicolaï and others (2007) proposed a different view: an RPD value between 1.5 and 2 indicates that the established model can only separate response variables with large differences; a value between 2 and 2.5 implies that the model is capable of coarse quantitative predictions; and a value between 2.5 and 3 or larger means that a robust model with excellent accuracy has been obtained. Several studies applied RPD as an important evaluation index for assessing the performance of established models. For example, Wu and others (2012b) used RPD to evaluate the performance of the SPA-LS-SVM model and indicated that the model had a high R2P of 0.984, RMSEP of 1.502, and RPD of 7.917 for moisture content prediction in dehydrated prawns using HSI. In another study, Kamruzzaman and others (2012a) developed a PLSR model with RPD values of 3.24, 3.91, and 1.73 in the cross-validation set and 2.63, 3.20, and 1.71 in the validation set for detecting lamb water, fat, and protein contents, respectively, indicating that HSI could predict fat more precisely than water and protein.
Similar results were obtained by Barbin and others (2013a), who employed the PLSR model and obtained RPD of 1.2, 4.0, and 2.7 for predicting pork moisture, fat, and protein, respectively. Recently, Feng and Sun (2013a) reported that the model based on Kubelka–Munk spectra was excellent, with an indicatively high RPD value of 3.02 for detecting TVC in raw chicken fillets. Additionally, using a wavelength range of 400 to 1000 nm, the CARS-PLSR model was considered the best model for TVC assessment in salmon flesh, leading to an R2P of 0.985 and RPD of 5.127 (Wu and Sun 2013b).
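The RPD definition of Eq. 22 to 24 can be sketched in a few lines of NumPy; the function name is illustrative, and STD is taken here as the sample standard deviation (n − 1 denominator) of the measured reference values in the prediction set:

```python
import numpy as np

def rpd(y_meas, y_pred):
    """RPD per Eq. 22 to 24: SD of the reference values divided
    by the bias-corrected standard error of prediction (SEP)."""
    n = len(y_meas)
    bias = np.sum(y_pred - y_meas) / n                              # Eq. 24
    sep = np.sqrt(np.sum((y_pred - y_meas - bias) ** 2) / (n - 1))  # Eq. 23
    std = np.std(y_meas, ddof=1)                                    # SD of reference values
    return std / sep                                                # Eq. 22
```

Intuitively, RPD asks how much smaller the prediction scatter is than the natural spread of the attribute: predictions that track the reference values closely give a small SEP and hence a large RPD.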
Evaluation methods for classification models
The CCR is the most commonly applied and useful method for evaluating the performance of classifiers; it is calculated as the percentage of correctly classified samples among all samples, and has therefore been widely used in HSI studies. For example, Li and others (2011) used a two-band ratio and PCA combined with a simple threshold method and obtained a best CCR of 93.7% for detecting orange surface defects. As for freshness detection, Zhu and others (2012) applied CCR to evaluate
the performance of LS-SVM classifiers based on visible and near-infrared HSI. In this study (Zhu and others 2012), a satisfactory average CCR of 97.22% was achieved for the prediction set using combined spectral and textural variables. Recently, Barbin and others (2013a) employed LDA models to correctly classify 91% of fresh pork and 96% of spoiled pork. In addition, Kamruzzaman and others (2011) obtained a high classification accuracy of 100% in discriminating lamb muscles based on the selected optimum wavelengths. As for disease detection, Chao and others (2002) applied a multispectral imaging system for detecting chicken skin tumors; the established model was able to identify 91% and 86% of normal and tumor tissue, respectively. Similarly, Kim and others (2006) presented a hyperspectral fluorescence imaging system for detecting poultry skin tumors, achieving a classification rate of 98.2% based on a neural network using 4 feature images.

A confusion matrix can also be used to express the performance of classification models (Naganathan and others 2008b). A confusion matrix contains the exact number of samples of one category that have been correctly or wrongly classified, as well as the total number in each category, with the correctly classified samples shown on the diagonal. The confusion matrix not only gives the CCR of the whole sample set, but also gives detailed information on how a certain class is wrongly categorized, which is helpful for identifying the reasons for misclassification. For example, according to the confusion matrix reported by Xing and others (2005), 4 samples from 62 sound samples were misclassified into the bruised group, and 9 samples from 66 bruised samples were classified into the sound group.
A confusion matrix was also applied by Naganathan and others (2008a) for detecting beef tenderness, in which 107 of 111 samples were correctly classified; only 1 tender sample was wrongly classified into the intermediate group, while 3 intermediate samples were wrongly classified into the tender group. Finally, Barbin and others (2012b) showed their classification results in the form of a confusion matrix and indicated that only 3 RFN (reddish-pink, firm, and nonexudative) samples were wrongly classified into the PSE (pale/pinkish-gray, soft, and exudative) group; the 3 misclassified samples played an important role in the further identification of misclassification causes.
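The confusion-matrix bookkeeping described above, rows for actual classes, columns for predicted classes, CCR from the diagonal, can be sketched in plain Python; the sound/bruised counts below are invented for illustration and are not the data of the studies cited:

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows index the actual class, columns the predicted class;
    correctly classified samples accumulate on the diagonal."""
    idx = {c: i for i, c in enumerate(classes)}
    cm = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        cm[idx[t]][idx[p]] += 1
    return cm

def ccr(cm):
    """Correct classification rate: the diagonal's share of all samples."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total
```

Reading along a row shows where a given class leaks into other classes, which is exactly the diagnostic use of the confusion matrix described above.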
Conclusions and Future Trends
Although HSI facilitates better characterization of intrinsic and extrinsic properties of food samples by integrating traditional spectral and image techniques, it often requires careful and sophisticated data mining and processing in order to realize the prediction of quality attributes and safety parameters. In this review, a general introduction to hyperspectral data mining, including denoising, dimensionality reduction, model establishment, and evaluation, was presented together with the corresponding applications. Special focus was placed on the methods for model establishment for either qualitative or quantitative prediction. Moreover, the limitations of these approaches were also discussed.

With the increasing interest in online quality monitoring in the food industry, improvements in methods and techniques are needed to effectively address the issue of data mining in hyperspectral images. Due to the negative influence caused by undesired variation during data collection, as well as the interference from redundant and noisy wavelengths in food hyperspectral images, feature extraction and dimension reduction play an important role in data mining. In the past few years, much progress has been made in applying various algorithms to deal with hyperspectral images. However, the most frequently used methods for feature extraction and dimension reduction from hyperspectral images are still traditional and typical methods, whose disadvantages hinder finding the real relationship between food samples and the property that needs to be predicted. Some advanced and efficient approaches used in computer vision (such as kernel-based feature extraction algorithms and some popular dimensionality reduction algorithms) should be introduced to the food field to improve the signal-to-noise ratio of hyperspectral images. Besides, as machine learning combines the advantages of high precision and the capability of dealing with complex data, it has been successfully implemented for optical images and remote sensing images, which shows the great potential of using machine learning methods to mine food hyperspectral images. Taking artificial intelligence as an example, rules based on the relationship between spectral information and chemical predictors make it possible to improve the accuracy and robustness of the existing simple linear or nonlinear regression or classification approaches. Currently, only a few applications and research endeavors for analyzing food hyperspectral images are available in this regard, and it is a valuable direction for future work. In addition, more advanced feature selection methods with high efficiency and high accuracy should be proposed to extract the most important features from HSI. Moreover, combining multiple methods for model establishment and evaluation can be a new trend for improving the accuracy of model prediction and classification.
Acknowledgments
The authors gratefully acknowledge the financial support from the Guangdong Province Government (China) through the program of "Leading Talent of Guangdong Province (Da-Wen Sun)." This research was also supported by the Natl. Key Technologies R&D Program (2014BAD08B09). Special thanks to Dr. Dan Liu and Dr. Hongbin Pu from South China Univ. of Technology for their valuable suggestions.
Nomenclature
HSI: Hyperspectral imaging
MC: Moisture content
TSS: Total soluble solids
WHC: Water holding capacity
TVC: Total viable count
PPC: Psychrotrophic plate count
SNV: Standard normal variate
MSC: Multiplicative scatter correction
MNF: Minimum noise fraction
RCs: Regression coefficients
SR: Stepwise regression
ANNs: Artificial neural networks
SPA: Successive projections algorithm
GAs: Genetic algorithms
CARS: Competitive adaptive reweighted sampling
PLSR: Partial least-square regression
MLR: Multiple linear regression
KNN: k nearest neighbor
K-means: K-means clustering
LDA: Linear discriminant analysis
PLS-DA: Partial least-square discriminant analysis
SVM: Support vector machine
LS-SVM: Least-square support vector machine
CDA: Canonical discriminant analysis
QDA: Quadratic discriminant analysis
GDA: General discriminant analysis
R2: Determination coefficient
R: Correlation coefficient
RMSE: Root mean square error
RPD: Ratio of prediction to deviation
CCR: Correct classification rate

Comprehensive Reviews in Food Science and Food Safety, Vol. 13, 2014, p. 902
© 2014 Institute of Food Technologists®
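The evaluation statistics listed above (R2, RMSE, RPD, and CCR) follow directly from their standard definitions. The short sketch below computes them on small synthetic vectors; the numbers are illustrative only, not results from any study cited here.

```python
# Standard definitions of the evaluation metrics from the nomenclature:
# R2 (determination coefficient), RMSE, RPD, and CCR. Values are synthetic.
import numpy as np

y_true = np.array([2.0, 3.1, 4.2, 5.0, 6.3, 7.1])   # measured reference values
y_pred = np.array([2.1, 3.0, 4.0, 5.2, 6.1, 7.3])   # model predictions

residuals = y_true - y_pred
rmse = np.sqrt(np.mean(residuals ** 2))
r2 = 1 - np.sum(residuals ** 2) / np.sum((y_true - y_true.mean()) ** 2)
rpd = y_true.std(ddof=1) / rmse   # RPD > 2 is often taken as a usable model

labels_true = np.array([0, 0, 1, 1, 1, 0])            # reference classes
labels_pred = np.array([0, 1, 1, 1, 1, 0])            # predicted classes
ccr = np.mean(labels_true == labels_pred)             # correct classification rate

print(f"RMSE={rmse:.3f}, R2={r2:.3f}, RPD={rpd:.2f}, CCR={ccr:.2f}")
```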