Application of Rough Set Theory for Evaluating Polysaccharides Extraction

Shuang Liu¹, Lijun Sun², Yurong Guo², Jialin Gao², and Lei Liu³

¹ Xinhua School of Finance and Insurance, Zhongnan University of Economics and Law, Wuhan, P.R. China
² College of Food Engineering and Nutritional Science, Shaanxi Normal University, Xi’an, P.R. China
³ College of Veterinary Medicine, Gansu Agricultural University, Lanzhou, P.R. China
{liushuanglaura,guoyurong730,sxsd2011,liuleigs}@163.com, [email protected]

Abstract. This paper reports an application of rough set theory for evaluating polysaccharides extraction from apple pomace. The importance of the factors affecting polysaccharides yield is analyzed by means of attribute reduction, and the significances of four factors are obtained. It is found that extraction time and ratio of solution to sample are the dominant factors. The results show that rough set theory can effectively analyze and evaluate the factors influencing polysaccharides yield from apple pomace, and that it can be used in the analysis of the extraction of functional components in foods.

Keywords: Rough set theory; polysaccharides; extraction yield; influencing factors.
1
Introduction
Polysaccharides are active ingredients in plants with many health benefits, such as moderating cholesterol levels, lowering blood glucose levels, and antitumor effects [3]. One million tons of apple pomace, a rich source of polysaccharides, are produced every year in China. Extracting these polysaccharides and using them effectively can add value to apple processing. Many methods for extracting polysaccharides from apple pomace have been studied. Methods for analyzing the factors that influence extraction yield include orthogonal test, two orthogonal universal revolving combination design, and response surface methodology. However, each of these methods has its drawbacks.

In recent years, rough set theory has been applied to the analysis of statistical and experimental data in many fields, such as disease diagnosis [14], economics [2], and other areas [12]. However, it has not yet been applied to the analysis of active components in food. This paper reports a rough set analysis of four factors affecting polysaccharides extraction yield from apple pomace. It is the first time that rough set theory is used in this area. Initial results suggest that rough set theory is effective for this purpose.
Corresponding author.
J.T. Yao et al. (Eds.): RSKT 2011, LNCS 6954, pp. 354–359, 2011. © Springer-Verlag Berlin Heidelberg 2011
2
Attribute Significance and Attribute Reduction
This section reviews the main ideas of attribute reduction in rough set theory [4,10,18,19]. In order to analyze and process data, rough set theory uses an information table or a decision table to represent knowledge [10]. Formally, a decision table is given as follows:

$$S = \langle U, C, D, V, f \rangle, \tag{1}$$

where $U$ is a finite nonempty set of objects, $C$ is a finite nonempty set of condition attributes, $D$ is a finite nonempty set of decision attributes, $V$ is a nonempty set of values for attributes in $C \cup D$, and $f : U \times (C \cup D) \to V$ is a complete information function that assigns to each object of $U$ exactly one value in $V$ for every attribute.

In a decision table, different attributes have different significances. In order to identify the significance of an attribute or a set of attributes, one may observe the change in classification effectiveness when it is removed. If a substantial change is observed, then one may conclude that the attributes are significant. Suppose $C$ and $D$ are the condition attributes and decision attributes, respectively. For a subset of attributes $C' \subseteq C$, let $\gamma_{C'}(D)$ denote the effectiveness of $C'$ for classifying decisions defined by $D$. Examples of such measures include Pawlak's measure of attribute dependency [10] and information-theoretic measures [15,16]. The significance of an attribute subset $C' \subseteq C$ about $D$ is defined as

$$\sigma(C') = \gamma_C(D) - \gamma_{C - C'}(D). \tag{2}$$

When $C' = \{a\}$, the significance of attribute $a$ ($a \in C$) about $D$ is given by:

$$\sigma(a) = \gamma_C(D) - \gamma_{C - \{a\}}(D). \tag{3}$$
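As one concrete instance of the measure $\gamma$, Pawlak's dependency degree [10] can be written out as below. The symbol $B$ is introduced here only for illustration; $\mathrm{POS}_B(D)$ stands for the positive region of $D$ with respect to $B \subseteq C$, that is, the set of objects whose equivalence class under $B$ lies entirely inside one decision class. This is a sketch of one possible choice of $\gamma$, not the only admissible one.

```latex
\gamma_B(D) = \frac{|\mathrm{POS}_B(D)|}{|U|}, \qquad 0 \le \gamma_B(D) \le 1,
\qquad\text{so that}\qquad
\sigma(a) = \frac{|\mathrm{POS}_C(D)| - |\mathrm{POS}_{C-\{a\}}(D)|}{|U|}.
```

Under this choice, $\sigma(a) = 0$ means that removing $a$ does not shrink the positive region, while a large $\sigma(a)$ means that many objects can no longer be classified with certainty once $a$ is removed.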
According to the significance of individual attributes, one can rank attributes during an attribute reduction process. An attribute reduct is a minimal subset of attributes that has the same power or functionality as the entire set of attributes [10]. There are two basic search strategies for attribute reduction, the addition strategy and the deletion strategy [18]. The addition strategy starts with the empty set and adds one attribute at a time until a reduct, or a superset of a reduct, is obtained. The deletion strategy starts with the full set and deletes one attribute at a time until a reduct is obtained [18]. By the properties of reducts, the deletion strategy always results in a reduct [21]. On the other hand, algorithms based on a straightforward application of the addition strategy only produce a superset of a reduct [6,9,11]. To solve this problem, many authors have considered a combined method that reapplies the deletion strategy to the superset of a reduct produced by the addition strategy [1,13,20].
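The deletion strategy can be sketched in a few lines of Python. The sketch assumes that "same power" means preserving the positive region of the decision attribute; the function names and the toy table are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def positive_region(rows, attrs, decision):
    """Objects whose equivalence class under `attrs` has a single decision value."""
    blocks = defaultdict(list)
    for name, row in rows.items():
        blocks[tuple(row[a] for a in attrs)].append(name)
    pos = set()
    for members in blocks.values():
        if len({rows[m][decision] for m in members}) == 1:
            pos.update(members)
    return pos

def deletion_reduct(rows, attrs, decision):
    """Start from all attributes; drop each one whose removal keeps the positive region."""
    target = positive_region(rows, attrs, decision)
    kept = list(attrs)
    for a in attrs:
        trial = [b for b in kept if b != a]
        if positive_region(rows, trial, decision) == target:
            kept = trial
    return kept

# Hypothetical toy table (not from the paper): the decision d depends on b alone.
toy = {
    "x1": {"a": 0, "b": 0, "c": 0, "d": "no"},
    "x2": {"a": 0, "b": 1, "c": 0, "d": "yes"},
    "x3": {"a": 1, "b": 0, "c": 1, "d": "no"},
    "x4": {"a": 1, "b": 1, "c": 1, "d": "yes"},
}
reduct = deletion_reduct(toy, ["a", "b", "c"], "d")
print(reduct)
```

The order in which attributes are tried determines which reduct is returned when a table has several, so in practice the attributes are often ordered by significance before deletion.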
3
Analysis of Factors of the Extraction Yield of Polysaccharides
We apply rough set methodology for analyzing factors related to the extraction yields of polysaccharides and compare it with other methods.
3.1
Results of Orthogonal Tests
Several studies report the extraction yield of polysaccharides from apple pomace [7]. Table 1 summarizes the orthogonal experimental data for four factors at three levels each in the extraction of apple pomace polysaccharides. The extraction yield varies considerably between processes (from about 6% to about 9%). The energy consumption and costs also differ between processes. Therefore, it is important to identify the key factors affecting the extraction yield in order to optimize the extraction process.

Table 1. Result of Orthogonal Experiment of Polysaccharides Extraction from Apple Pomace

Test #   Ultrasonic Power (W)   Temperature (°C)   Time (min)   Ratio of Solution to Sample (ml/g)   Yields (%)
1        160                    60                 60           20                                   6.01
2        160                    70                 90           40                                   9.19
3        160                    80                 120          60                                   8.07
4        180                    60                 90           60                                   7.12
5        180                    70                 120          20                                   8.81
6        180                    80                 60           40                                   7.14
7        200                    60                 120          40                                   9.26
8        200                    70                 60           20                                   6.75
9        200                    80                 90           60                                   8.22

3.2
Construction of a Decision Table
For rough set analysis, the first step is to construct a decision table to represent the data. The rows of the table stand for samples, i.e., different processes, and the columns stand for attributes. The attributes are divided into condition attributes (factors that affect the extraction yield) and decision attributes (the extraction yield of polysaccharides). For our analysis, we have $U = \{e_1, e_2, \ldots, e_8\}$, $C = \{$ultrasonic power, extraction temperature, extraction time, ratio of solution to sample$\}$, and $D = \{$extraction yield of polysaccharides$\}$, where $C_1$ = ultrasonic power, $C_2$ = extraction temperature, $C_3$ = extraction time, and $C_4$ = ratio of solution to sample.

The second step is to discretize the data. There are many ways of discretization. For example, discretization intervals can be provided by experts in the field, or computed automatically by a system. One can also use a simple method based on equal width intervals [8]. Here, we compare two consecutive values of each attribute in Table 1: if the latter value increases, it is encoded as 2; if it remains unchanged, it is encoded as 1; if it decreases, it is encoded as 0. Each object $e_i$ thus records the change from test $i$ to test $i+1$, which is why the nine tests yield eight objects. The results are given in Table 2.
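The encoding rule can be checked with a short Python sketch. It assumes that the rule is applied to each pair of consecutive tests in Table 1, column by column, so that the i-th encoded row describes the change from test i to test i+1; the names `encode_change` and `encode_table` are illustrative only.

```python
# Table 1 rows: (ultrasonic power W, temperature °C, time min, ratio ml/g, yield %)
TABLE_1 = [
    (160, 60, 60, 20, 6.01),
    (160, 70, 90, 40, 9.19),
    (160, 80, 120, 60, 8.07),
    (180, 60, 90, 60, 7.12),
    (180, 70, 120, 20, 8.81),
    (180, 80, 60, 40, 7.14),
    (200, 60, 120, 40, 9.26),
    (200, 70, 60, 20, 6.75),
    (200, 80, 90, 60, 8.22),
]

def encode_change(previous, current):
    """Return 2 for an increase, 1 for no change, 0 for a decrease."""
    if current > previous:
        return 2
    if current < previous:
        return 0
    return 1

def encode_table(rows):
    """Encode every pair of consecutive rows, column by column."""
    return [
        tuple(encode_change(p, c) for p, c in zip(prev, curr))
        for prev, curr in zip(rows, rows[1:])
    ]

# Columns of each encoded row: C1, C2, C3, C4, D.
decision_table = encode_table(TABLE_1)
for i, row in enumerate(decision_table, start=1):
    print(f"e{i}", *row)
```

Under this reading, the eight encoded rows coincide with the entries of Table 2.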
Table 2. Decision Table of Polysaccharides Extraction from Apple Pomace

Test Number   C1   C2   C3   C4   D
e1            1    2    2    2    2
e2            1    2    2    2    0
e3            2    0    0    1    0
e4            1    2    2    0    2
e5            1    2    0    2    0
e6            2    0    2    1    2
e7            1    2    0    0    0
e8            1    2    2    2    2

3.3
Rough Set Analysis
We perform a rough set analysis from two perspectives, namely, constructing an attribute reduct and computing attribute significance. For constructing an attribute reduct, we adopt the deletion strategy discussed by Yao et al. [18]. First, we compute the positive region of the partition defined by the decision attributes with respect to various subsets of condition attributes. Let $U/B$ denote the partition induced by a subset of attributes $B \subseteq C$ and let $\mathrm{POS}_B(D)$ denote the positive region of $D$ defined by the partition induced by attributes $B$. The positive region can be computed as follows:

$$\mathrm{POS}_C(D) = \bigcup_{X \in U/D} \underline{C}X, \tag{4}$$

$$\underline{C}X = \bigcup \{Y \in U/C : Y \subseteq X\}. \tag{5}$$

For a detailed discussion, see Pawlak’s book [10]. For Table 2, we have the following relationships:

$$
\begin{aligned}
\mathrm{POS}_{C-\{C_1\}}(D) &= \mathrm{POS}_C(D),\\
\mathrm{POS}_{C-\{C_2\}}(D) &= \mathrm{POS}_C(D),\\
\mathrm{POS}_{C-\{C_1,C_2\}}(D) &= \mathrm{POS}_C(D),\\
\mathrm{POS}_{C-\{C_1,C_2,C_3\}}(D) &\neq \mathrm{POS}_C(D),\\
\mathrm{POS}_{C-\{C_1,C_2,C_4\}}(D) &\neq \mathrm{POS}_C(D).
\end{aligned} \tag{6}
$$
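The positive regions involved in these relationships can be recomputed directly from Table 2. The following Python sketch assumes the standard Pawlak definition (the union of the equivalence classes of a condition subset that are contained in some decision class); the helper names are illustrative and not part of the original analysis.

```python
from collections import defaultdict

# Table 2: object -> (C1, C2, C3, C4, D)
TABLE_2 = {
    "e1": (1, 2, 2, 2, 2),
    "e2": (1, 2, 2, 2, 0),
    "e3": (2, 0, 0, 1, 0),
    "e4": (1, 2, 2, 0, 2),
    "e5": (1, 2, 0, 2, 0),
    "e6": (2, 0, 2, 1, 2),
    "e7": (1, 2, 0, 0, 0),
    "e8": (1, 2, 2, 2, 2),
}
COLUMN = {"C1": 0, "C2": 1, "C3": 2, "C4": 3, "D": 4}

def partition(attrs):
    """Group objects that agree on every attribute in `attrs` (the partition U/B)."""
    blocks = defaultdict(set)
    for obj, values in TABLE_2.items():
        blocks[tuple(values[COLUMN[a]] for a in attrs)].add(obj)
    return list(blocks.values())

def positive_region(attrs):
    """Union of the blocks of U/B that are contained in some block of U/D."""
    decision_blocks = partition(["D"])
    pos = set()
    for block in partition(attrs):
        if any(block <= x for x in decision_blocks):
            pos |= block
    return pos

full = positive_region(["C1", "C2", "C3", "C4"])
print(sorted(full))
for kept in (["C2", "C3", "C4"], ["C1", "C3", "C4"], ["C3", "C4"], ["C4"], ["C3"]):
    print(kept, positive_region(kept) == full)
```

Under this reading, objects e1, e2, and e8 share the same condition values but differ in D, so they fall outside the positive region of the full attribute set. Dropping C1, C2, or both leaves the positive region unchanged, whereas keeping only C4 or only C3 does not.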
Therefore, it can be concluded that an attribute reduct of $C$ about $D$ (i.e., a relative reduct) is $\{C_3, C_4\}$, and the core of $C$ about $D$ (i.e., the relative core) is also $\{C_3, C_4\}$. Attributes $C_1$ and $C_2$ are redundant condition attributes. In other words, the extraction time and the ratio of solution to sample are the most important factors affecting the extraction of apple pomace polysaccharides.

A probabilistic dependency measure can be used to optimize and evaluate attribute-based representations through the computation of probabilistic measures of attribute reduct, core, and significance factors [22]. Following the method provided by Ziarko [22], we obtain the following significances of condition attributes $C_1$, $C_2$, $C_3$, and $C_4$ about $D$:
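$$
\begin{aligned}
\sigma(C_1) &= \gamma_C(D) - \gamma_{C-\{C_1\}}(D) = 5/8 - 5/8 = 0,\\
\sigma(C_2) &= \gamma_C(D) - \gamma_{C-\{C_2\}}(D) = 5/8 - 5/8 = 0,\\
\sigma(C_3) &= \gamma_C(D) - \gamma_{C-\{C_3\}}(D) = 5/8 - 0 = 5/8,\\
\sigma(C_4) &= \gamma_C(D) - \gamma_{C-\{C_4\}}(D) = 5/8 - 4/8 = 1/8.
\end{aligned} \tag{7}
$$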
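These values can be reproduced with exact fractions. The sketch below assumes that γ is Pawlak's dependency degree, i.e., the fraction of objects in the positive region; the source attributes its measure to Ziarko [22], so this is an interpretation that happens to yield the same numbers, and the function names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

# Table 2: object -> (C1, C2, C3, C4, D)
TABLE_2 = {
    "e1": (1, 2, 2, 2, 2),
    "e2": (1, 2, 2, 2, 0),
    "e3": (2, 0, 0, 1, 0),
    "e4": (1, 2, 2, 0, 2),
    "e5": (1, 2, 0, 2, 0),
    "e6": (2, 0, 2, 1, 2),
    "e7": (1, 2, 0, 0, 0),
    "e8": (1, 2, 2, 2, 2),
}
CONDITIONS = ["C1", "C2", "C3", "C4"]

def gamma(attrs):
    """Assumed Pawlak dependency: share of objects classified without conflict."""
    index = [CONDITIONS.index(a) for a in attrs]
    decisions = defaultdict(set)   # condition values -> decision values seen
    members = defaultdict(list)    # condition values -> objects
    for obj, values in TABLE_2.items():
        key = tuple(values[i] for i in index)
        decisions[key].add(values[4])
        members[key].append(obj)
    consistent = sum(len(members[k]) for k in members if len(decisions[k]) == 1)
    return Fraction(consistent, len(TABLE_2))

def significance(attr):
    """sigma(a) = gamma_C(D) - gamma_{C - {a}}(D)."""
    rest = [a for a in CONDITIONS if a != attr]
    return gamma(CONDITIONS) - gamma(rest)

sigma = {a: significance(a) for a in CONDITIONS}
for a, s in sigma.items():
    print(a, s)
```

Using `Fraction` keeps the results in the same form as the equation above, so they can be compared term by term.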
From the significances of the condition attributes (influencing factors), we obtain the following order of the factors affecting polysaccharides extraction yield: extraction time > ratio of solution to sample > extraction temperature = ultrasonic power. The results are consistent with the orthogonal experimental results. In summary, rough set analysis enables us to determine the main factors affecting the extraction yield of polysaccharides. Since the experimental design for the extraction of polysaccharides from apple pomace should take cost and time into account, it is critical to focus on optimizing the two factors of extraction time and ratio of solution to sample.
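For comparison, one common way to read an orthogonal table is range analysis: average the yield at each level of a factor and take the difference between the largest and smallest level means. The paper does not state which orthogonal analysis was used, so the following Python sketch is only an assumption about how such a comparison might be made from the data in Table 1.

```python
# Table 1 rows: (ultrasonic power W, temperature °C, time min, ratio ml/g, yield %)
TABLE_1 = [
    (160, 60, 60, 20, 6.01),
    (160, 70, 90, 40, 9.19),
    (160, 80, 120, 60, 8.07),
    (180, 60, 90, 60, 7.12),
    (180, 70, 120, 20, 8.81),
    (180, 80, 60, 40, 7.14),
    (200, 60, 120, 40, 9.26),
    (200, 70, 60, 20, 6.75),
    (200, 80, 90, 60, 8.22),
]
FACTORS = ["ultrasonic power", "temperature", "time", "ratio of solution to sample"]

def level_means(column):
    """Mean yield for each level of the factor stored in `column`."""
    yields = {}
    for row in TABLE_1:
        yields.setdefault(row[column], []).append(row[4])
    return {level: sum(ys) / len(ys) for level, ys in yields.items()}

def factor_range(column):
    """Range R: largest level mean minus smallest level mean."""
    means = level_means(column).values()
    return max(means) - min(means)

ranges = {name: factor_range(i) for i, name in enumerate(FACTORS)}
ranking = sorted(ranges, key=ranges.get, reverse=True)
for name in ranking:
    print(f"{name}: R = {ranges[name]:.2f}")
```

Under this assumed analysis, extraction time and ratio of solution to sample again come out as the two factors with the largest ranges, in line with the rough set ranking.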
4
Conclusions
Numerous factors influence the extraction yield of apple pomace polysaccharides. In this paper, we report a rough set analysis that treats the extraction yield of polysaccharides as the decision attribute and the influencing factors as condition attributes. It is found that extraction time and ratio of solution to sample form an attribute reduct. We may conclude that they are the main influencing factors. This finding conforms to the results obtained via orthogonal experiment analysis. The initial results suggest that rough set theory can be used to evaluate and predict the apple pomace polysaccharides extraction process. As future work, we plan to investigate applications of rough set theory in decision-making and decision support [5,17] in the domain of food science.

Acknowledgment. This project was funded by China Agriculture Research System (CARS-28).
References

1. Bazan, J.G., Nguyen, H.S., Nguyen, S.H.: Rough set algorithms in classification problem. In: Polkowski, L., Tsumoto, S., Lin, T.Y. (eds.) Rough Set Methods and Applications, pp. 49–88 (2000)
2. Cao, Y., Chen, X.H., Wu, D.D.: Early warning of enterprise decline in a life cycle using neural networks and rough set theory. Expert Systems with Applications 38, 6424–6429 (2011)
3. Crystal, L.J., Tina, M.D., Lisa, K.T.: Pectin induces apoptosis in human prostate cancer cell: correlation of apoptotic function with pectin structure. Glycobiology 17, 805–819 (2007)
4. Fu, Q.: Methods of Data Processing and Agricultural Applications. Peking (2006)
5. Herbert, J.P., Yao, J.T.: Criteria for choosing a rough set model. Computers and Mathematics with Applications 57, 908–918 (2009)
6. Jensen, R., Shen, Q.: A rough set-aided system for sorting WWW bookmarks. In: Zhong, N., Yao, Y., Ohsuga, S., Liu, J. (eds.) WI 2001. LNCS (LNAI), vol. 2198, pp. 95–105. Springer, Heidelberg (2001)
7. Li, J.Y., Guo, Y.R.: Optimization of ultrasonic wave-assisted extraction process of maluspumila polysaccharides from apple cold-break peel pomace. Academic Periodical of Farm Products Processing 9, 30–32 (2010)
8. Liu, H., Rudy, S.: Feature selection via discretization. IEEE Transactions on Knowledge and Data Engineering 9, 642–645 (1997)
9. Miao, D., Wang, J.: An information representation of the concepts and operations in rough set theory. Journal of Software 10, 113–116 (1999)
10. Pawlak, Z.: Rough Sets, Theoretical Aspects of Reasoning about Data. Kluwer Academic Publishers, Dordrecht (1991)
11. Shen, Q., Chouchoulas, A.: A modular approach to generating fuzzy rules with reduced attributes for the monitoring of complex systems. Engineering Applications of Artificial Intelligence 13, 263–278 (2000)
12. Wang, F., Hasbani, J.G., Wang, X.: Identifying dominant factors for the calibration of a land-use cellular automata model using Rough Set Theory. Computers, Environment and Urban Systems 35, 116–125 (2011)
13. Wang, J., Wang, J.: Reduction algorithms based on discernibility matrix: the ordered attributes method. Journal of Computer Science and Technology 16, 489–504 (2001)
14. Wang, P.C., Su, C.T., Chen, K.H.: The application of rough set and Mahalanobis distance to enhance the quality of OSA diagnosis. Expert Systems with Applications 38, 7828–7836 (2011)
15. Yao, Y.Y.: Probabilistic approaches to rough sets. Expert Systems 20, 287–297 (2003)
16. Yao, Y.Y.: Information-theoretic measures for knowledge discovery and data mining. In: Karmeshu (ed.) Entropy Measures, Maximum Entropy and Emerging Applications, pp. 115–136. Springer, Berlin (2003)
17. Yao, Y.Y.: The superiority of three-way decisions in probabilistic rough set models. Information Sciences 181, 1080–1096 (2011)
18. Yao, Y.Y., Zhao, Y., Wang, J.: On Reduct Construction Algorithms. In: Wang, G.Y., Peters, J.F., Skowron, A., Yao, Y. (eds.) RSKT 2006. LNCS (LNAI), vol. 4062, pp. 297–304. Springer, Heidelberg (2006)
19. Zhang, W.X., Wu, Z.W., Liang, J.Y.: Theory and method of rough set. Peking (2001)
20. Zhao, K., Wang, J.: A reduction algorithm meeting users’ requirements. Journal of Computer Science and Technology 17, 578–593 (2002)
21. Ziarko, W.: Rough set approaches for discovering rules and attribute dependencies. In: Kloesgen, W., Zytkow, J.M. (eds.) Handbook of Data Mining and Knowledge Discovery, Oxford, pp. 328–339 (2002)
22. Ziarko, W.: Probabilistic approach to rough set. International Journal of Approximate Reasoning 49, 272–284 (2008)