International Journal of Applied Engineering Research ISSN 0973-4562 Volume 11, Number 4 (2016) pp2717-2722 © Research India Publications. http://www.ripublication.com
Data Mining Framework for Test Time Optimization in Industrial Electronics Manufacturing Enterprise

Thai Chuan Chee
School of Computer Sciences, Universiti Sains Malaysia (USM), 11800, Penang, Malaysia.

Ahmad Suhaimi Baharudin
School of Computer Sciences, Universiti Sains Malaysia (USM), 11800, Penang, Malaysia.

Kamal Karkonasasi*
School of Computer Sciences, Universiti Sains Malaysia (USM), 11800, Penang, Malaysia.
Abstract
Intelli-Stat introduces a product that discovers previously unknown patterns in electronics test data to help electronics manufacturing enterprises produce products faster. The product is software that gives engineers the intelligence they need to optimize test time in manufacturing. The software is installed on the test machine itself so that it can analyze patterns and provide on-the-spot recommendations to engineers. A study was conducted to review the state of the art of data mining applications in manufacturing. A decision-tree-based data mining technique for electronics products was selected to form the Intelli-Stat data mining framework, which meets the requirements of Intelli-Stat. An experiment was conducted to validate the concept of the framework, and a visual prototype was built to demonstrate the capability of the software.

Keywords: Data Mining Framework, Test Time Optimization, Industrial Electronics Manufacturing Enterprise

Introduction
Intelli-Stat is a software company that is being established to provide a higher level of efficiency in the manufacturing industry through automation. Intelli-Stat is introducing a new product called Intelli-Stat Analytics, software that extends the successful trends of analytics in banking, finance, insurance and telecommunication to give users in manufacturing insight into their production quality. The software relies on a data mining module in the background to perform its analytics function. The underlying module is called the Intelli-Stat data mining framework. The framework allows users to uncover common patterns that cause their products to fail in the manufacturing process. Tests are run on every unit of product to verify its performance before the product is shipped to customers. However, the tests may fail for many reasons, which are often hidden or difficult to surface without tedious statistical analysis. The unique selling point of Intelli-Stat is its framework, which uses an automated data mining process to automate the analysis, eliminating days of engineering effort spent performing the analysis manually. The literature review, research problem and research questions are provided in Section 2, and the research methodology is presented in Section 3. The results and discussion are presented in Section 4. Finally, the conclusion is given in Section 5.

Literature Review
The pain of having too much data in the manufacturing sector can be illustrated from the viewpoint of manufacturing workers. Manufacturing workers are producing test and measurement data in greater volume, in a variety of forms, and often at high velocity [1]. Whenever there is an issue with a manufactured product, nothing could be more frustrating for the workers than to know that the key to unlock a particular problem is available, but that it is lost somewhere in a mountain of data [2]. Manufacturers have created mountains of valuable data, but all too often only a small percentage is actionable [3]. This challenge is compounded as manufacturers struggle to keep up with the onslaught of information being pushed into their organizations [3]. This voluminous data is collected by the touch of a button on a voltmeter, the click of an optical camera, the beep of a bar code scanner, or any of the multitude of other test, measurement and inspection routines that are daily occurrences in modern factories.

When it comes to controlling the quality of products, the oldest approach is to test all units produced and repair the defective ones [4]. In a linear manufacturing test flow, units are put in the test queue and tested one after another. If all units pass the test, the flow is smooth. If some units fail, the failed units have to be put aside so as not to interrupt the flow. When many units fail, a bottleneck is said to have occurred: failed units pile up and not enough good units are produced. To clear the bottleneck, engineers must make the failed units pass by troubleshooting why the units failed. Troubleshooting is performed unit by unit, which is time-consuming.

If the failure data is stored in a database, data analysis can be performed before troubleshooting starts. Engineers first extract data from the database and put it into Microsoft Excel format. Data pre-processing is performed to remove noise and to make the data more understandable. Next, selected data is arranged into a tabular format in rows and columns [5]. The table usually contains many rows and columns. Engineers rely on visual inspection, coupled with some Excel shortcuts, to find how many patterns exist in the table. This is a cognitive process that requires the eye to scan the rows from top to bottom. Patterns found are moved to a second table so that the first table has fewer rows left to search. At the end of this manual and iterative process, the result shows how frequently each pattern occurred. Depending on the size of the table, the time taken to produce the result can range from a few hours to a few days. The outcome of the data analysis effectively sorts the units into groups based on the patterns discovered. Troubleshooting can then be performed group by group rather than unit by unit, and the bottleneck can be cleared faster.

Constantly changing markets demand more differentiated products within shorter delivery times, especially in industrial electronics [6]. With increasing competition in the market, organizations can improve their operations by implementing advanced manufacturing practices [7]. Data collection is carried out to compile the results of testing over different time periods, over different batches or under different test conditions. However, product engineers and test engineers typically perform only basic filtering and simple analysis on the manufacturing test data, because they do not have enough time to run the data analysis described above. Such basic analysis rarely addresses the root cause of why a test is slow. As a result, efforts to shorten the test time often yield only incremental speed improvements, and the manufacturing test runs longer than market pressure demands. Engineers also spend more time troubleshooting product quality issues because of data overload. At the macro level, manufacturing companies often face the problem of not being able to ramp up the production volume of a newly launched product as quickly as they want.

From the knowledge exploration, a suitable conceptual model that can model the historical test data needs to be chosen to serve as the foundation of the software framework. This research attempts to answer the following research questions: (1) What is a suitable data mining technique for detecting test patterns? (2) What is a suitable data mining framework for detecting test patterns? The corresponding research objectives are to identify various data mining techniques and their practical applications in industry, and to recommend a suitable data mining framework for industrial electronics manufacturing enterprises that utilizes pattern detection techniques.

Research Methodology
This research begins with an exploratory research approach to gain better insight into the problem faced by the customers in our target market. Various pattern detection techniques are investigated to discover opportunities to introduce an element of innovation that can reduce the user's pain. After suitable techniques are shortlisted, the research uses a product-oriented approach to organize the techniques so as to provide Intelli-Stat with an intelligent data mining software framework. The framework will be used by Intelli-Stat to guide the new product development effort. Secondary data (rather than primary data) is used to answer the research questions. A review of the state of the art is conducted to show how the research topic is related to the work that has gone before. The review attempts to discover explicit recommendations from previous research that can be used to guide the research objectives.

The research was carried out in five stages, conducted in the order listed in Table 1. In the preliminary study stage (stage 1), a pilot study is conducted to narrow down the scope of the research topic. A small-scale prototype in the form of a screenshot is shown to the respondents of the study as a trial run to iron out fundamental problems with the research design. This prototype contains preliminary information that is refined in the next research stage. The outcome of the pilot study is used in the knowledge acquisition stage (stage 2). Data mining methodologies are explored to understand the strengths and weaknesses of each methodology. Existing applications of data mining in related industries are also studied to see how the methodologies have been used and how Intelli-Stat can leverage the data mining models that have been demonstrated to work. A software framework based on data mining principles is proposed for the electronics manufacturing industry.

In stage 3, the framework is validated against a small set of simulated data to obtain the output. The format of the input data is based on an example of an electronics product. The format contains some structural information that gives a clue as to what the actual data would look like. Using common-sense knowledge of the characteristics of consumer electronics products, simulated data points are generated randomly to mimic common patterns in the manufacturing database of the product. As the input data is fed through the framework, a theoretical explanation illustrates the concept of operation of the framework. In stage 4, a visual prototype is designed to present the step-by-step process of presenting the output from the framework to the user. This visual prototype is more comprehensive than the small-scale prototype in stage 1. It has more screenshots, and the content of each screenshot has been refined after reviewing the current state of the art in stage 2 and stage 3. One representative customer of Intelli-Stat was interviewed to obtain feedback on the visual prototype. The representative customer validates that the prototype and the framework have the potential to fulfill their manufacturing requirements. The goal is to obtain market testing of the business idea of this research from the viewpoint of a potential customer.

Data mining is a process comprising several steps. Figure 1 shows the CRISP-DM (Cross-Industry Standard Process for Data Mining) data mining process obtained from Smart Vision Europe (2013). The CRISP-DM process provides a structured approach to planning a data mining project. China Telecom has used the CRISP-DM process to apply data mining tools to predict customer churn [8]. The first step in the process is to understand the business problem. The second step is to translate the business problem into a data mining application. In the domain of data mining, this can be translated into modeling the relationship between customer demographic profile and past spending behavior in order to predict the probability of spending. Regression can be used for predicting the dependent variable (signup rate) using the independent variables (profile, spending behavior). The independent variables are called predictor variables [9]. The third step is to prepare the data for mining. In many situations, the needed data is available but exists in different databases or in different formats. Data transformation may also need to be applied to make the data more relevant for analysis.
Data Mining Methods
Various data mining techniques and tools have been explored in order to choose a suitable technique for use in Intelli-Stat. The techniques are described below.

Association analysis: This analysis looks for groupings or patterns among a set of items. It is the discovery of association rules showing attribute-value conditions that frequently occur together in a given dataset [10]. For example, market basket analysis is a technique that generates probabilistic statements such as: if customers purchase coffee, there is a 0.2 probability that they also purchase bread. Many applications of association analysis are exploratory in nature, with a view to better understanding groupings and patterns in the data [11].

Clustering: Cluster analysis is a technique for grouping large data into smaller subsets called clusters. Automatic clustering methods are used by data miners for dimensionality reduction and cluster sampling [9]. Each cluster demonstrates some similar and predictable characteristics.

Logistic regression: Logistic regression extends the idea of multiple linear regression to the situation where the dependent variable, y, is discrete. Logistic regression estimates the posterior probability of a data point using a maximum likelihood procedure. Normally, logistic regression is applied to binary classification problems. In this case, the data used to calculate the posterior probability is assumed to obey the Bernoulli distribution, where each data point is drawn independently.

Decision tree: A decision tree has a tree-like structure, with a top-level node branching down to leaf nodes. Expanding a decision tree can result in a tree with multiple levels. The top-level node is called the root node, and the nodes between the root and the leaves are called internal nodes. A decision tree classifies data in a top-down manner, starting from the root node and moving down according to the outcome of the tests at the internal nodes, until a leaf node is reached and a class label is assigned. The simplest decision tree is a binary tree. A binary tree is a tree ordered such that each successor of a node is distinguished as either a left or a right child; no node has more than one left child or more than one right child. Otherwise it is a multiway (non-binary) tree. Many algorithms are available to construct decision trees. Common algorithms are CHAID (chi-square automatic interaction detection), the C4.5 pruning algorithm [12], C5.0 (based on Iterative Dichotomiser 3, with cross-validation and boosting capabilities) and CART (classification and regression tree).

Artificial neural networks: Artificial neural networks (ANNs) [13] try to mimic how the human nervous system works. An ANN consists of a topology and corresponding weights. The learning procedure of an ANN is to tune the weights after the topology has been designed. ANNs are considered universal approximators of data, and they are "black box" models [14]: an ANN relies heavily on the weights in its topology rather than on a structured breakdown that is easy to understand, as in the case of a decision tree. An ANN becomes more accurate with more data, as it learns and adjusts with every data point presented to it.
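The top-down classification described for decision trees can be sketched in a few lines of Python. This is an illustrative sketch only: the class names `Leaf` and `Node`, the function `classify`, the toy tree and its labels are hypothetical and do not come from the Intelli-Stat software.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    label: str  # class label assigned when this leaf is reached


@dataclass
class Node:
    test: str        # name of the attribute tested at this node
    if_pass: "Tree"  # subtree followed when the attribute is "Pass"
    if_fail: "Tree"  # subtree followed when the attribute is "Fail"


Tree = Union[Leaf, Node]


def classify(tree: Tree, record: Dict[str, str]) -> str:
    """Walk from the root down to a leaf and return the leaf's label."""
    node = tree
    while isinstance(node, Node):
        node = node.if_pass if record[node.test] == "Pass" else node.if_fail
    return node.label


# Hypothetical binary tree: the root tests "Audio", one internal node tests "Display".
toy_tree = Node(
    test="Audio",
    if_pass=Node(test="Display",
                 if_pass=Leaf("no audio or display fault"),
                 if_fail=Leaf("display fault")),
    if_fail=Leaf("audio fault"),
)

print(classify(toy_tree, {"Audio": "Pass", "Display": "Fail"}))  # display fault
```

Each node has exactly one "pass" child and one "fail" child, so the sketch is a binary tree in the sense used above.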
Table 1: Research Methodology
1. Preliminary Study
• Review literature on the problems faced by the industry and understand the current state of knowledge.
• Market and competitor analysis.
• Build a small-scale prototype.
• Conduct a pilot study based on information gathered.
2. Knowledge Acquisition
• Perform a directed search of published works, including academic journals and trade publications that discuss the theory and present empirical results in the area of data mining.
• Shortlist and compare various pattern detection methods.
• Choose an approach to build the data mining software framework.
3. Experiment
• Identify the format of data.
• Run the framework against simulated data.
4. Visual Prototype Development
• Design the screenshots of the software.
5. Data Collection
• Identify a representative sample that can represent the population.
• Conduct interviews with the identified sample.
• Refine the screenshots of the software.
• Get validation of the visual prototype from the identified sample.
Figure 1: CRISP-DM Data Mining Process
Method Selection Criteria
One major selection factor is the type of response variable involved, that is, whether the variable is categorical or continuous. If the response variable is categorical, a data mining method based on a classification modeling approach should be used. If the response variable is continuous, a regression modeling approach should be used. Table 2 summarizes the modeling approaches and is adapted from the work of Myatt and Johnson [14]. Other than the response variable type, a data mining method may be selected on the basis of time consumption, error handling, accuracy, stability, complexity, or comprehensibility. Since the results usually differ widely with regard to these characteristics, finding the perfect technique is unlikely. Previous work has shown that a single best data mining technique suitable for all datasets does not exist. In most cases today, choosing a technique is a relatively pragmatic job. In selecting which method to use, other practical considerations may need to be taken into account [15]. Practical considerations could include the availability of software and hardware at a particular point in time. When many methods are capable of mining the same data, the practitioner may choose the method that he or she deems preferable or practical.

Table 2: Summary of different data mining methods
| Method | Model Type | Independent variables | Comments |
|---|---|---|---|
| Linear regression | Regression | Any numeric | Assumes a linear relationship, easy to explain, quick to build |
| Discriminant analysis | Classification | Any numeric | Assumes the existence of mutually exclusive groups with common variances |
| Logistic regression | Classification | Any numeric | Will calculate a probability, easy to explain |
| Neural networks | Regression or classification | Any numeric | Black box model |
| Decision tree | Regression or classification | Any | Explanation of reasoning through use of a graphical top-down tree |

Intelli-Stat data mining framework
In this study, we construct a conceptual data mining framework to analyze electrical and electronics (E&E) manufacturing data, specifically for the industrial electronics sector. We propose a decision tree approach, utilizing CART to explore the engineering data and to infer the possible causes of faults in the manufacturing process. Figure 2 shows the Intelli-Stat data mining framework for industrial electronics manufacturing. Prior work that closely resembles this framework is the work by Rietman et al. [16]. The system presented by Rietman et al. produces Pareto charts, which serve a similar purpose to Intelli-Stat. The system has several yield metrics, which correspond to the yield calculator module in Figure 2. The system also has a sensitivity analysis capability, which is called the customer failure tolerance threshold in Figure 2.

Figure 2: Intelli-Stat data mining framework

To illustrate the pattern detection feature of the framework, an experiment is conducted on a hypothetical customer. Consider Nokia Communications Sdn Bhd. Nokia makes ten phone units per month, as shown in Table 3. Four tests are performed to verify that each phone is working properly, namely USB, WIFI, display and audio. If any of the tests fails, the overall status of the phone is deemed a fail. If all the tests pass, the overall status is deemed a pass. The passing rate (yield) of the phone is 50 percent (five passed out of ten units). Conversely, the failure rate is 50 percent. Nokia is encountering a bottleneck in its manufacturing because many phone units cannot be delivered to the customer. Nokia has to troubleshoot the failure reason for the defective units and fix the defects so that the bottleneck can be cleared. Nokia wants to know the reason behind the high failure rate. If there is a dominant pattern of failure, Nokia hopes that by fixing just the dominant pattern instead of all the patterns, it can achieve a quick boost of the yield from the low of 50 percent to a much higher percentage. Nokia does not have a data mining expert in house, so it relies on Intelli-Stat's fully automated data mining software to discover the patterns of failure.

Using visual inspection, it is easy to deduce from the 10 rows x 4 columns table (Table 3) that the pattern "Fail, Fail, Pass, Pass" is the most frequently occurring (dominant) pattern of failure. "Fail, Fail, Pass, Pass" occurred twice, in unit 1 and unit 2, whereas the three patterns in units 3, 4 and 5 occurred only once each. Units 6 to 10 are ignored because their overall status is pass (not fail). Intelli-Stat utilizes data mining methods as a software algorithm that can make deductions about the patterns in the same way. The first step in the data mining process is to perform data pre-processing. Intelli-Stat works by first performing data cleansing on the input data. The units that passed all tests are not analyzed, so units 6 to 10 are deleted. The cleansed data is shown in Table 4.
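The yield calculation and the data cleansing step can be sketched as follows. This is a minimal illustration, not the Intelli-Stat implementation: the dictionary layout and the function names `overall_status`, `yield_percent` and `cleanse` are assumptions made for this example. The data values are transcribed from Table 3.

```python
TESTS = ("USB", "Wifi", "Display", "Audio")

# Per-unit results in the order given by TESTS, transcribed from Table 3.
table3 = {
    "Phone Unit 1": ("Fail", "Fail", "Pass", "Pass"),
    "Phone Unit 2": ("Fail", "Fail", "Pass", "Pass"),
    "Phone Unit 3": ("Fail", "Pass", "Fail", "Fail"),
    "Phone Unit 4": ("Fail", "Fail", "Pass", "Fail"),
    "Phone Unit 5": ("Fail", "Fail", "Fail", "Pass"),
    "Phone Unit 6": ("Pass", "Pass", "Pass", "Pass"),
    "Phone Unit 7": ("Pass", "Pass", "Pass", "Pass"),
    "Phone Unit 8": ("Pass", "Pass", "Pass", "Pass"),
    "Phone Unit 9": ("Pass", "Pass", "Pass", "Pass"),
    "Phone Unit 10": ("Pass", "Pass", "Pass", "Pass"),
}


def overall_status(results):
    """A unit passes overall only if every individual test passed."""
    return "Pass" if all(r == "Pass" for r in results) else "Fail"


def yield_percent(units):
    """Passing rate: units that passed overall, as a percentage of all units."""
    passed = sum(1 for r in units.values() if overall_status(r) == "Pass")
    return 100.0 * passed / len(units)


def cleanse(units):
    """Data cleansing: keep only the units whose overall status is Fail."""
    return {name: r for name, r in units.items() if overall_status(r) == "Fail"}


table4 = cleanse(table3)
print(yield_percent(table3))  # 50.0
print(list(table4))           # Phone Unit 1 to Phone Unit 5 remain
```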
Data from the Nokia database has been transformed into useful knowledge. Table 5 shows the pattern list and statistics (knowledge) mined from Nokia's manufacturing data. Table 5 is also the key value that Intelli-Stat delivers to its customers. Table 5 shows that the dominant pattern is pattern 1. Pattern 1 accounts for 40 percent of the failures of Nokia's phone, and the pattern occurred two times. Referring to Table 3, this also means that 2 out of 10 rows are due to failure pattern 1. Recall that the current yield of Nokia is 50 percent, which corresponds to 5 failures out of 10 units. Based on Table 5, Intelli-Stat recommends that Nokia fix the dominant pattern. Nokia asks its engineers to focus their time on troubleshooting and fixing failure pattern 1. After failure pattern 1 is fixed, the number of failures will decrease from the current 5 to 3. Consequently, the number of passing units will increase from 5 to 7. This gives a quick boost to the yield from 50 percent to 70 percent (7 out of 10 units).
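The pattern statistics and the projected yield can be reproduced with a short sketch. The function names `pattern_code`, `pattern_list` and `yield_after_fix` are hypothetical, and the sketch assumes, as the text does, that fixing a pattern turns every unit with that pattern into a passing unit. The rows are the five failed units (USB, Wifi, Display, Audio).

```python
from collections import Counter

# Test results of the five failed units, in the order USB, Wifi, Display, Audio.
failed_units = [
    ("Fail", "Fail", "Pass", "Pass"),  # Phone Unit 1
    ("Fail", "Fail", "Pass", "Pass"),  # Phone Unit 2
    ("Fail", "Pass", "Fail", "Fail"),  # Phone Unit 3
    ("Fail", "Fail", "Pass", "Fail"),  # Phone Unit 4
    ("Fail", "Fail", "Fail", "Pass"),  # Phone Unit 5
]
TOTAL_UNITS = 10  # five failed units plus five units that passed every test


def pattern_code(results):
    """Encode a row such as (Fail, Fail, Pass, Pass) as the code 'FFPP'."""
    return "".join(r[0] for r in results)


def pattern_list(rows):
    """Return (code, frequency, percent of failures), most frequent first."""
    counts = Counter(pattern_code(r) for r in rows)
    total = sum(counts.values())
    return [(code, n, 100.0 * n / total) for code, n in counts.most_common()]


def yield_after_fix(rows, total_units, fixed_code):
    """Projected yield if every unit showing `fixed_code` is repaired."""
    remaining = sum(1 for r in rows if pattern_code(r) != fixed_code)
    return 100.0 * (total_units - remaining) / total_units


patterns = pattern_list(failed_units)
dominant_code = patterns[0][0]
print(patterns[0])                                                # ('FFPP', 2, 40.0)
print(yield_after_fix(failed_units, TOTAL_UNITS, dominant_code))  # 70.0
print(yield_after_fix(failed_units, TOTAL_UNITS, "FFFP"))         # 60.0
```

Fixing any of the patterns that occurred only once gives 60 percent, which is why the dominant pattern is the recommended first target.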
Table 3: Manufacturing data from Nokia database
| Unit | USB | Wifi | Display | Audio | Overall Status |
|---|---|---|---|---|---|
| Phone Unit 1 | Fail | Fail | Pass | Pass | Fail |
| Phone Unit 2 | Fail | Fail | Pass | Pass | Fail |
| Phone Unit 3 | Fail | Pass | Fail | Fail | Fail |
| Phone Unit 4 | Fail | Fail | Pass | Fail | Fail |
| Phone Unit 5 | Fail | Fail | Fail | Pass | Fail |
| Phone Unit 6 | Pass | Pass | Pass | Pass | Pass |
| Phone Unit 7 | Pass | Pass | Pass | Pass | Pass |
| Phone Unit 8 | Pass | Pass | Pass | Pass | Pass |
| Phone Unit 9 | Pass | Pass | Pass | Pass | Pass |
| Phone Unit 10 | Pass | Pass | Pass | Pass | Pass |

Table 4: Nokia database after data cleansing
| Unit | USB | Wifi | Display | Audio |
|---|---|---|---|---|
| Phone Unit 1 | Fail | Fail | Pass | Pass |
| Phone Unit 2 | Fail | Fail | Pass | Pass |
| Phone Unit 3 | Fail | Pass | Fail | Fail |
| Phone Unit 4 | Fail | Fail | Pass | Fail |
| Phone Unit 5 | Fail | Fail | Fail | Pass |

Table 5: Pattern list generated by Intelli-Stat framework
| Name | Pattern | Pattern Code | % | Frequency |
|---|---|---|---|---|
| Pattern 1 | Fail, Fail, Pass, Pass | FFPP | 40 | 2 |
| Pattern 2 | Fail, Fail, Fail, Pass | FFFP | 20 | 1 |
| Pattern 3 | Fail, Fail, Pass, Fail | FFPF | 20 | 1 |
| Pattern 4 | Fail, Pass, Fail, Fail | FPFF | 20 | 1 |
| Total | | | 100 | 5 |
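A binary pass/fail tree over the cleansed data in Table 4 can be sketched as follows. The sketch makes several assumptions that the paper does not state: the split order (Audio, then Display, then Wifi), the breadth-first node numbering with the pass branch first, and the omission of the USB test because it is Fail for every row of Table 4. It uses a fixed split order rather than CART's impurity-based split selection, so it illustrates how rows are partitioned into nodes, not the CART algorithm itself. The function name `build_nodes` is hypothetical.

```python
# Assumed split order; USB is skipped because it is Fail in every row of Table 4.
SPLIT_ORDER = ("Audio", "Display", "Wifi")

table4 = [
    {"unit": 1, "USB": "Fail", "Wifi": "Fail", "Display": "Pass", "Audio": "Pass"},
    {"unit": 2, "USB": "Fail", "Wifi": "Fail", "Display": "Pass", "Audio": "Pass"},
    {"unit": 3, "USB": "Fail", "Wifi": "Pass", "Display": "Fail", "Audio": "Fail"},
    {"unit": 4, "USB": "Fail", "Wifi": "Fail", "Display": "Pass", "Audio": "Fail"},
    {"unit": 5, "USB": "Fail", "Wifi": "Fail", "Display": "Fail", "Audio": "Pass"},
]


def build_nodes(rows, split_order):
    """Partition rows level by level.

    Node 0 is the root. The children of node i are 2*i + 1 (rows that passed
    the test for that level) and 2*i + 2 (rows that failed it).
    Returns (rows held by each node, ids of the terminal nodes).
    """
    nodes = {0: list(rows)}
    level = [0]
    for test in split_order:
        next_level = []
        for node_id in level:
            members = nodes[node_id]
            pass_id, fail_id = 2 * node_id + 1, 2 * node_id + 2
            nodes[pass_id] = [r for r in members if r[test] == "Pass"]
            nodes[fail_id] = [r for r in members if r[test] == "Fail"]
            next_level += [pass_id, fail_id]
        level = next_level
    return nodes, level


nodes, terminal_ids = build_nodes(table4, SPLIT_ORDER)
non_empty = [i for i in terminal_ids if nodes[i]]

print(len(nodes[1]), len(nodes[2]))  # 3 2
print(non_empty)                     # [8, 10, 12, 13]
for i in non_empty:
    print(i, [r["unit"] for r in nodes[i]])
```

Under these assumptions the root splits the five rows into 3 and 2, and the terminal nodes that hold data are nodes 8, 10, 12 and 13, one per failure pattern.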
Results and Discussion
Intelli-Stat constructs a decision tree based on Table 4. The type of decision tree chosen is CART (classification and regression tree), which is a binary decision tree. A binary decision tree is sufficient in this case because pass/fail is a dichotomous value. Each node in the tree has only two branches (pass or fail). For each node, the number of pass/fail occurrences (n) and the percentage of occurrences are calculated. For example, the top node (node 0) summarizes the "Audio" column in Table 4 as having 5 rows of data, of which 3 passed and 2 failed. The nodes below the top node are called child nodes. Each child node contains a subset of the data of its parent node. For example, node 1 contains 3 of the 5 rows from its parent (node 0), and node 2 contains the remaining 2 rows. Some nodes have no data (n=0) and can thus be ignored. These are nodes 7, 9, 11 and 14. The nodes of interest in this data modeling step are the terminal nodes (nodes 7 to 14). Excluding the nodes with no data, the remaining terminal nodes of interest are nodes 8, 10, 12 and 13. Each of nodes 8, 10, 12 and 13 represents one unique pattern of failure for the phone.

Without the insights provided by Intelli-Stat, Nokia may choose to fix failures at random, on a first-come, first-served basis. A random pick is likely to provide a smaller increase in the yield (less than 70 percent). For example, Nokia may choose to fix failure pattern 2 instead. Pattern 2 occurred only once. After failure pattern 2 is fixed, the number of failures will decrease from 5 to 4. Consequently, the number of passing units will increase from 5 to 6. This gives a yield of 60 percent (6 out of 10 units), which is less than the 70 percent yield achieved by fixing pattern 1. Similarly, choosing to fix pattern 3 or 4 will also result in a 60 percent yield, since patterns 3 and 4 also occurred only once.

The case study provides some empirical results on electronics manufacturing to validate the practical viability of the Intelli-Stat data mining framework when applied to mobile phones. The proposed framework combines traditional statistical methods and data mining techniques to analyze the data. Following the conceptual framework, a binary decision tree approach is proposed to address low-yield diagnosis. The result can help electronics engineers identify root causes when certain issues occur, and it gives decision makers the information they need to understand how to overcome the problem using the knowledge produced by the framework. The proposed framework can be fully automated by computer to produce the result of the analysis in a matter of minutes, whereas the same data analysis performed by a human may take a few hours to a few days. This demonstrates that an architecture based on data mining can be used as a powerful tool for emulating the cognitive process of human analysts. To apply the proposed framework to a different market segment, modification to some parts of the framework will be
needed in some cases. This is due to variations in the customer input data across different market segments. The framework assumes that the input variables and the response variable are categorical data. If the variables are continuous instead of categorical, adjustment to the decision tree would be necessary, since the decision tree proposed in the framework is optimized for binary data. An example of a modified framework for the semiconductor market segment can be found in the framework proposed by Rietman et al. [16]. Data mining and knowledge discovery from databases can be integrated into an end-user application to form an end-to-end solution.

Conclusion
Intelli-Stat is offering data mining software designed with the needs of manufacturing enterprises in mind. It turns the raw data that customers store in their databases into valuable information that can be used to speed up the time to market of their new products. The applicability of the framework with its data mining method shows that all research questions have been answered and all research objectives have been met by this research.

The first research question is answered by exploring the techniques used for pattern detection in the data mining area. Techniques such as association analysis, clustering, logistic regression, artificial neural networks and decision trees were studied to understand their capabilities and limitations. The limitation of logistic regression is that it assumes the data obeys the Bernoulli distribution, whereas the distribution of data in this research is unknown. The decision tree provides a structured breakdown consisting of a hierarchy. Patterns can be derived by tracing the path traversed in the decision tree. Each node in the decision tree can store statistical information that further describes the frequency of occurrence of a pattern. Therefore, the decision tree is selected as a suitable technique to detect test patterns in electronics manufacturing.

The second research question is answered by first exploring the existing applications of data mining in manufacturing. Emphasis is given to applications that use the decision tree as part of their framework, since the decision tree has been identified as a suitable technique under the first research objective. The objectives of the studies reviewed centered on providing insights into problematic processes, such as identifying fault-influencing factors, fault detection, fault diagnosis, finding defect patterns, finding failure patterns and finding the cause of low yield. After evaluating the design considerations of the frameworks used in those studies, the Intelli-Stat data mining framework was created to suit the purpose of detecting patterns in industrial electronics.

Acknowledgement
The authors would like to thank the Universiti Sains Malaysia for the USM Research University Grant (RUI) [Account Number: 1001/PKOMP/811251], the USM Short Term Research Grant [Account Number: 304/PKOMP/6312103], and the Collaborative Research in Engineering, Science & Technology Center (CREST) for providing funding and support for this research. Special thanks to Mr. Mohammad Ali Bagheri for his help and contribution in preparing and publishing this paper.

References
[1] Bradicich, T., & Orci, S. (2012). The Moore's Law of Big Data. Instrumentation Newsletter, Q4 2012.
[2] Adams, L. (2001). Mining the world of quality data. Quality, 40(8), 36-40.
[3] Bhardwaj, S. (2010). Addressing Information Overload on the Shop Floor. Express Intelligent Enterprise.
[4] Köksal, G., Batmaz, İ., & Testik, M. C. (2011). A review of data mining applications for quality improvement in manufacturing industry. Expert Systems with Applications, 38(10), 13448-13467.
[5] Giudici, P., & Figini, S. (2009). Applied Data Mining for Business and Industry: Wiley Publishing.
[6] Helo, P. (2004). Managing agility and productivity in the electronics industry. Industrial Management & Data Systems, 104(7), 567-577.
[7] Xue, C. G., Liu, J. J., & Cao, H. W. (2013). Research on competition diffusion of the multiple-advanced manufacturing mode in a cluster environment. The Journal of the Operational Research Society, 64(6), 864-872.
[8] Liu, L., & Ding, H. (2011, 6-8 May 2011). Modeling China Telecom customer churn prediction based on CRISP_DM. Paper presented at the E-Business and E-Government (ICEE), 2011 International Conference on.
[9] Chattamvelli, R. (2009). Data Mining Methods. Oxford, UK: Alpha Science International Ltd.
[10] Peng, J. T., Chien, C. F., & Tseng, T. L. B. (2004). Rough set theory for data mining for fault diagnosis on distribution feeder. Generation, Transmission and Distribution, IEE Proceedings, 151(6), 689-697.
[11] Liao, T. W., & Triantaphyllou, E. (2007). Recent Advances In Data Mining of Enterprise Data: Algorithms and Applications (Vol. 6). Singapore: World Scientific Publishing Co. Pte. Ltd.
[12] Quinlan, J. R. (1993). C4.5: Programs for machine learning (Vol. 1): Morgan Kaufmann.
[13] Khanna, T. (1990). Foundations of neural networks: Addison Wesley.
[14] Koh Hian Chye. (2005). Data Mining Applications for Small and Medium Enterprises. Singapore: Nanyang Technological University.
[15] Myatt, G. J., & Johnson, W. P. (2009). Making sense of data II: A practical guide to data visualization, advanced data mining methods, and applications: John Wiley & Sons.
[16] Rietman, E. A., Whitlock, S. A., Beachy, M., Roy, A., & Willingham, T. L. (2001). A system model for feedback control and analysis of yield: A multistep process model of effective gate length, poly line width, and IV parameters. Semiconductor Manufacturing, IEEE Transactions on, 14(1), 32-47.