
IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, VOL. 53, NO. 2, APRIL 2006


Data-Mining-Based System for Prediction of Water Chemistry Faults

Andrew Kusiak, Member, IEEE, and Shital Shah

Abstract—Fault monitoring and prediction is of prime importance in process industries. Faults are usually rare, and, therefore, predicting them is difficult. In this paper, a simple and robust alarm-system architecture for predicting incoming faults is proposed. The system is data driven, modular, and based on data mining of merged data sets. The system functions include data preprocessing, learning, prediction, alarm generation, and display. A hierarchical decision-making algorithm for fault prediction has been developed. The alarm system was applied for prediction and avoidance of water chemistry faults (WCFs) at two commercial power plants. The prediction module predicted WCFs (inadvertently leading to boiler shutdowns) for independent test data sets. The system is applicable for real-time monitoring of facilities with sparse historical fault data.

Index Terms—Alarm system, data merging, data mining, fault prediction, hierarchical decision making, power plant, time lag, water chemistry.

Manuscript received June 3, 2004; revised July 29, 2005. Abstract published on the Internet January 25, 2006. The authors are with the Intelligent Systems Laboratory, Department of Mechanical and Industrial Engineering, University of Iowa, Iowa City, IA 52242-1527 USA (e-mail: [email protected]). Digital Object Identifier 10.1109/TIE.2006.870706

I. INTRODUCTION

Early prediction of faults is of prime importance in electric power and other process industries. Faults are defined as adverse events (situations) with a significant impact on the system status [1], [2]. Faults may cause a meaningful shift in the system operations even when the values of the individual parameters are acceptable. Generally, any process remains normal most of the time, whereas faults usually make up a small fraction of the operating range [1]. Accurate prediction of faults is difficult due to the unbalanced volume of data for normal and faulty states. In an extreme case, no prior fault data may be available. This lack of data can be handled by merging data sets [3], [4] or knowledge [5], [6]. The data-merging approach is important in the process industry due to imprecise data, sensor failures, lack of historical data, rarity of faults, and so on.

Data mining deals with the discovery of interesting and previously unknown patterns in data sets [7]. Data mining has been applied in various domains, e.g., semiconductor manufacturing [8], [9], e-prognosis and process diagnosis [10], and information extraction from web pages [11]. Data-mining algorithms emphasize a specific data vector rather than the population of vectors, providing an avenue for customization [12]. For example, a decision-tree (DT) algorithm [13] extracts knowledge representing the underlying patterns in the data.

The knowledge extracted by a data-mining algorithm is used to predict outcomes (e.g., faults) for a data vector (a set of parameter values). The data in the process industry are noisy and temporal, and, therefore, predicting a fault based on a limited number of parameters and data points is not reliable. Making predictions based on data over a certain time lag can mitigate these shortcomings. This concept is similar to statistical process control (SPC) [14] and predictive-maintenance methodologies. An SPC system generates alarms based on the time-lag performance of the system. In this paper, data merging and learning, along with a hierarchical decision-making approach, are proposed to generate alarms.

The background of water chemistry faults (WCFs) and the fault literature are presented in Section II. A general approach for the development of fault prediction is presented in Section III. The proposed approach has been demonstrated by predicting faults due to water chemistry problems at two commercial power plants (Section IV). Conclusions and future research directions are discussed in Section V.

II. BACKGROUND

A power plant involves various systems, e.g., the coal/fuel feeder system, condenser system, turbine system, and feedwater system. The fuel (coal) is fed into the boiler by the coal feeders to generate heat that is conducted through the boiler walls to the water (boiler water system) flowing through the boiler tubes. The water becomes steam flowing to a turbine system, where the heat energy is converted into electrical energy. The used steam flows through the secondary heaters either to heat the steam flowing to the turbine or to preheat the water flowing into the boiler. From the secondary heaters, the steam is condensed in a cooling tower and converted back into water. This water is then checked for impurities before being fed (feedwater system) back into the boiler system. There is a loss of water during the cycle, which is made up with river water. The river water is initially treated to maintain acceptable levels of impurities. A fault in any of these systems may lead to temporary shutdowns and, in some cases, may have catastrophic consequences. Thus, monitoring the various systems to predict impending problems is important.

Water-treatment regimes control the steam purity, deposits, and corrosion to produce high-quality steam [15]–[17]. The water contains impurities such as dissolved gases (e.g., oxygen, carbon dioxide, and nitrogen), sand, silt, particles of organic matter, and ions of dissolved mineral impurities (e.g., bicarbonate, carbonate, manganese, sodium, silica, chloride, iron, phosphate, and sulfate) [15].


The temperature and pressure at which the boiler operates affect the behavior of the various impurities, leading to the formation of scale (calcium carbonate, magnesium hydroxide, magnesium silicate, iron oxide, calcium phosphate, magnesium phosphate, and calcium sulfate), corrosiveness, and saturation. Water impurity can lead to rapid degradation of the equipment, lower availability, inefficient operations, and other unwanted events [18].

The parameters collected for the feedwater system are dissolved oxygen, feedwater pH, oxygen scavenger concentrations, conductivity, and so on [15]. Typically, the data for the feedwater system are collected at the condensate pump discharge, condensate polisher outlet, deaerator inlet and outlet, and economizer inlet. The data points for the boiler water system are phosphate concentrations [19], silica concentrations [19], boiler water pH, and so on [15]–[17].

Water pH imbalances can severely hamper boiler operations, causing corrosion, scale, and water-tube damage [15]. Appropriate water-alkalinity levels ensure proper chemical reactions and hence normal operation of the boiler. The feedwater pH is generally maintained at slight alkalinity (pH in the range of 8.3–10), whereas the boiler water pH is more alkaline (pH in the range of 9–11) to prevent acidic outbreaks.

Oxygen reacts with magnetite (Fe3O4, a protective coating) or cuprous oxide (Cu2O) to form ferric oxide (i.e., rust) or cupric oxide. The corrosive effect of dissolved oxygen on boiler operations is undesirable [15]. At higher temperatures, nodules of corrosion products form at sites on the boiler tubes. These deposits inhibit heat transfer across the tube boundaries and reduce the efficiency of the boiler. Deaeration is a mechanical process used to remove the dissolved oxygen. It is followed by the addition of chemical oxygen scavengers such as hydrazine (N2H4 or N2H6CO), chelant, and carbohydrazide. The performance of hydrazine [15] is enhanced at increased temperature and pH, but excessive hydrazine may lead to the formation of ammonia, a by-product that aids the corrosion process.

A phosphate treatment is commonly used to remove dissolved mineral impurities [15]–[17], [19]. The phosphate ion in trisodium phosphate (Na3PO4) is extremely effective for conditioning the boiler water in an alkaline pH range. Phosphate reacts with scale-forming compounds to form a soft sludge that is easier to remove. Other treatments are chelant and polymer treatments, caustic treatments, all-volatile treatments, and oxygenated treatments [15].

The requirements for satisfactory steam purity, minimum corrosion, and minimum deposition are interlinked, forming a multiobjective function. In addition, the chemicals added to prevent WCFs may interact with each other or exhibit different properties at different operating conditions. Consequently, the development of an optimal water-treatment plan involves tradeoffs. Thus, monitoring all the water chemistry parameters [15], [20] is imperative, not only in terms of their individual operating ranges but also in terms of their interactions.

Traditionally, water samples are retrieved three to five times a day from the water systems and tested for various impurities as well as pH values [15]. This approach monitors the trends but not in real time.


For in-process intervention, continuous monitoring of the water chemistry parameters is essential. Unscheduled shutdowns are costly because productivity is lost, and, thus, a system that can predict a developing WCF in advance is needed.

Simulation [21], mathematical [22], and expert-system [23]–[25] models can be employed to monitor the water chemistry of the feedwater and boiler water systems. Nagy et al. [26] studied the primary-circuit water chemistry data at a VVER power plant. Based on the flowchart of the operations, they evaluated the formation kinetics of corrosion products and radioactive substances. A correlation analysis (correlation coefficient and function) was applied to determine the process interdependencies, whereas the random/deterministic nature of the process was evaluated by the range-per-scatter method. A gray-system modeling methodology (using differential equations) for the prediction and characterization of boiler water R values (the sodium/phosphate mole ratio of the boiler water) was developed by Li et al. [27]. The methodology can potentially produce effective system diagnosis but fails to incorporate the interaction effects.

Expert systems are widely used in monitoring WCFs [23]–[25]. These systems incorporate the domain knowledge of the possible fault scenarios and the parameter operating ranges. A diagnostic knowledge-based system, PROP, was developed for online monitoring and detection of cycle-water pollution phenomena [24]. Hardware and software architecture as well as verification and validation approaches were presented. Sopocy et al. [25] proposed a cycle chemistry and demineralizer operation advisor system with online and laboratory analyses to examine parameter trends. This system includes a modular expert-system architecture, a user interface, information transfer protocols, and sensor validation techniques. The expert system was tested on the EPRI Research Project 2712-3 data. The expert diagnostic system discussed in [23] minimized the costs related to chemical control while increasing confidence in the quality of the diagnosis of water chemistry conditions. SMART chemWORKS [21] is a computer system for monitoring WCFs in nuclear plants. It is a real-time system involving data transfer, application logic, off-site alert/alarm, and user interface modules. The application logic utilizes the client data, simulator models, a scenario library, and artificial intelligence tools. The simulators are theoretical models that can act as virtual sensors. A pattern-recognition technique compares the incoming data vectors with the defined scenario library. The library consists of upset scenarios of chemical concentrations, pH, and conductivities. These upset scenarios have either been observed or are based on domain knowledge. The system produces alarms based on the correlation of the process data with the library scenarios.

The existing expert systems are important decision-making tools. However, the process dynamics, the interactions between parameters, and multiparameter analysis are not adequately explored by these expert systems. None of the existing expert systems sufficiently addresses the scarcity of data, the time-based patterns, and the complex parameter interactions. The existing systems are not able to adjust the granularity of the inputs and predictions, which is essential for an unstructured and ill-defined process environment.


Fig. 1. Alarm system architecture.

Fig. 2. Fault and normal operating cycles.

Thus, there is a need to develop a simple but robust alarm system based on the latest developments in data mining, especially for applications with sparse or nonexistent fault history.

III. PROPOSED APPROACH

The proposed system (Fig. 1) is modular, and it consists of data preprocessing, learning, knowledge base, prediction, alarm generation, and display modules. This approach can be applied to the prediction of water chemistry and other faults.

A. Data Preprocessing

The retrieved raw data are preprocessed (see Fig. 1) for inconsistencies such as bad values, sensor errors, and sensor calibrations. The data from various facilities may not be compatible, because the sensors used could be of different types or could be calibrated differently. Thus, the heterogeneous data are transformed into a uniform format, i.e., using identical measuring units for all parameters. A data historian is scanned to identify the fault states of interest, such as WCF shutdowns and boiler shutdowns. Once such fault states are identified, data are retrieved for the prior x time intervals. The retrieved data are labeled as stable and volatile to form a decision parameter. Adding various transition stages (e.g., moderately stable, transition, and moderately volatile) to the decision parameter could further enrich the information content. Data labeling is performed by analytical approaches such as clustering and/or by domain expertise.
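The preprocessing steps described above (flagging bad readings, harmonizing units, windowing around a logged fault, and labeling a decision parameter) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation; the column names, the ppb-to-ppm conversion, and the window lengths in the usage example are assumptions, and the ten-standard-deviation threshold is borrowed from the preprocessing described in Section IV.

```python
import numpy as np
import pandas as pd


def preprocess(raw: pd.DataFrame, fault_time: pd.Timestamp,
               lookback: pd.Timedelta, volatile_window: pd.Timedelta) -> pd.DataFrame:
    """Illustrative preprocessing: mark bad readings as unknown, harmonize units,
    keep the window preceding a logged fault, and label stable/volatile periods."""
    df = raw.copy()

    # Flag implausible readings (e.g., beyond ten standard deviations) as unknown.
    for col in df.select_dtypes(include=[np.number]).columns:
        mu, sigma = df[col].mean(), df[col].std()
        df.loc[(df[col] - mu).abs() > 10 * sigma, col] = np.nan

    # Hypothetical unit harmonization: a column reported in ppb converted to ppm.
    if "dissolved_oxygen_ppb" in df.columns:
        df["dissolved_oxygen"] = df["dissolved_oxygen_ppb"] / 1000.0
        df = df.drop(columns=["dissolved_oxygen_ppb"])

    # Keep only the x-interval window preceding the logged fault (shutdown).
    df = df[(df.index >= fault_time - lookback) & (df.index <= fault_time)]

    # Label the decision parameter: volatile near the fault, stable before that.
    df["decision"] = np.where(df.index >= fault_time - volatile_window,
                              "volatile", "stable")
    return df


# Example with synthetic minute data and a hypothetical fault at the last timestamp.
idx = pd.date_range("2004-01-01", periods=240, freq="min")
raw = pd.DataFrame({"feedwater_ph": np.full(240, 9.0),
                    "dissolved_oxygen_ppb": np.full(240, 20.0)}, index=idx)
labeled = preprocess(raw, fault_time=idx[-1],
                     lookback=pd.Timedelta(hours=4),
                     volatile_window=pd.Timedelta(hours=1))
print(labeled["decision"].value_counts())
```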


The labeled data set describes a complete fault cycle. A fault cycle generally begins with a stable period (i.e., normal operations) and then transitions into volatile-period data vectors that lead to a shutdown (Fig. 2), whereas a normal cycle represents stable operating conditions from startup to shutdown, with no extreme fault (WCF) events. If multiple fault cycles are available, then each cycle forms a separate data set. For sparse fault cycles, all the data from the individual facilities are merged. A stratified sampling procedure is used to create multiple data samples.

B. Learning and Knowledge Base Modules

Multiple preprocessed data samples are provided to the learning module (Figs. 1 and 2) for knowledge discovery. A data-mining algorithm, e.g., the standard DT algorithm [13], is applied to each individual data set. The DT algorithm uses information gain to create a classifier following a depth-first strategy. In this paper, the PART decision list from WEKA (with default settings) is used. It employs the separate-and-conquer strategy to build a partial tree at each iteration and then represents the "best" leaf as an IF-THEN rule [28], e.g.,

Rule 1: IF Boiler Water Silica > 0.047971 AND Dissolved Oxygen > 0.017694 THEN Decision = Stable [41 052/1 (total number of examples represented by this rule/number of misclassified examples)].

Each rule includes a premise and a conclusion and is assigned a strength. The rule strength describes the number of observations that match the rule condition and the conclusion. The learning module extracts rule sets (knowledge bases) from the preprocessed data. These rule sets describe the patterns characterizing the stable and volatile periods (faults). The learning module is important, because the other modules depend on it.
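The paper uses the PART decision list from WEKA; as a rough stand-in, the sketch below fits a scikit-learn decision tree to a synthetic two-parameter sample and prints its branches in IF-THEN form. The feature names, thresholds, and data here are illustrative only and do not reproduce the paper's rules.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy training sample: two water chemistry parameters and a stable/volatile label.
rng = np.random.default_rng(0)
X = rng.uniform([0.0, 0.0], [0.1, 0.05], size=(500, 2))   # silica, dissolved O2
y = np.where((X[:, 0] > 0.048) & (X[:, 1] > 0.018), "stable", "volatile")

# A depth-limited tree as a stand-in for the PART decision list used in the paper.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Each root-to-leaf path reads as an IF ... THEN rule with a support count.
print(export_text(tree, feature_names=["boiler_water_silica", "dissolved_oxygen"]))
```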


Fig. 3. Learning module.

Fig. 4. Prediction module.

The quality of the knowledge bases is evaluated with a tenfold cross-validation scheme [29] and with testing. The knowledge bases are also checked for consistency with the domain logic (e.g., first-principle models).

Information is integrated across different facilities by either data or knowledge merging. With f being the number of facilities and Sf the number of data sets per facility, merging data sets from two facilities results in S1 ∗ S2 merged data sets. Thus, for two facilities with ten data samples each, a total of 100 merged data sets is formed (Fig. 3). The logic of merged data sets can easily be extended to more than two facilities. A data-mining algorithm is used to extract knowledge from each merged data set.

Each facility has its own operating zone (solution space). A solution space from one facility may contribute to or reinforce the existing properties of another facility. Merging data from different facilities enriches the solution space. For example, the overlapping space regions may be strongly reflected in the rules (higher rule strength), whereas the distinct regions may produce rules indicating the presence of distinct conditions. This type of global rule formation may provide tighter bounds on the parameters in the rules. Thus, if a region in facility 1 is represented by X > 5 and another, distinct region for facility 2 is represented by X < 8, then the actual distinct region for facility 1 would be 5 < X < 8 in the merged data set. Such information may be difficult to obtain without data merging. In addition, merging data is beneficial in areas where the fault history is sparse. Thus, the extracted rules not only represent faults but also take into account the operating conditions of different facilities.
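A small sketch of the pairwise S1 ∗ S2 merging of facility samples described above, using the standard library; the toy data frames simply stand in for the stratified samples.

```python
from itertools import product

import pandas as pd


def merge_facility_samples(samples_a: list, samples_b: list) -> list:
    """Pair every sample from facility A with every sample from facility B,
    yielding len(samples_a) * len(samples_b) merged training sets."""
    return [pd.concat([sa, sb], ignore_index=True)
            for sa, sb in product(samples_a, samples_b)]


# Ten stratified samples per facility -> 100 merged data sets, as in the paper.
# (Single-row frames stand in for real stratified samples.)
pp0_samples = [pd.DataFrame({"boiler_water_silica": [0.05], "decision": ["stable"]})
               for _ in range(10)]
pp1_samples = [pd.DataFrame({"boiler_water_silica": [0.06], "decision": ["stable"]})
               for _ in range(10)]
print(len(merge_facility_samples(pp0_samples, pp1_samples)))  # -> 100
```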

The identified operating and fault zones obtained through data merging are effective in controlling the process.

C. Prediction and Alarm Modules

The similarity among all knowledge bases is analyzed. Identical knowledge bases are merged, and the relative strength of each new knowledge base is calculated. Thus, if four knowledge bases are identical, they form a single knowledge base with a relative strength of four. The unique knowledge bases (i.e., rule sets) form the prediction module (Figs. 1 and 4).

1) Level 1 Decision Making: The preprocessed test data vector is evaluated by the different knowledge bases. Each rule in a knowledge base is applied to the test data vector to construct a binary outcome indicating the occurrence of rule violations (0 = violation and 1 = perfect match) (Table I). The aggregate totals for the stable and volatile decisions are computed. Applying a simple majority-voting scheme produces a prediction (stable or volatile) for each test data vector. Confusing signals from various rules that result in a tie are predicted as a "shift." This forms the level 1 decision making.

2) Level 2 Decision Making: The level 1 decision making generates predictions for each individual knowledge base, whereas the level 2 decision making integrates the predictions from all the knowledge bases (per facility). The relative strength of each knowledge base is multiplied by the binary outcome representing the predictions from level 1. For example, for time interval 1 in Table II, the prediction of KB1 from level 1 is stable, and, therefore, St-L1 = 1, V-L1 = 0, and S-L1 = 0.


TABLE I EXAMPLES FOR DECISION MAKING AT LEVEL 1

TABLE II EXAMPLES FOR DECISION MAKING AT LEVEL 2

The strength of KB1 is four, resulting in a weight of St-L1 for KB1 = 1 ∗ 4 = 4. The weights of the stable, volatile, and shift decisions (corresponding to the water chemistry status) are computed by summing the product of the relative strength and the binary outcome over all knowledge bases. For example, for time interval 2, the weight of the stable status is 4 (St-L2 = 1 ∗ 4 + 0 ∗ 1 + 0 ∗ 2 + 0 ∗ 2 + 0 ∗ 1), the weight of the volatile status is 6 (V-L2 = 0 ∗ 4 + 1 ∗ 1 + 1 ∗ 2 + 1 ∗ 2 + 1 ∗ 1), and the weight of the shift status is 0 (S-L2 = 0 ∗ 4 + 0 ∗ 1 + 0 ∗ 2 + 0 ∗ 2 + 0 ∗ 1).

The level 2 decision-making logic is applied to these status weights. It is based on the simple majority rule borrowed from soft computing. Priority is given to the stable state (more than half of the total decision weight), followed by the volatile state (stable weight less than half and volatile weight greater than the shift weight). This logic reflects common operational practice; other voting/performance-evaluation schemes could be employed.

At time interval 1 in Table II, all knowledge bases predicted the outcome as stable (St-L2 = 10, V-L2 = 0, and S-L2 = 0) without any rule violations, resulting in a level 2 "stable" decision. At time interval 5 (Table II), the outcome was predicted as volatile (St-L2 = 0, V-L2 = 10, and S-L2 = 0) by all knowledge bases, resulting in a level 2 "volatile" decision.
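A minimal sketch of the level 1 majority vote and the level 2 strength-weighted aggregation, reproducing the time-interval-2 example above; the knowledge bases are reduced to their level 1 predictions, and the tie-breaking priorities follow the description in the text.

```python
from collections import Counter


def level1(rule_votes: list) -> str:
    """Majority vote over per-rule outcomes ('stable'/'volatile'); ties become 'shift'."""
    counts = Counter(rule_votes)
    if counts["stable"] > counts["volatile"]:
        return "stable"
    if counts["volatile"] > counts["stable"]:
        return "volatile"
    return "shift"


def level2(kb_predictions: list, kb_strengths: list) -> str:
    """Weight each knowledge base's level 1 prediction by its relative strength."""
    weights = {"stable": 0, "volatile": 0, "shift": 0}
    for pred, strength in zip(kb_predictions, kb_strengths):
        weights[pred] += strength
    total = sum(weights.values())
    # Priority: stable if it holds more than half the weight, then volatile, else shift.
    if weights["stable"] > total / 2:
        return "stable"
    if weights["volatile"] > weights["shift"]:
        return "volatile"
    return "shift"


# Time interval 2 from Table II: one stable KB (strength 4) against four volatile KBs.
preds = ["stable", "volatile", "volatile", "volatile", "volatile"]
strengths = [4, 1, 2, 2, 1]
print(level2(preds, strengths))  # -> 'volatile' (St-L2 = 4, V-L2 = 6, S-L2 = 0)
```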

Time intervals 2 and 3 represent various scenarios leading to a level 2 volatile decision. Time interval 4 (Table II) represents a situation where neither a stable nor a volatile decision was reached (St-L2 = 0, V-L2 = 0, and S-L2 = 10), and, consequently, the conservative level 2 "shift" decision was assigned.

3) Level 3 Decision Making: The level 2 decision making provides an aggregate prediction. The temporal nature of the data is considered in the proposed system. Depending on the application type, the required sensitivity, and the response time, the z previous predictions are compared with a user-defined alarm-issuance template. If any of the patterns defined in the template matches the emerging pattern in the z previous predictions, an alarm is issued. This modified heuristic-based voting approach forms the third level of decision making (Table III). For example, if the probability of ten consecutive level 2 volatile decisions (in z = 10 previous predictions) is small, then such information can be used to label the developing situation as "extremely abnormal." Alternatively, if the required time lag is six time intervals (Table III), the level 3 logic is as follows: if the level 2 decisions were volatile for two or more out of six time intervals, the system status is labeled "abnormal" (see Table III for this and two other scenarios).
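A sketch of the level 3 time-lag check under the six-interval template of Table III: an abnormal status when at least two of the last six level 2 decisions are volatile. The treatment of shift decisions and the default status are illustrative assumptions, not the paper's exact template.

```python
from collections import deque


def level3(history: deque, window: int = 6,
           volatile_alarm: int = 2, shift_alarm: int = 6) -> str:
    """Label the system status from the last `window` level 2 decisions."""
    recent = list(history)[-window:]
    if len(recent) < window:
        return "normal"          # not enough history yet (assumed default)
    volatile = recent.count("volatile")
    shifts = recent.count("shift")
    if volatile >= volatile_alarm:
        return "abnormal"        # issue an alarm
    if shifts >= shift_alarm:
        return "abnormal"        # conservative: persistent shift decisions
    return "normal"


history = deque(maxlen=6)
for decision in ["stable", "stable", "volatile", "stable", "volatile", "stable"]:
    history.append(decision)
    status = level3(history)
print(status)  # -> 'abnormal' (two volatile decisions within the six-interval lag)
```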


TABLE III DECISION-MAKING LOGIC AT LEVEL 3

TABLE IV SCENARIOS FOR DECISION MAKING AT LEVEL 3

TABLE V PARAMETERS AND THEIR OPERATING RANGES

The decision-making logic at each level is such that, as soon as any condition is satisfied, the prediction is made and the test data vector exits the decision-making logic at that level. Some of the scenarios occurring at level 3 are illustrated in Table IV. For example, scenario 1 fails to match the first and second conditions of the level 3 logic. However, it matches the third condition. Thus, the system status at time interval 6 is "normal" (see Table IV).

The thresholds for the level 2 and level 3 decision-making logic are instances selected (by the domain experts) from many possible values. A user can alter the level 2 and level 3 decision-making logic to meet the required sensitivity. The prediction module may use knowledge bases from the same facility, from different facilities, and from merged data sets. Thus, for knowledge bases from n facilities and m knowledge bases obtained from merged data sets, the number of alarms is n ∗ m. Various schemes could be used to determine the final alarm. Alarm fusion is of particular importance in cases where historical fault data are not available.

D. Interface Module

The alarms generated by the proposed system can be displayed at various locations, e.g., the control room. Textual, graphical, local, and remote display and operating scenarios could be considered.

Three- to five-level color-coded alarms can be used to alert the operating personnel to developing situations. Recommended control actions can assist the operations personnel. The visual display could include the fault history, trends, a synoptic chart of the plant, and so on. The level of visual detail can be determined based on the application domain, the process complexity, and the user requirements. The visual displays could be enhanced by a drill-down/roll-up capability (similar to that used in data warehouses).

IV. EXPERIMENTAL RESULTS

A. Data Sets and Preprocessing Module

The WCF data used in this paper were collected at two 250-MW power plants, labeled PP0 and PP1. The continuously monitored water systems used hydrazine and phosphate treatments to reduce/maintain the water impurities. The parameters retrieved from the data historian were selected based on the literature (see Section II) and on domain experts. The parameters are measured continuously by sensors at various points of the water system. The parameters and their operating ranges used in this paper are provided in Table V. The dynamics of the process are illustrated by the graphs in Fig. 5. Most of the parameters remain in the acceptable operating range even during the WCF; it is combinations of parameters that are responsible for the WCF.
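The observation that individual readings can stay inside their operating ranges while a combination signals a fault can be illustrated with a simple range check; the ranges below are hypothetical stand-ins for Table V, which is not reproduced here.

```python
# Hypothetical operating ranges standing in for Table V (not the paper's values).
OPERATING_RANGES = {
    "feedwater_ph": (8.3, 10.0),
    "boiler_water_ph": (9.0, 11.0),
    "dissolved_oxygen": (0.0, 0.1),
    "boiler_water_silica": (0.0, 0.2),
}


def out_of_range(reading: dict) -> list:
    """Return the parameters whose values fall outside their individual ranges."""
    return [name for name, (lo, hi) in OPERATING_RANGES.items()
            if name in reading and not lo <= reading[name] <= hi]


# Every parameter is individually acceptable here, yet a mined rule over the
# combination (e.g., low silica together with low dissolved oxygen) could still
# flag the vector as volatile -- which is what the extracted rules capture.
reading = {"feedwater_ph": 9.0, "boiler_water_ph": 9.5,
           "dissolved_oxygen": 0.01, "boiler_water_silica": 0.03}
print(out_of_range(reading))  # -> [] (no single-parameter violation)
```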


Fig. 5. Visual representation of parameters.

TABLE VI TRAINING AND TEST DATA SETS FOR PP0 AND PP1

A training WCF cycle was selected based on domain expertise and on the confirmed reasons for the shutdowns at each power plant. The training data set for PP0, collected at 1-min intervals (Table VI), comprises two sections. The first section represents the stable period, whereas the second corresponds to the volatile period. A domain expert labeled this data set (PP0_train_1) into stable and volatile portions. Four different test data sets (Table VI) (PP0_test_1 to PP0_test_4) were retrieved from the historian at 5-min intervals. Each test data set represents a complete normal operating cycle.

Clustering analysis was used to label the training data set from PP1 (Table VI), collected at 1-min intervals. The k-means clustering algorithm [30] from the WEKA software, with default settings and k = 20, was applied to this data set. Of the 20 clusters, two unique clusters (clusters 5 and 20) consisting of the data vectors preceding the shutdown were identified. As the shutdown was due to a WCF (based on the shutdown log), these vectors reflect the emergence of the WCF. Thus, all the data vectors following the smallest time stamp in clusters 5 or 20 were labeled as volatile, whereas the remaining data vectors were labeled as stable. Of the 44 517 data vectors, 2396 were labeled as volatile.

The test set (Table VI) consists of two WCF-cycle data sets (PP1_test_2 and PP1_test_3) and one normal-cycle data set (PP1_test_1) for PP1. The PP1_test_3 data set consists of less than 24 h of data immediately following PP1_test_2. This may indicate that the problems due to the WCF in PP1_test_2 were not completely resolved (thus resulting in the short operating period reflected in PP1_test_3).
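A sketch of the clustering-based labeling on synthetic data, using scikit-learn's k-means in place of WEKA's implementation with k = 20. Which clusters count as pre-shutdown, and the synthetic drift injected before the (synthetic) shutdown, are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic minute-by-minute data standing in for the PP1 training cycle.
rng = np.random.default_rng(1)
n = 2000
df = pd.DataFrame({
    "boiler_water_silica": rng.normal(0.06, 0.005, n),
    "dissolved_oxygen": rng.normal(0.02, 0.002, n),
}, index=pd.date_range("2003-01-01", periods=n, freq="min"))
df.iloc[-200:] += 0.03   # drift preceding the (synthetic) shutdown

labels = KMeans(n_clusters=20, n_init=10, random_state=0).fit_predict(df)

# Suppose domain review identified these clusters as holding pre-shutdown vectors.
pre_shutdown_clusters = {labels[-1]}
first_volatile = df.index[np.isin(labels, list(pre_shutdown_clusters))].min()

# Everything from the earliest such timestamp onward is labeled volatile.
df["decision"] = np.where(df.index >= first_volatile, "volatile", "stable")
print(df["decision"].value_counts())
```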

Power plants PP0 and PP1 used similar water chemistry systems and were compatible (Fig. 1). The training and testing data sets were preprocessed in the same way. Parameter readings that were above a certain threshold (e.g., above ten standard deviations), sensor calibration readings, sensor error messages, and so on were considered unknown and were denoted as "?."

B. Learning and Knowledge Base Modules

A stratified sampling approach was used to generate sample training data sets. PP0_train_1 and PP1_train_1 were merged to form the PP0_PP1_train_1 data set. A similar stratified sampling approach was applied to the merged data set. The DT algorithm [13] was applied to the training data sets to develop 30 "digital experts" comprising all the knowledge bases (Fig. 1). The quality of each knowledge base was evaluated by tenfold cross-validation. The classification accuracy of the knowledge bases generated from the PP0, PP1, and PP0_PP1 training data sets was 99.9%, along with high sensitivity and specificity (Table VII). The true negative rate, i.e., the rate of true stable predictions, was above 99.9%, whereas the false stable predictions were below 0.43%. The rule sets capture the operating states and the parameter interactions (e.g., rule 1 represents the interaction between boiler water silica and dissolved oxygen for the stable state). The rule sets (knowledge bases) were analyzed to ensure that the rules for the dominant outcome (representing the stable process) do not suppress the rules for the minority outcome (the volatile state).
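A sketch of the tenfold cross-validation and the accuracy, sensitivity, and specificity measures used to judge each knowledge base, with a scikit-learn tree standing in for the PART learner and synthetic labels.

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
X = rng.uniform(size=(1000, 2))
y = (X[:, 0] + X[:, 1] > 1.2).astype(int)   # 1 = volatile, 0 = stable (toy labels)

tn = fp = fn = tp = 0
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for train_idx, test_idx in cv.split(X, y):
    clf = DecisionTreeClassifier(random_state=0).fit(X[train_idx], y[train_idx])
    c = confusion_matrix(y[test_idx], clf.predict(X[test_idx]), labels=[0, 1])
    tn += c[0, 0]; fp += c[0, 1]; fn += c[1, 0]; tp += c[1, 1]

accuracy = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)      # true volatile rate
specificity = tn / (tn + fp)      # true stable rate
print(f"accuracy={accuracy:.3f} sensitivity={sensitivity:.3f} specificity={specificity:.3f}")
```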


TABLE VII QUALITY MEASURES USED FOR THE EVALUATION OF THE KNOWLEDGE BASES

TABLE VIII DATA-MINING RESULTS FOR TEST DATA SETS USING THREE KNOWLEDGE BASES

Identical rules across the various knowledge bases were identified. A sample rule set is provided next.

Rule 1: IF Boiler Water Silica > 0.047971 AND Dissolved Oxygen > 0.017694 THEN Decision = Stable [41 052.85/1.0].
Rule 2: IF Hydrazine > 4.788139 AND Condenser Cation Conductivity ≤ 0.195456 THEN Decision = Stable [3265.43/0].
Rule 3: IF Feedwater pH > 8.9165 THEN Decision = Stable [28.64/0].
Rule 4: IF Dissolved Oxygen ≤ 0.039903 AND Boiler Water Silica ≤ 0.060429 THEN Decision = Volatile [601.0/0].
Rule 5: IF Boiler Water Silica ≤ 0.07628 THEN Decision = Volatile [17.0/0].
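The sample rules can be read as an ordered decision list; the sketch below encodes them directly and returns the conclusion of the first rule whose premise holds. The rule order, the default conclusion, and the parameter keys are assumptions made for illustration.

```python
# Each rule: (premise over a parameter dict, conclusion). Order matters, as in a
# PART decision list: the first matching rule decides.
RULES = [
    (lambda v: v["boiler_water_silica"] > 0.047971 and v["dissolved_oxygen"] > 0.017694, "stable"),     # Rule 1
    (lambda v: v["hydrazine"] > 4.788139 and v["condenser_cation_conductivity"] <= 0.195456, "stable"),  # Rule 2
    (lambda v: v["feedwater_ph"] > 8.9165, "stable"),                                                    # Rule 3
    (lambda v: v["dissolved_oxygen"] <= 0.039903 and v["boiler_water_silica"] <= 0.060429, "volatile"),  # Rule 4
    (lambda v: v["boiler_water_silica"] <= 0.07628, "volatile"),                                         # Rule 5
]


def classify(vector: dict, default: str = "volatile") -> str:
    """Return the conclusion of the first rule whose premise matches the vector."""
    for premise, conclusion in RULES:
        if premise(vector):
            return conclusion
    return default


vector = {"boiler_water_silica": 0.05, "dissolved_oxygen": 0.02,
          "hydrazine": 3.0, "condenser_cation_conductivity": 0.3, "feedwater_ph": 8.8}
print(classify(vector))  # -> 'stable' (Rule 1 fires)
```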

The quality of the knowledge bases (KB_PP0, KB_PP1, and KB_PP0_PP1) was further evaluated with the test data sets. As the DT algorithm requires an outcome, i.e., a decision-parameter value, for performing the testing procedure, all the test data vectors were assumed to represent a stable process. This was done to avoid preconceptions regarding the nature of the test data sets. The knowledge bases KB_PP0, KB_PP1, and KB_PP0_PP1 were tested on the PP0 test data sets (PP0_test_1 to PP0_test_4). The knowledge bases produced high testing accuracy (above 95%) and true stable conditions (Table VIII) for the PP0 test data sets. Thus, the three knowledge bases accurately confirmed the normal cycles of the PP0 test data sets. Similarly, KB_PP0 was tested on the PP1 test data sets (PP1_test_1 to PP1_test_3), resulting in high testing accuracy. However, this high accuracy fails to reflect the true conditions for the WCF-cycle test data sets (PP1_test_2 and PP1_test_3) (Table VIII). The scheme of assigning a stable decision to all test data vectors should ideally have decreased the testing accuracy for the WCF-cycle test data sets. The results indicate that the knowledge derived from PP0 is insufficient to accurately predict the dynamics of PP1.

The KB_PP1 and KB_PP0_PP1 knowledge bases tested on the PP1_test_1 data set led to high testing accuracy with low false volatile predictions, thus substantiating the normal cycle. These knowledge bases produced lower accuracy with higher false volatile predictions for PP1_test_2 and PP1_test_3 (Table VIII), as expected. The analysis of the predictions for these test data sets shows that the volatile predictions were clustered towards the end of the WCF cycle. This increased the confidence in the knowledge bases generated from the PP1 and PP0_PP1 (merged) data sets.

C. Prediction and Alarm Modules

The knowledge bases of the learning module include five to seven rules each, whereas the knowledge bases obtained from the merged data set contain ten to fifteen rules per knowledge base. These rules were coded with Excel macros. The level 1 logic is similar to that described in Section III-C. Ten knowledge bases per power plant, together with another ten knowledge bases from the merged data sets, produced a total of 30 predictions for each test data vector. The level 2 decision making consolidates the predictions made by the different types of knowledge bases, i.e., PP0, PP1, and PP0_PP1, independently. The level 2 logic is similar to the one described in Section III-C. Thus, each type of knowledge base, i.e., individual and merged data, produces predictions for each time interval. A separate alarm is generated for each type of knowledge base (i.e., KB_PP0, KB_PP1, and KB_PP0_PP1) using the level 2 decisions.


TABLE IX SUMMARY OF PREDICTIONS FOR SEVEN TEST DATA SETS

TABLE X SAMPLE ABNORMAL OPERATIONS FOR NORMAL AND WCF CYCLES

The time lag chosen for the level 3 decision making is 30 min. The alarm logic used for the PP0 test data sets is analogous to that described in Section III-C and Table III. The time lag for these data sets was set to six time intervals (here, one time interval = 5 min), i.e., an actual time lag of 30 (6 ∗ 5) min. An abnormal alarm was issued if two out of six time intervals were predicted as volatile, which is equivalent to ten (i.e., 5 ∗ 2) volatile decisions in 30 min. The decision-making logic for the PP1 test data sets (1-min intervals) is presented in Table III. The conditions for issuing an alarm are proportional to those described in Section III-C. This adjustment was necessary to compensate for the data being collected at different frequencies (1- and 5-min intervals).

The level 3 logic was implemented using an Excel macro. The main facets of the macro were the ability to modify the alarm template, storage of the system status changes, and real-time execution. These features permitted fine tuning of the alarm system as well as analysis of abnormal operating periods. The analysis of the changes in system status allowed for the prediction of abnormal operating durations.

The KB_PP0 and KB_PP0_PP1 knowledge bases were tested with the PP0_test_1 data set. They predicted only normal operations, corroborating that PP0_test_1 was a normal cycle (Table IX). However, KB_PP1 issued alarms for durations as long as 5 h. These alarms are indicative of potential WCFs of lower intensity, and corrective measures could have been taken to resolve the problem.

Another possibility is that the knowledge bases and/or the level 3 logic require additional refinement.

The data sets PP0_test_2 (Table IX) and PP0_test_4 (Tables IX and X) exhibited behavior similar to that of PP0_test_1 for all knowledge bases. However, PP0_test_3 (Table IX) generated alarms for all three knowledge bases. These alarms were primarily due to consecutive shift decisions lasting more than 30 min. This indicates that some operating situations were not represented by the knowledge bases. These situations may be normal, but, due to the conservative approach, an alarm was issued. As data on subsequent faults become available, the number of such false alarms can be minimized. The KB_PP0 and KB_PP0_PP1 knowledge bases produced fewer alarms than KB_PP1 for the PP0_test_3 data set.

The prediction module based on the KB_PP1 and KB_PP0_PP1 knowledge bases applied to the PP1_test_1 data set predicted only a few abnormal operating conditions, each lasting less than 5 h (Table IX). The alarms issued were due to potential WCFs of lower intensity, and corrective measures could have been taken to solve the emerging problem. The KB_PP0 knowledge base predicted a complete normal cycle, as no alarm was issued.

The PP1_test_2 and PP1_test_3 test data sets produced alarms for all three knowledge bases, indicative of WCF cycles (Tables IX and X). Analysis of the data showed that most alarms were issued due to volatile conditions. Furthermore, the alarms were issued mostly before the shutdowns (over a duration of 16–18 h) for PP1_test_2.


These facts strongly suggest a WCF-related shutdown, with faults initiating as early as 16–18 h before the shutdown. The PP1_test_3 data set followed a pattern similar to that of PP1_test_2. The warnings were issued 6–7 h before the shutdown. The abnormal operations constituted around a third of the total data vectors for PP1_test_3.

The prediction module was effective in identifying WCF as well as normal cycles. The performance of the prediction module is better with knowledge derived from the original facility or from combined facilities that include the original facility. This allows the knowledge bases to capture the inherent facility-specific operating dynamics. Alarms were issued due to the conservative policy of assigning abnormal operating conditions to consecutive shift decisions. This signifies that the alarm system is an evolving system that will update itself as new fault cycles become available. In addition, further investigation is needed to determine the optimal level 3 logic.

The alarm-fusion process takes into account the quality of each knowledge base based on the training accuracy, testing accuracy, and online predictions. In future research, a model (e.g., an analytic hierarchy process model) will be formulated to determine an optimal alarm-fusion policy for each facility. The prediction module produces three levels of system status (i.e., normal, abnormal, and transition), whereas the original data set had only two levels (i.e., stable and volatile). Thus, the system is capable of fine tuning the alarm type based on limited input information.

D. Display Module

The generated alarms were displayed along with the current parameter values (i.e., the test data vector), the current level 2 decision, and the z time-lag level 2 decisions. To assist the decision maker, a visual representation of the conditions required to generate an alarm was incorporated. The current system status is displayed with a three-level color-coded scheme: red signifies abnormal conditions, yellow indicates transition conditions, and green represents normal operating conditions. The interface presents the current conditions and the potential problems. Further development is needed to facilitate the interfaces between facilities, hardware and software, control modules, and so on.

V. CONCLUSION

A simple, robust, and real-time system for early prediction of faults due to unbalanced water chemistry was developed. Data-mining algorithms produced multiple easy-to-interpret rule sets, which were employed by the hierarchical decision-making algorithm to predict faults. The third level of decision making captures the temporal fault patterns, thus increasing the chances of predicting abnormal operating conditions and issuing advance warnings. The developed approach is effective despite the sparse fault data and allows for controlling the granularity of the required inputs and predictions (it accepts n inputs, i.e., decision-parameter values, and can predict m outcomes, i.e., the system status and alarms).

The modular architecture of the developed system allows incorporating alternative knowledge-generation modules, such as other data-mining approaches, simulation, analytical models, and domain knowledge.

The alarm system was successfully applied to the data from two commercial power plants to monitor WCFs. The system effectively identified normal and faulty operating conditions. Independent test data sets were used to validate the developed system. The system was able to identify abnormal states approximately 6 and 16 h prior to the actual WCF-related shutdowns for two test data sets. The decision-making algorithm included in the system can be further modified to improve the quality of the predictions. Data transformations such as sodium-to-phosphate ratios, conductivity and silica ratios, and discrete value pairs may augment the information content. Other boiler parameters, such as the heat-transfer rate of the boiler surface, boiler efficiency, cooling rate, and river-water conditions, can provide additional information regarding the water chemistry and can be included in the data-mining model. This could further enhance the alarm accuracy and the confidence in the system.

REFERENCES

[1] R. Narayanswamy, J. L. Metz, and K. M. Johnson, "Intelligent data elimination for a rare event application," in Proc. SPIE—Int. Society Optical Engineering, San Diego, CA, 1998, vol. 3460, pp. 906–917.
[2] S. Tsumoto, "Chance discovery in medicine—Detection of rare risky events in chronic diseases," New Gener. Comput., vol. 21, no. 2, pp. 135–147, Feb. 2003.
[3] Z. Zhang, J. Salerno, M. Regan, and D. Cutler, "Using data mining techniques for building fusion models," in Proc. SPIE—Int. Society Optical Engineering, Orlando, FL, 2003, vol. 5098, pp. 174–184.
[4] D. E. Brown, "Data mining, data fusion, and the future of systems engineering," in Proc. IEEE Int. Conf. Systems, Man and Cybernetics, Hammamet, Tunisia, 2002, vol. 1, pp. 26–30.
[5] V. S. Subrahmanian, "Amalgamating knowledge bases," ACM Trans. Database Syst., vol. 19, no. 2, pp. 291–331, Jun. 1994.
[6] R. R. Yager, "General approach to the fusion of imprecise information," Int. J. Intell. Syst., vol. 12, no. 1, pp. 1–29, Jan. 1997.
[7] U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, Advances in Knowledge Discovery and Data Mining. Cambridge, MA: MIT Press, 1996.
[8] A. Kusiak, "Decomposition in data mining: An industrial case study," IEEE Trans. Electron. Packag. Manuf., vol. 23, no. 4, pp. 345–353, Oct. 2000.
[9] A. Kusiak, "Rough set theory: A data mining tool for semiconductor manufacturing," IEEE Trans. Electron. Packag. Manuf., vol. 24, no. 1, pp. 44–50, Jan. 2001.
[10] H. Bae, S. Kim, Y. Kim, M. H. Lee, and K. B. Woo, "e-Prognosis and diagnosis for process management using data mining and artificial intelligence," in Proc. Industrial Electronics Conf. (IECON), Roanoke, VA, 2003, vol. 3, pp. 2537–2542.
[11] H. Seo, J. Yang, and J. Choi, "Building intelligent systems for mining information extraction rules from web pages by using domain knowledge," in Proc. IEEE Int. Symp. Industrial Electronics, Pusan, Korea, 2001, vol. 1, pp. 322–327.
[12] A. Kusiak, J. A. Kern, K. H. Kernstine, and T. L. Tseng, "Autonomous decision-making: A data mining approach," IEEE Trans. Inf. Technol. Biomed., vol. 4, no. 4, pp. 274–284, Dec. 2000.
[13] J. R. Quinlan, C4.5: Programs for Machine Learning. San Mateo, CA: Morgan Kaufmann, 1992.
[14] D. A. Leonard, Statistical Process Control. New York: Industrial Press, 1996.
[15] B. Buecker, Power Plant Water Chemistry: A Practical Guide. Tulsa, OK: PennWell Corp., 1997.
[16] R. P. Charles, Internal Water Treatment for Industrial Boilers. Accessed Dec. 3, 2003. [Online]. Available: http://www.awt.org/members/publications/analyst/2001/winter/internal%20water.htm
[17] Boiler and Heat Exchange Systems, Inc., Boiler Water Chemistry. Accessed Dec. 3, 2003. [Online]. Available: http://www.bhes.com/frbb4waterchemistry.htm


[18] R. Papilla, P. G. Whitten, and G. Essig, "Design and implementation of a continuous water chemistry monitoring and control system," in Recent Innovations and Experience With Plant Monitoring and Utility Operations, American Society of Mechanical Engineers, Power Division (Publication) PWR, vol. 15, pp. 41–44, 1991.
[19] Y. Li, Y. Muo, and H. Xie, "Simultaneous determination of silicate and phosphate in boiler water at power plants based on series flow cells by using flow injection spectrophotometry," Anal. Chim. Acta, vol. 455, no. 2, pp. 315–325, 2002.
[20] B. Buecker, "On-line water-chemistry monitoring: A valuable tool to prevent boiler corrosion," Corros. Prev. Control, vol. 41, no. 4, pp. 84–88, 1994.
[21] C. R. Powicki, "Water chemistry off-site and on-line," EPRI J., vol. 24, no. 4, pp. 24–31, 1999.
[22] O. V. Endrukhina and V. N. Voronov, "Mathematical modeling of the water chemistry of a thermal power station under unsteady conditions," Teploenergetika, no. 7, pp. 63–66, 2003.
[23] M. N. Shvedova, V. G. Kristi, S. V. Zakharova, F. V. Nikolaev, and V. B. Benediktov, "Expert system for diagnostics and status monitoring of NPP water chemistry condition," in Proc. Int. Conf. Nuclear Engineering (ICONE), Arlington, VA, 2002, vol. 4, pp. 93–98.
[24] M. Gallanti, L. Tomada, and R. Tarli, "Intelligent on-line system for cycle chemistry diagnostics in power plants," in Proc. Amer. Power Conf., 1990, vol. 52, pp. 356–362.
[25] D. M. Sopocy, J. A. Montanus, O. Jonas, J. K. Rice, A. Agogino, B. Dooley, and D. S. Murthy, "Development of an on-line expert system: Cycle chemistry and demineralizer operation advisor," ASTM Special Tech. Publ. 1102, pp. 52–65, 1991.
[26] G. Nagy, P. Tilky, A. Horvath, T. Pinter, and R. Schiller, "Kinetic and statistical analysis of primary circuit water chemistry data in a VVER power plant," Nucl. Technol., vol. 136, no. 3, pp. 331–341, 2001.
[27] Y. C. Li, F. Zhang, C. Z. Yang, and W. H. Bai, "The use of 'grey system' analysis methodology for automated boiler water chemistry control in electric power plants," Anti-Corros. Methods Mater., vol. 48, no. 2, pp. 96–98, 2001.
[28] E. Frank and I. H. Witten, "Generating accurate rule sets without global optimization," in Proc. 15th Int. Conf. Machine Learning, J. Shavlik, Ed., Madison, WI, 1998, pp. 144–151.
[29] M. Stone, "Cross-validatory choice and assessment of statistical predictions," J. R. Stat. Soc., vol. 36, no. 2, pp. 111–147, 1974.
[30] I. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques With Java Implementations. San Francisco, CA: Morgan Kaufmann, 2000.


Andrew Kusiak (M’89) is a Professor in the Department of Mechanical and Industrial Engineering, University of Iowa, Iowa City. He has published numerous books and technical papers in journals sponsored by professional societies such as AAAI, ASME, IEEE, IIE, ESOR, IFIP, IFAC, INFORMS, ISPE, and SME. He speaks frequently at international meetings and conducts professional seminars. He is also a Consultant for industrial corporations. His interests are in the applications of computational intelligence in automation, manufacturing, product development, and healthcare. Dr. Kusiak is an IIE Fellow. He serves on the editorial boards of numerous journals, edits book series, and is the Editor-in-Chief of the Journal of Intelligent Manufacturing.

Shital Shah received the B.E. degree in production engineering from the Veermata Jijabai Technological Institute, Bombay, India, in 1997, and the M.S. degree in industrial engineering from the University of Alabama, Tuscaloosa, in 2001. He is currently working toward the Ph.D. degree in industrial engineering at the University of Iowa, Iowa City. His interests are in computational intelligence, data mining, informatics, operations research, simulation, and decision support systems. He is in the process of publishing various data-mining and informatics-related research papers. Mr. Shah is a member of Alpha Pi Mu and IIE.
