Autonomous Mining for Alarm Correlation Patterns based on Time-shift Similarity Clustering in Manufacturing System
Yan Chen
Jay Lee
University of Cincinnati NSF Center for Intelligent Maintenance Systems Cincinnati, Ohio, United States of America
[email protected]
University of Cincinnati NSF Center for Intelligent Maintenance Systems Cincinnati, Ohio, United States of America
[email protected]
Abstract— Current alarm systems employed in manufacturing applications are ambiguous in indicating the root causes of process disturbances, which makes decision making difficult. As manufacturing systems increase in complexity and scale, continued reliance on human operators for alarm information management is impossible. A computer-aided information management system would increase the analytical capability for alarm analysis. To this end, an autonomous data mining method is introduced that searches historical alarm logs for correlations that can represent causal relationships, which can aid alarm management and system improvement. A hierarchical clustering method is used to carry out the correlation pattern search, and its similarity function is designed to identify certain pre-defined correlation patterns. The method is validated on the alarm system of a vertical turning machine center. The proposed method can discover a large number of alarm correlations that are usually neglected by operators, and it manages the alarms in a way that clarifies process disturbances and enables rapid root cause analysis.
Keywords-manufacturing process, alarm correlation patterns, diagnosis, similarity-based clustering, frequent pattern mining, machine tool
I. INTRODUCTION
To advance the automation of manufacturing process control, there is a growing need to automate process anomaly management [1]. Traditionally, when an anomaly is encountered, the controller automatically raises warnings or alarms. These provide immediate information so that the operator can quickly evaluate what is occurring and take appropriate action to remedy the anomaly. In most manufacturing systems, hundreds of sensors, actuators or switches are connected to a programmable logic controller (PLC), and some alarms are fired when a measured variable passes a threshold. They are passed directly to the controller and displayed to operators as alarm messages. The PLC also monitors system status variables through simple logic calculations and raises alarms when they change. The
978-1-4244-9827-7/11/$26.00 ©2011 IEEE
function of all alarms is to help correct human mistakes and careless operations during the manufacturing process. However, as system complexity and scale increase, systematic analysis is needed to manage the alarm information described above. First, since a large share of fired alarms do not indicate a critical anomaly but only monitor process status, operators need a clearer ranking of alarm significance in order to prioritize alarm messages, make quick decisions and focus on the hidden root causes of anomalies. Second, many status-monitoring alarms provide boundaries that define the critical anomaly, though they do not pinpoint problems directly. If operators can recognize the inherent correlation between these alarms, the clarified alarm messages will make diagnosis decisions more accurate and timely. Third, in some complex systems, time-delayed failure propagation will continually cause 'phantom alarms' downstream, even if the upstream variables have been restored and the alarms have been deactivated [2]. Operators' quick identification of the root alarms will therefore reduce the confusion caused by phantom alarms. All of the above situations require more analytical capability in alarm management. Total reliance on human operators for alarm information management has proved to be very difficult [3]. Therefore, it is widely recognized that autonomous analytical solutions should be developed to support operators in responding quickly and fusing accumulated alarm information [4]. This paper targets manufacturing systems and investigates how data-mining-oriented alarm correlation analysis can manage overwhelming alarm information by adding analytical capability to the system in order to assist operators' decision making. The paper is organized as follows. In Section II, selected alarm management methods are reviewed and compared with the proposed method.
In Section III, we first give a step-by-step overview of the proposed approach and then define the basic alarm occurrence patterns that the paper focuses on. We then give the definitions of the alarm model used during the frequent pattern search, such as alarm events, occurrence variables and matrices. In Section IV, the two steps of the frequent alarm pattern search are introduced in detail. The proposed search method converts the problem of frequent pattern search into multi-dimensional vector clustering with a defined dissimilarity threshold. In Section V, the proposed method is applied to a 40-day alarm log of a large-scale vertical turning center with a Siemens 840D controller. Finally, conclusions are drawn and future work is discussed in Section VI.
II. RELATED WORK AND PROBLEM DEFINITION
A large number of projects and studies have been carried out to improve alarm systems. Besides early studies of the impact of human factors on alarm systems [5], some researchers [6] investigated alarm system problems and summarized fault management as a cognitive activity. Other research, from an engineering perspective, recognized the need to separate real alarm signals from other types of abnormal events associated with operation modes and status [7]. After this separation, the remaining alarm signals should be the more critical messages that describe direct disturbances of the process. In addition, much research and many successful applications have been developed in the field of alarm event analysis for fault detection and diagnosis. The state-of-the-art studies [8-9] categorize advanced alarm message processing methods into nine categories. For example, causality-based methods use alarms to discover how a fault condition propagates through the process; hierarchy-based methods differentiate alarms by level, such as plant, system and equipment level. The study [9] introduced several categories of alarm correlation systems. In that study, alarm correlation is defined as either a reciprocal relation among alarm messages or an evolved process in which they have parallelisms. From a data mining point of view, the reciprocal relations and parallelisms of alarm occurrences can also be considered sequential patterns. Frequent sequential pattern search, or association rule search, is a data mining task that has been studied in different areas with a wide diversity of data types [10-12]. For example, in telecommunication services and computer networks, alarms are usually defined by MIB variables from network monitoring, and association-rule-based data mining methods are used to discover alarm correlations among large volumes of event messages in order to diagnose and localize network overload and duplicate-IP failures [13-15].
In fact, compared with expensive physical-model-based alarm event analysis, data mining techniques for most complex manufacturing systems provide offline analysis based on real-world practice instead of design logic. They can give insight into the current alarming mechanism and optimize it by reducing its ambiguity. The study [16] proposed a fuzzy clustering and ranking algorithm that decreases the quantity of alarms by clustering the measured variables of alarms and selecting the primary alarm in each cluster. Instead of the measured variables, in this paper we cluster the alarm information directly; the correlations between different alarms are targeted as the significant patterns used to sort the alarm information. A time-shift similarity based clustering method is proposed to discover defined sequential patterns of multiple alarms. These patterns reveal connections between machine element statuses, alarm redundancy, and normal/abnormal operation procedures. Some patterns are themselves indications of potential failure. Some indicate a possibility to improve the alarming mechanism. Others can be used to monitor PLC sensor failure. Although in traditional data mining tasks and applications most association rules are for discrete event data and the mining proceeds transaction by transaction on the database, there are also alternative methods that mine numerical attributes and use other database formats. For instance, the study in [17] proposed a system that mines association rules based on a two-dimensional attribute matrix and clusters the mined quantitative association rules for brief representation. Similarly, in this paper alarm correlation mining is designed on matrices with one dimension of binned time intervals and the other of attribute indices. This design is more suitable for manufacturing systems, which have no discrete transactions during continuous scheduled production.
III. AUTONOMOUS MINING BASED ON TIME-SHIFT SIMILARITY CLUSTERING
In this study, every occurrence of any alarm is treated as an event. Consecutively observed events of the same alarm constitute a time series variable. If events of different alarms are observed at the same points in time, they generate a time series cross-sectional (TSCS) data matrix. Since alarm correlation is interpreted as sequential patterns, the mining of frequent sequential patterns is implemented on the TSCS matrix step by step.
Figure 1. Proposed method
As Figure 1 shows, the proposed method searches for frequent patterns in the following six steps:
1. Convert the alarm log into TSCS format as a Multiple Alarm Matrix (MAM, Definition 3 in the next section) with defined time scales and alarm indices.
2. Conduct hierarchical clustering on the MAM to group alarms with similar occurrence patterns.
3. Select qualified clusters as recognized candidates for frequent patterns. During clustering, the alarms that possess the greatest similarity are merged into a single cluster. This cluster forms the sequential pattern of these alarms, which has a higher occurrence frequency than other possible patterns formed by randomly selected alarms. If a distance threshold is then defined in terms of the dissimilarity, the qualifying clusters are selected as candidates for frequent patterns.
4. Validate the candidate patterns by combining them with engineering knowledge, if there is any.
5. Conduct the search for conditional alarms of every pattern selected in step 3.
6. Validate the conditional alarm candidates.
In the next section, the related definitions are introduced in detail, together with the modified hierarchical clustering method.
A. Alarm correlation patterns
Three typical alarm correlation patterns (shown in Figure 2) are the focus of this study. They are denoted X, Y and Z in the chart and are generated by three types of alarm, whose events are drawn as a filled circle (B), a blank circle (A) and a ring (C), respectively. The correlations of the three patterns are ordered chronologically, as shown in the second row of the chart. For the parallel concurrent pattern X, once A (B) happens, B (A) happens at the same time. The delay between the two occurrences is zero, so the time series of the two alarms' occurrences should be linearly correlated. For the serial pattern Y, once A occurs, B happens k time steps later. Hence, when one of the two occurrence sequences is shifted along the time axis, there is one shift step k that makes the two shifted time series strongly linearly correlated. In the non-serial and non-parallel pattern Z, the conditional alarm C precedes the occurrence of B without a time constraint and only creates the condition for the serial occurrence of alarms A and B.
In the serial pattern, the time delay between alarms is assumed to be invariant with time. This assumption is more acceptable in manufacturing systems, which have less human interruption and uncertainty than social systems dominated by human behavior. For example, in a paper converting process, the alarm sensors are usually distributed along the production line at multiple, specific locations. Defects caused by unobservable process disturbances can generate propagated alarms at several locations. The correlation between the multiple occurrences of different alarms can characterize features of a disturbance. Since the alarm sensor locations are fixed, if the alarms were caused by the same disturbance, the time delay between their occurrences should reflect only their locations and the converting speed, which are time invariant to some degree.
Figure 2. Three correlation patterns of alarm occurrences
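The time-shift idea behind patterns X and Y can be illustrated with a minimal Python sketch. It assumes binary occurrence series and pads a shifted series with zeros; these choices, the function names `pearson`, `shift` and `best_shift`, and the sample series are illustrative assumptions, not part of the proposed method.

```python
from math import sqrt


def pearson(x, y):
    """Linear correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def shift(x, k):
    """Delay x by k steps (k > 0) or advance it (k < 0); zero padding is an assumption."""
    n = len(x)
    if k >= 0:
        return [0] * min(k, n) + list(x[: max(n - k, 0)])
    return list(x[-k:]) + [0] * min(-k, n)


def best_shift(a, b, max_shift):
    """Return (k, r): the shift of a, with |k| <= max_shift, that best correlates with b."""
    candidates = ((k, pearson(shift(a, k), b)) for k in range(-max_shift, max_shift + 1))
    return max(candidates, key=lambda kr: kr[1])


# Hypothetical occurrence series: alarm B repeats alarm A two time steps later (pattern Y).
alarm_a = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
alarm_b = [0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0]
print(best_shift(alarm_a, alarm_b, 3))  # k = 2, r = 1.0
print(best_shift(alarm_a, alarm_a, 3))  # pattern X: k = 0, r = 1.0
```

A concurrent pair (pattern X) is recovered at zero shift, and a serial pair (pattern Y) at the shift equal to its delay.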
B. Related definitions
In this paper, the alarm model for a manufacturing system is defined by the following definitions:
Definition 1: An alarm event is denoted $e_{i,t_j}$, where $i$ is the alarm index and $t_j$ is the time. $t_j$ belongs to a defined time window $T_w$ that starts at $T_s$ and ends at $T_e$. The window can be divided into uniform time intervals $(t_1, t_2, \ldots, t_j, \ldots, t_n)$.
In [10], frequent pattern mining is realized by discovering frequent item-sets in a transaction database. A k-item set $\alpha$, drawn from the set of all possible items, is frequent if it occurs more than $\theta|D|$ times in a database $D$, where $\theta$ is a user-defined minimum support threshold and $|D|$ is the total number of transactions in $D$. In most manufacturing systems, there is no apparent transaction information. Alarms occur because of the violation of machining requirements or the detection of a process disturbance. It is therefore more natural to format alarm events as time series. The time information creates an extra dimension in the database for pattern mining.
Definition 2: An Alarm Occurrence Variable (AOV) is a time series $x_i = (e_{i,t_1}, e_{i,t_2}, \ldots, e_{i,t_n})$; it expresses the event counts of alarm No. $i$ over the uniform time intervals $(t_1, t_2, \ldots, t_n)$.
For example, during 6 hours three alarms occur as Table I shows. The AOV of alarm No. 1 can therefore be denoted as $x_1 = (0, 0, 1, 1, 1, 1)$.
TABLE I
ALARM COUNT TABLE
Alarm counts   1h   2h   3h   4h   5h   6h
#1 alarm       0    0    1    1    1    1
#2 alarm       0    0    1    1    0    1
#3 alarm       1    0    0    0    0    0
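Definition 3: A Multiple Alarms Matrix (MAM) is a time series cross-section data panel. It includes the AOVs of different alarms along the same time scale $(t_1, t_2, \ldots, t_n)$. It is denoted as:
$$\mathrm{MAM} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_h \end{bmatrix} = \begin{bmatrix} e_{1,t_1} & e_{1,t_2} & \cdots & e_{1,t_n} \\ e_{2,t_1} & e_{2,t_2} & \cdots & e_{2,t_n} \\ \vdots & \vdots & \ddots & \vdots \\ e_{h,t_1} & e_{h,t_2} & \cdots & e_{h,t_n} \end{bmatrix}$$
where $h$ is the number of alarms.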
For example, one part of the converting production line includes 18 types of alarms. The collected alarm event log has a time window of 1435 minutes. The AOVs of the 18 alarms can be organized as an MAM and plotted as the spectrum shown in Figure 3. The X axis is the time scale (15 minutes per index/interval) and the Y axis is the index of the alarm types. A filled unit in the plot is the count of the corresponding alarm events.
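As an illustration of Definitions 2 and 3, the sketch below bins a raw alarm log into an MAM. The log format (pairs of alarm index and time in minutes), the function name `build_mam` and the sample timestamps are assumptions, chosen so that the result reproduces Table I.

```python
def build_mam(log, alarm_ids, t_start, t_end, bin_width):
    """Bin (alarm_id, time) events into a count matrix with one row (AOV) per alarm."""
    n_bins = -(-(t_end - t_start) // bin_width)  # ceiling division
    row_of = {alarm: i for i, alarm in enumerate(alarm_ids)}
    mam = [[0] * n_bins for _ in alarm_ids]
    for alarm, t in log:
        # Events of unknown alarms or outside the time window are ignored.
        if alarm in row_of and t_start <= t < t_end:
            mam[row_of[alarm]][int((t - t_start) // bin_width)] += 1
    return mam


# Hypothetical log over a 6-hour window; times are minutes from the window start.
log = [(3, 30), (1, 130), (2, 125), (1, 190), (2, 200), (1, 250), (1, 310), (2, 330)]
for row in build_mam(log, alarm_ids=[1, 2, 3], t_start=0, t_end=360, bin_width=60):
    print(row)
# [0, 0, 1, 1, 1, 1]
# [0, 0, 1, 1, 0, 1]
# [1, 0, 0, 0, 0, 0]
```

With the 1435-minute window and 15-minute intervals of the example above, the same function yields 96 time intervals per alarm.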
Figure 3. Alarm TSCS panel (X axis: time, 15 min/index; Y axis: alarm index)
As mentioned above, in a transaction database without a time dimension, traditional association rule mining uses the frequency of candidate patterns with k items to decide whether a correlation may exist among them. However, if the occurrence of every alarm is scaled into the time dimension, as shown in Definition 3, the correlation of k items can be represented by the corresponding k time variables. In a two-dimensional MAM, the k-item pattern is stretched over time; repetition of the pattern is reflected by the linear correlation of the related time variables. Therefore, under the new organization of the database, association rules are discovered by mining frequent patterns in the MAM.
Definition 4: Let $I = (a_1, a_2, \ldots, a_n)$ be a set of concerned alarm items. A k-item set $\alpha$, which consists of k alarm items, is frequent if the dissimilarity of their AOVs is less than $\lambda$, where $\lambda$ is a coefficient between 0 and 1.
Because of the time dimension introduced in the database (MAM), we use the dissimilarity of two or a group of alarm variables (AOVs), instead of a frequency calculation, to establish the presence of a frequent pattern. In other words, if a group of AOVs is similar enough, we assume that the occurrences of these alarms display some pattern. Their relationship defines a sequential occurrence rule.
IV. FREQUENT PATTERN SEARCH BY HIERARCHICAL CLUSTERING
In the last section, the occurrence of every alarm was represented as a variable, and the definition of frequent patterns was transferred to the dissimilarity of alarm variables. This section introduces the proposed algorithm for frequent alarm pattern search using hierarchical clustering.
A. Hierarchical Clustering:
Clustering is a method of organizing a set of observations into subsets so that the observations in the same cluster are similar to a certain degree. The popular agglomerative hierarchical algorithm accomplishes the clustering task in three steps [18].
1. First, from the input X, the clustering $R_0$ is initialized by assigning each item to its own cluster. If there are N items, the first level consists of N clusters, each containing a single element.
2. Second, a new clustering $R_t$ is produced by finding the closest (most similar) pair of clusters in $R_{t-1}$ and merging them into a single cluster. $R_t$ thus contains $N-t$ clusters, such that each cluster in $R_{t-1}$ is a subset of a cluster in $R_t$. Specifically, in this study, among all possible pairs of clusters $(C_i, C_j)$ in $R_{t-1}$, the newly merged cluster is formed from the pair $(C_r, C_s)$ whose dissimilarity function $g$ satisfies:
$$g(C_r, C_s) = \min_{i,j} g(C_i, C_j) \qquad (1)$$
3. Last, repeat the second step until $N-t=1$.
Hierarchical clustering produces a hierarchy of clusters, from the single-object clusters at the bottom to the single group at the top that contains all elements of the input X, as Figure 4 shows.
Figure 4. Hierarchical Clustering
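The three steps above can be sketched in a few lines of Python. The sketch assumes a precomputed symmetric dissimilarity matrix and average linkage between clusters, and it stops merging once the closest pair reaches the threshold $\lambda$ of Definition 4 instead of building the full tree; the function name `average_linkage_clusters` and the sample matrix are hypothetical.

```python
def average_linkage_clusters(dist, threshold):
    """Agglomerative clustering on a symmetric dissimilarity matrix.

    Starts from single-item clusters, repeatedly merges the closest pair
    (smallest average pairwise dissimilarity) and stops when that value is
    no longer below `threshold`. Returns clusters as sorted index lists.
    """
    clusters = [[i] for i in range(len(dist))]  # R_0: one item per cluster

    def g(ca, cb):  # average linkage between two clusters
        return sum(dist[i][j] for i in ca for j in cb) / (len(ca) * len(cb))

    while len(clusters) > 1:
        d, a, b = min(
            (g(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if d >= threshold:
            break
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return sorted(clusters)


# Hypothetical dissimilarities between four alarms.
dist = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.85, 0.9],
    [0.9, 0.85, 0.0, 0.3],
    [0.8, 0.9, 0.3, 0.0],
]
print(average_linkage_clusters(dist, threshold=0.4))  # [[0, 1], [2, 3]]
print(average_linkage_clusters(dist, threshold=0.2))  # [[0, 1], [2], [3]]
```

Because merge heights under average linkage do not decrease from one step to the next, stopping at the threshold gives the same groups as building the full hierarchy and cutting it at that height. Clusters with more than one member are the frequent pattern candidates.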
B. Determination of similarity functions:
In the hierarchical algorithm, two dissimilarity functions determine how a new cluster is merged based on proximity measurements.
The first dissimilarity function is used to calculate the "distance" between any two single elements of the input X. Different metrics can be used, such as the Euclidean distance, the Mahalanobis distance and so on. Here, the linear correlation coefficient is used to measure the dissimilarity between any two AOVs. Hence, the distance value is 0 for perfectly correlated AOVs and 1 for uncorrelated ones. Moreover, if the occurrences of two alarms follow a frequent pattern, their AOVs should be linearly correlated.
Let any two AOVs be expressed as the vectors $x_i = (e_{i,t_1}, e_{i,t_2}, \ldots, e_{i,t_n})$ and $x_j = (e_{j,t_1}, e_{j,t_2}, \ldots, e_{j,t_n})$, with mean values $\bar{x}_i$ and $\bar{x}_j$. The distance (dissimilarity) can be calculated from the linear correlation coefficient as:
$$d(x_i, x_j) = 1 - \rho(x_i, x_j), \quad \rho(x_i, x_j) = \frac{\sum_{l=1}^{n} (e_{i,t_l} - \bar{x}_i)(e_{j,t_l} - \bar{x}_j)}{\sqrt{\sum_{l=1}^{n} (e_{i,t_l} - \bar{x}_i)^2 \sum_{l=1}^{n} (e_{j,t_l} - \bar{x}_j)^2}} \qquad (2)$$
The dissimilarity calculated from the correlation coefficient can recognize any pair of alarms that occur concurrently, like the parallel pattern X in Figure 2. For pairs of alarms that occur in serial sequence, such as pattern Y, however, directly using their correlation coefficient would give a distance value larger than it should be, because of the time delay between their occurrence time series. The right distance value can be found only if the time delay is compensated. In fact, the calculation of the dissimilarity becomes a local search for the minimum distance value among a group of candidates. For a pair of AOVs $(x_i, x_j)$, first shift one vector along the time scale step by step, calculate their distance at every step during shifting, and then find the minimum value among all candidates. So if:
$$x_j^{(k)} = (e_{j,t_{1+k}}, e_{j,t_{2+k}}, \ldots, e_{j,t_{n+k}}), \quad |k| \le r \qquad (3)$$
where $k$ is the shifting step parameter and $r$ is the shifting range, based on the maximum time between any neighboring events of the shifted alarm, then equation (2) can be improved as:
$$d(x_i, x_j) = \min_{|k| \le r} \left[ 1 - \rho\left(x_i, x_j^{(k)}\right) \right] \qquad (4)$$
The second dissimilarity function of hierarchical clustering measures the distance between a point and a set, or between two sets, as follows:
$$g(C_p, C_s) = \frac{1}{|C_p||C_s|} \sum_{x_i \in C_p} \sum_{x_j \in C_s} d(x_i, x_j) \qquad (5)$$
Here, $C_p$ is the new cluster merged from $(C_{pi}, C_{pj})$ in the last iteration, $C_s$ is an old, unaffected cluster, and the dissimilarity is calculated as the average linkage.
The dissimilarity functions favoring the X and Y patterns in Figure 2 have now been defined. In every iteration step of hierarchical clustering, the alarms that have correlations like X and Y can be recognized and grouped together. For the conditional correlation pattern Z in Figure 2, alarm A precedes B on the condition that C occurs before B, with no time constraint on C. The linear correlation coefficient therefore cannot pinpoint the relationship between C and A or B, and pattern Z needs to be discovered in a separate search. In pattern Z, alarms A and B belong to a serial pattern Y, which can be discovered by the previous steps, so in the second search the most probable conditional alarm for the already formed pattern Y needs to be found.
First, for every serial pattern Y, one of the alarms is selected as the result alarm. For example, assume that the AOV $x_r$ belongs to a serial pattern Y and is the result alarm: $x_r = (e_{r,t_1}, e_{r,t_2}, \ldots, e_{r,t_n})$, with time scale $T_r: (t_1, t_2, \ldots, t_n)$.
Second, find the non-zero values in $x_r$ and generate a new vector from them. The new vector can be expressed as $x_r^{non} = (e_{r,v_1}, \ldots, e_{r,v_k}, \ldots, e_{r,v_m})$, where $t_1 \le v_k \le t_n$ and $m$ is the number of non-zero values in $x_r$. Based on $x_r^{non}$, a new time scale can then be created as $T_{r\text{-}non}: (t_1^r, \ldots, t_k^r, \ldots, t_m^r)$, where
$$t_k^r = [v_{k-1}, v_k); \quad k = 1, 2, \ldots, m; \quad v_0 = t_1$$
Therefore, except for the AOVs in the pattern Y, the AOV of any other alarm $g$ can be re-organized on the new time scale $T_{r\text{-}non}$:
$$s_g = (s_{g,1}, \ldots, s_{g,k}, \ldots, s_{g,m}), \quad s_{g,k} = \sum_{t_j \in t_k^r} e_{g,t_j}$$
The k-th element of the vector is the total event count of alarm $g$ in the time interval $t_k^r$. The vector $s_g$ represents all events of alarm $g$ that occur before the corresponding events of alarm $r$ in $x_r$. The old MAM is converted from
$$\mathrm{MAM} = \begin{bmatrix} e_{1,t_1} & \cdots & e_{1,t_n} \\ \vdots & \ddots & \vdots \\ e_{h,t_1} & \cdots & e_{h,t_n} \end{bmatrix}$$
to:
$$S = \begin{bmatrix} s_{1,1} & \cdots & s_{1,m} \\ \vdots & \ddots & \vdots \\ s_{h-2,1} & \cdots & s_{h-2,m} \end{bmatrix}$$
Here $h$ is the number of all alarms. The new MAM does not include the AOVs of the alarms in the pattern Y.
The distance between $x_r^{non}$ and a row $s_g$ of $S$ is:
$$d(x_r^{non}, s_g) = 1 - \frac{1}{m} \sum_{k=1}^{m} \mathbb{1}(s_{g,k} > 0) \qquad (6)$$
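A minimal sketch of this conditional search is given below. It rests on two assumptions that should be checked against the original definitions: interval k collects the bins after the previous event of the result alarm and before its k-th event, and the distance is one minus the share of intervals in which the candidate alarm occurred at least once. All function names and sample series are hypothetical.

```python
def rebin_before_result(x_r, x_g):
    """Count events of alarm g in each interval that ends at an event of result alarm r.

    Events of g in the same bin as an event of r are counted toward the
    next interval (assumed boundary convention).
    """
    counts, running = [], 0
    for e_r, e_g in zip(x_r, x_g):
        if e_r > 0:  # an event of r closes the current interval
            counts.append(running)
            running = 0
        running += e_g
    return counts


def conditional_distance(x_r, x_g):
    """One minus the share of intervals in which alarm g occurred before alarm r."""
    s_g = rebin_before_result(x_r, x_g)
    if not s_g:
        return 1.0
    return 1.0 - sum(1 for c in s_g if c > 0) / len(s_g)


def best_conditional_alarm(x_r, others):
    """Return (alarm_id, distance) of the candidate with the minimum distance."""
    return min(
        ((g, conditional_distance(x_r, x_g)) for g, x_g in others.items()),
        key=lambda item: item[1],
    )


# Hypothetical AOVs: result alarm r and two candidate alarms outside pattern Y.
x_r = [0, 0, 1, 0, 0, 0, 1, 0, 0, 1]
others = {
    "C": [1, 0, 0, 0, 2, 0, 0, 1, 0, 0],  # occurs before every event of r
    "D": [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],  # occurs before one event of r only
}
print(rebin_before_result(x_r, others["C"]))  # [1, 2, 1]
print(best_conditional_alarm(x_r, others))    # ('C', 0.0)
```

A candidate that precedes every event of the result alarm gets distance 0, regardless of how long before the event it occurred, which reflects the absence of a time constraint in pattern Z.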
The search is then conducted on the S matrix by selecting the row of S that has the minimum distance, calculated by equation (6), to $x_r^{non}$. If the distance is small, it indicates that alarm g always happened before alarm r, so it may be a conditional alarm of r.
V. APPLICATION IN MACHINE TOOL ALARM SYSTEM
A. CNC system
Figure 5. CNC control system
Today, most machine tools realize automatic motion control through numerical control (NC) systems. As Figure 5 shows, the NC system sends out commands to trigger axis movement and corrects velocity errors based on the feedback from machine elements. A programmable logic controller (PLC) matched with the NC kernel conducts logical computation and sequential control. The PLC realizes machine element control through digital and analog I/O communication. It monitors the performance of the NC, the machine and other interface control systems. It also detects process disturbances to prevent damage to the workpiece and the machine. If any disturbance occurs, the machining sequence is interrupted and the drives are stopped. The fault is stored and displayed as an alarm on the display panel. One part of the PLC alarming mechanism is designed by the controller manufacturer to secure the normal performance of the PLC system and its I/O channels. Most of the remaining alarms are configured by the machine manufacturer. Besides the PLC, other modules in the control system, such as the human machine communication area (HMI), the NC kernel and the servo drives, can also generate alarms or messages. The alarms configured by machine manufacturers are usually the ones that operators encounter frequently. They are direct indicators of operation progress and of the critical occurrence of process disturbances. Since they are closely related to specific machine operation, the manufacturer's design cannot cover all the situations that the machine end user will face later. These alarms are frequently managed and reconfigured during the use of the machine. The proposed methods can help formulate a PLC alarm improvement plan based on historical alarm log analysis. The critical information delivered by the alarms can be interpreted according to real operation.
An alarm log from a 7.5 meter vertical turning center is the research object. The machine is made by Dorries Scharmann and has a Siemens 840D controller. It is used to machine large-scale parts of certain energy generators. In total, 40 days of data from real production are analyzed. In this study, we focus on the PLC alarms made by machine manufacturers, which include the three types shown in Table II. Channel alarms indicate PLC channel status. They are usually displayed as warning or reminder messages. Spindle/axis alarms monitor spindle-related status. The last group of alarms is defined by the specific machine user to monitor the coolant, spindle temperature, lubrication and other variables of concern on machine elements. In fact, most of the alarms caused by process disturbances have an index ranging from 600000 to 700000. They bring extra machine downtime.
TABLE II
PLC ALARM
Alarm Index Range   Alarm Source            Total Alarm Types
500000~599999       Channel alarms          25
600000~699999       Spindle/Axis alarm      4
700000~799999       Machine user defined    33
To illustrate the proposed pattern mining method, several experimental results are explained in the following.
B. Experiment one
In this experiment, the 40 days of alarms with indices ranging from 500000 to 599999 are first converted into an MAM day by day. The mining algorithm is then applied to every day's MAM, and the discovered patterns are reported. Figure 6 shows a part of the alarm matrix from the 17th day. The X axis denotes the time scale. A filled unit means a non-zero event count during that binned time interval.
Figure 6. Multiple Alarm Matrix (500000~599999) on the 17th day (X axis: time, 2 min/index; Y axis: alarm ID)
If the dissimilarity threshold is set at 0.4 and the proposed mining algorithm is used to search for alarm correlation patterns, the clustering result for the 17th day is as shown in Figure 7 a). Here the dissimilarity threshold is selected to encompass the optimal number of pattern candidates that occur at a relatively high frequency. In total, eight groups of patterns are discovered. According to Definition 4, we therefore consider these eight groups of patterns frequent. They are reported in the format shown in Figure 7 b).
Figure 7. a) Clustering tree of Alarm Occurrence Variables (500000~599999) on the 17th day (X axis: dissimilarity; Y axis: alarm ID) and b) discovered frequent patterns on the 17th day
C. Experiment two
Correlation patterns between channel status alarms and user-defined alarms (or spindle/axis alarms) can reveal the channel statuses related to a disturbance. After applying the mining algorithm with the same dissimilarity threshold (0.4), the results are as shown in Figure 8 (or Figure 10). In the result, the 7th group indicates that the active statuses of channels M24 and M25 have a sequential relation with the "tool magazine change door open" events detected by the PLC I/O. In fact, this frequent pattern is discovered in nearly every day's report. The pattern happens in the sequence 510118-700040-510118-510119-700040, as Figure 9 a) shows in detail and Figure 9 b) shows over the long term. However, during the 40 days there are two days (April 09 and 10, as Figure 9 b) shows) on which the pattern was not followed. The maintenance report showed that this was caused by broken sensors, which were repaired within two days, although it did not cause a total stop of machine operation. This example shows that correlation patterns between alarms can reveal normal operation sequences. Disruption of a normal pattern can be a sign of operation mistakes or sensing degradation.
Figure 8. Discovered frequent pattern on 17th day (channel status and user defined alarms)
Figure 9. Frequent pattern of the 7th group: a) alarms 510118, 510119 and 700040 from 10:38 to 10:45 (X axis: time, HH:MM; Y axis: alarm ID); b) the same alarms from 04/09 to 04/15 (X axis: date; Y axis: alarm ID)
Another frequent pattern, shown in Figure 10, indicates that the spindle always stops at an absurd condition and that the stop always comes with the active status of another two PLC channels. This pattern narrows down the diagnostic search range for maintenance engineers during PLC logic checking.
Figure 10. Discovered frequent pattern of spindle disturbance
D. Experiment three
In this experiment, the total alarm count per day is calculated as follows. First, select the frequent correlation patterns with a dissimilarity of 0.2 from the mining results of all three types of alarms; then sum the occurrence counts of the independent alarms and the primary alarms to obtain the total alarm occurrence count for every day. Figure 11 shows the results. Here the primary alarm is defined as the most frequent alarm in each pattern. Without frequent pattern mining, operators see 370 alarm messages per day on average; if repetitious alarms are filtered out, an average of 149 alarm counts are eliminated per day.
Figure 11. Alarm count (X axis: day; Y axis: count; series: alarms seen by operator, unique alarms after the organization; average eliminated alarm count/day 149.1714)
VI. CONCLUSION AND FUTURE WORK
In summary, through autonomous mining for alarm correlation patterns using the proposed method, the following contributions can be achieved:
First, frequent correlation patterns of multiple alarms can be discovered, and they reflect the normal operation sequence. Any change in a pattern can indicate abnormal disturbances caused by sensor degradation or operation mistakes. The patterns also provide a good reference for later improvements of the PLC alarming mechanism. Early failure warnings can then be established based on the patterns to provide intelligent online or offline warning rather than just displaying simple hardwired threshold-checking alarms. Second, the knowledge accumulated from frequent patterns can facilitate decision making. Some discovered patterns clarify the boundary of an abnormal process disturbance. This can help engineers quickly target the "problem" networks in the PLC or the physical machine elements. The boundary information also helps engineers reproduce the occurrence of the disturbance or failure during diagnosis. Third, a large number of alarms or warning messages occur every day; some of them are related to each other, and others are duplicated warnings of the same disturbance. Combining repetitious alarms within the same pattern can help maintenance engineers make more precise diagnosis decisions and focus on critical problems.
Currently, the proposed methods have been applied and tested on a machine tool system. In the future, these methods will be improved for more complicated non-serial and non-parallel correlation patterns; one possible solution is to introduce conditional statistical testing into the definition of a "frequent" pattern. The method will also be validated in complex production processes.
REFERENCES
[1] V. Venkatasubramanian, R. Rengaswamy, K. Yin, and S. N. Kavuri, "A review of process fault detection and diagnosis Part I: Quantitative model-based methods," Computers & Chemical Engineering, vol. 27, pp. 293-311, Mar 2003.
[2] D. Leung and J. Romagnoli, "Dynamic probabilistic model-based expert system for fault diagnosis," Computers & Chemical Engineering, vol. 24, pp. 2473-2492, 2000.
[3] D. D. Woods, "The Alarm Problem and Directed Attention in Dynamic Fault Management," Ergonomics, vol. 38, pp. 2371-2393, Nov 1995.
[4] I. S. Kim, "Computerized systems for online management of failures - A state of the art discussion of alarm systems and diagnostic systems applied in the nuclear industry," Reliability Engineering & System Safety, vol. 44, pp. 279-295, 1994.
[5] K. D. Duncan, "Scoring methods for verification and diagnostic performance in industrial fault finding problems," Journal of Occupational Psychology, vol. 48, p. 93, 1975.
[6] D. D. Woods, "The Alarm Problem and Directed Attention in Dynamic Fault Management," Ergonomics, vol. 38, pp. 2371-2393, Nov 1995.
[7] F. P. Lees, "Process Computer Alarm and Disturbance Analysis Review of the State of the Art," Computers & Chemical Engineering, vol. 7, pp. 669-694, 1983.
[8] I. S. Kim, "Computerized systems for online management of failures - A state of the art discussion of alarm systems and diagnostic systems applied in the nuclear industry," Reliability Engineering & System Safety, vol. 44, pp. 279-295, 1994.
[9] R. Gardner and D. Harle, "Methods and systems for alarm correlation," 2002, pp. 136-140.
[10] R. Agrawal, T. Imieliński, and A. Swami, "Mining association rules between sets of items in large databases," ACM SIGMOD Record, vol. 22, pp. 207-216, 1993.
[11] J. Han, H. Cheng, D. Xin, and X. Yan, "Frequent pattern mining: current status and future directions," Data Mining and Knowledge Discovery, vol. 15, pp. 55-86, 2007.
[12] H. Mannila, H. Toivonen, and A. Inkeri Verkamo, "Discovery of frequent episodes in event sequences," Data Mining and Knowledge Discovery, vol. 1, pp. 259-289, 1997.
[13] Q. Zheng, K. Xu, W. Lv, and S. Ma, "Intelligent Search of Correlated Alarms for GSM Networks with Model-based Constraints," Arxiv preprint cs/0204055, 2002.
[14] C. Chao, D. Yang, and A. Liu, "An automated fault diagnosis system using hierarchical reasoning and alarm correlation," Journal of Network and Systems Management, vol. 9, pp. 183-202, 2001.
[15] K. Hatonen, M. Klemettinen, H. Mannila, P. Ronkainen, and H. Toivonen, "TASA: telecommunication alarm sequence analyzer or how to enjoy faults in your network," 2002, pp. 520-529.
[16] Z. Geng, Q. Zhu, and X. Gu, "A Fuzzy Clustering-Ranking Algorithm and Its Application for Alarm Operating Optimization in Chemical Processing," Process Safety Progress, vol. 24, no. 1, p. 67, 2005.
[17] B. Lent, A. Swami, and J. Widom, "Clustering association rules," in Data Engineering, 1997. Proceedings. 13th International Conference on, 1997, pp. 220-231.
[18] T. Hastie, R. Tibshirani, J. Friedman, and J. Franklin, "The elements of statistical learning: data mining, inference and prediction," The Mathematical Intelligencer, vol. 27, pp. 83-85, 2005.