Characterizing Discrete Event Timing Relationships for Fault Monitoring of Manufacturing Systems
Sujit R. Das
Rockwell Automation Advanced Technology, 1201 South 2nd Street, Milwaukee, Wisconsin 53204
Lawrence E. Holloway¹
Department of Electrical Engineering and Center for Robotics and Manufacturing Systems, University of Kentucky, Lexington, Kentucky 40506
Abstract
The timing and sequencing relationships of changes (events) in discrete sensors and actuators can be used to determine whether a manufacturing system is operating as expected. In this paper, we present a method of learning inter-event timing relationships using observations from a correctly operating system. The observed sample statistics characteristic of correct system operation are used to create a confidence space of possible timing relationships (acceptable delay intervals) of the underlying system. Given a relative cost of false alarms vs. missed detections, the timing relationships can be chosen to minimize the worst case total of the false alarm and missed detection costs over the confidence space.
1. Introduction
Manufacturing systems commonly rely on discrete sensors and discrete actuators for control. Discrete sensors typically provide very coarse information about the system, such as whether a part or mechanism has reached a specific position, whether a fluid level exceeds a certain level, or whether a process variable (such as weight or temperature) has reached a threshold. Discrete actuators toggle devices between different activities, such as turning motors, pumps, or heaters off and on. Changes in the sensing and actuation signals are called events. Simple timing relationships of specific events are commonly used to detect machine faults. Traditionally, watchdog timers have been used in the control software to detect an abnormal condition if the time between two particular events exceeds a limit. Other fault detection methods for manufacturing also rely on event timings [1, 2, 3].
¹ Please address all correspondence to L. E. Holloway at the above address or email: [email protected], phone: (606) 257-6262 ext. 203. This work has been supported by Rockwell International, Allen-Bradley, NSF grant ECS-9308737, and the Center for Robotics and Manufacturing Systems at the University of Kentucky.
This paper addresses the problem of determining inter-event time relationships from system observation. Although an analysis of system design can give nominal or worst-case estimates, these often will not reflect the true system operation. For example, the maximum response time of an actuator in a design specification may be significantly more than the typical response time of that actuator in practice. Nominal inter-event timings alone are not sufficient to establish limits of acceptable timing variation. Allowing too narrow a range of variation may lead to false alarms, while allowing too much variation can lead to missed detections of faults.
In this paper, we present a method of specifying inter-event timings using past observations of the system. The timing relationships we consider are represented in terms of an interval of allowable delay times between specific events. Our problem can be stated as follows: Given a timed sequence of events characterizing the correct operation of the system, and given pairs of events for which inter-event timings are of interest, determine the range of inter-event timings that minimizes a worst-case weighted cost of false alarms (FA) and missed detections (MD).
In the following section we define the confidence space of possible process timing relationships as determined from sample observations. In Section 3 we define quality measures of missed detections and false alarms and describe the partitioning of the confidence space based on the variation of the costs of MD and FA. Section 4 analyzes the worst predicted cost of monitoring for all possible selections of timings in the available space and selects the one that gives the minimum cost. In the last section, we summarize the problem and the solution.
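To make the problem statement concrete, the following is a minimal sketch, not the method developed in this paper, of how inter-event delays might be extracted from a timed event sequence and checked against an interval of allowable delays. The event names (`fill_start`, `fill_done`), the rule that pairs each first event with the next occurrence of the second event, and the numeric interval are illustrative assumptions. The interval is described by a center `m` and a width `w`, the representation introduced in Section 2.

```python
from statistics import mean, stdev


def inter_event_delays(log, e1, e2):
    """Return delays from each e1 to the next e2 in a timed event log.

    `log` is a time-ordered list of (time, event_name) tuples.
    Pairing rule (an assumption): an e2 is matched to the most
    recent unmatched e1; an e2 with no pending e1 is ignored.
    """
    delays = []
    pending = None  # time of the most recent unmatched e1
    for t, name in log:
        if name == e1:
            pending = t
        elif name == e2 and pending is not None:
            delays.append(t - pending)
            pending = None
    return delays


def within_interval(delay, m, w):
    """True if the delay lies in [m - w/2, m + w/2]."""
    return m - w / 2 <= delay <= m + w / 2


# Hypothetical observations of a correctly operating filling station.
log = [
    (0.0, "fill_start"), (2.0, "fill_done"),
    (10.0, "fill_start"), (12.5, "fill_done"),
    (20.0, "fill_start"), (21.5, "fill_done"),
]

delays = inter_event_delays(log, "fill_start", "fill_done")
x_bar = mean(delays)  # sample mean of the observed delays
s = stdev(delays)     # sample standard deviation (n - 1 denominator)

# Illustrative interval only; choosing (m, w) well is the subject of the paper.
m, w = 2.0, 2.0
fault_flags = [not within_interval(d, m, w) for d in delays]
```

A watchdog timer corresponds to checking only the upper end of such an interval. Checking both ends also flags events that arrive too early.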
2. Process Inter-event Timings
Consider a process that generates events over time. The occurrences of some of these events are related to each other through the dynamics of the process. For example, an event representing the start of filling a bottle should be followed with some delay by an event indicating the bottle has been filled. In this paper, we consider pairs $(e_1, e_2)$ of related events, where the time delay (inter-event timing) between the first event $e_1$ and the second event $e_2$ under correct operation will be normally distributed with unknown mean $\mu$ and unknown standard deviation $\sigma$.
For fault monitoring we characterize inter-event timing relationships as positive time intervals, represented by a center $m$ and a width $w$. The set of all relationships $(m, w)$ is the timing space, denoted $U$. Given an event pair $(e_1, e_2)$, a timing relationship $T = (m, w)$ indicates that, for a correctly operating system, if $e_1$ occurs at time $t_1$, then $e_2$ should occur with a positive delay between $m - w/2$ and $m + w/2$. If $e_2$ does not occur within the expected time delay, then a fault is declared.
If the process mean and standard deviation were known, then a sensible timing relationship would be $m = \mu$ and $w = v\sigma$, where $v$ is a positive real constant that determines how far the timing relationship extends out onto the tails of the normal distribution of timing. We consider $v$ to be a user-specified parameter, so false alarms due to the truncation of the normal curve by the selection of $v$ are neglected in our discussion.
Since the process inter-event timing parameters $\mu$ and $\sigma$ are not known precisely, the parameters $m$ and $w$ cannot be determined directly. Instead, only the sample mean $\bar{x}$ and sample standard deviation $s$ are available from $n$ sample observations of a correctly operating process. Consider a confidence level specified for both the mean and the standard deviation. This confidence level is specified as $(1 - \alpha)$, and associated with this level is a value $z_{\alpha/2}$ of the standard normal curve for parameter $\alpha/2$ [4]. For $n \geq 30$, and for a sample standard deviation of $s$, the process standard deviation $\sigma$ is known (under the given confidence level) to lie within an interval determined by $n$ [4], given by
$\dfrac{s}{1 + z_{\alpha/2}/\sqrt{2n}}$