BAYESIAN BELIEF NETWORK MODELING AND DIAGNOSIS OF XEROGRAPHIC SYSTEMS

Chunhui Zhong (graduate student) and Perry Y. Li (Assistant Professor and corresponding author)
Department of Mechanical Engineering, University of Minnesota, 111 Church St. SE, Minneapolis, MN 55455
{zhongch,pli}@me.umn.edu

ABSTRACT
In this paper, a Bayesian Belief Network (BBN) approach to the modeling and diagnosis of xerographic printing systems is proposed. First, a continuous BBN model based on the physics of the printing process and field data is developed. The model captures the causal relationships between the various physical variables in the system using conditional probability distributions. Next, the continuous BBN is discretized based on the principle of maximum entropy so that it can be implemented in commercially available software, HUGIN. The resulting BBN can be used for the prediction of print quality behaviors, as well as for inference and fault diagnosis. Examples of network deduction and inference are presented to illustrate the usefulness of the BBN model.

1. Introduction
Our ultimate research objective is to develop control systems for xerographic printers / copiers capable of generating high quality color prints despite system faults and component degradation. Currently, a printer / copier merely shuts down when some components degrade or fail. The diagnosis and the subsequent repair require a visit by service personnel. Service calls are costly to the service providers, and machine downtime is a productivity loss for the customers.

Productivity loss can be minimized if the machine can remain available, albeit at a degraded quality. In this case, the machine has to reconfigure its control according to the fault or degradation so that the system can operate as well as it can in the degraded state. This feature is especially important for production print jobs with tight schedules and for walk-up users who often need immediate access. The ability to diagnose faults and to determine the extent of the degradation of the components is critical to realizing this feature.

A printer / copier that can self-diagnose can also decrease the service cost. For example, the duration of each service call can be minimized, or in the event that customers can carry out the repair themselves (e.g. by
replacing some easily accessible components), the service call will not be needed at all.

In a xerographic printing system, the number of sensors is limited, so system faults or component states usually cannot be detected directly and must be inferred from observations. The same observations may be produced by many sets of fault conditions. The diagnostic problem is to determine the set of fault states and / or component degradations that best explains the observations.

Bayesian networks are a probabilistic approach to diagnostics. Theoretically, probabilistic approaches have advantages over deterministic approaches in which the causal relationships are encoded in crisp logic. Firstly, the conditional probability of a component failure given the evidence that has been introduced can be used to indicate the confidence level of the diagnosis. In contrast, every logically consistent diagnosis has equal footing in a deterministic logical framework. Secondly, conflicting evidence (such as in the case of intermittent faults) can be accommodated in a probabilistic description, but can cause a deterministic reasoning system to fail. The system's ability to handle conflicting evidence implies that old evidence can be discounted as new evidence is introduced.

An important issue associated with a rigorous implementation of a probabilistic diagnostic system is that of computational burden. Compared to other diagnostic systems [9] [10], the computation for very large BBNs has been shown to be tractable [11]. BBNs have been successfully applied in many disciplines, including engineering decision support systems [12].

In this paper, we develop a Bayesian Belief Network (BBN) model of the xerographic printing process. It is intended both for the prediction of print quality and for the diagnosis of system faults and component degradation. Eventually, the diagnostic information obtained will be used for the reconfiguration of the control system to operate under degraded conditions.
Figure 1. Illustration of a xerographic marking process

The rest of the paper is organized as follows. In Section 2, we review the physical processes that occur in a xerographic printing system. The basic idea of a BBN is presented in Section 3. In Section 4, the continuous variable BBN for the xerographic process is given. The methodology for the discretization of a continuous BBN is given in Section 5. Section 6 contains some deduction and inference examples using the proposed BBN. Section 7 contains some concluding remarks.

2. XEROGRAPHIC PRINTING PROCESS
A digital xerographic marking process, such as that in a copier / printer, is briefly described here. Readers are referred to [1] for details of the fundamental physics. An illustration of the whole xerographic marking process is shown in Figure 1. The central component in a xerographic process is the photoreceptor (PR), which is usually in the form of a belt or a drum. It rotates continuously and interacts cyclically with several stationary processes. The PR acts as a staging area where a toner image is first built up before being transferred to the paper or to an intermediate medium. The toner image built up on the PR is eventually deposited on the output medium. The density profiles of the toner images on the PR, and subsequently on the output medium, are a key determinant of the ultimate print quality. The key process steps in xerography are:

Charge: When the PR is placed in the vicinity of a high voltage corotron wire, which emits charged particles via the coronal discharge process, a uniform charge density is created on the surface of the PR. The presence of a metal grid enables the charging level to be regulated electronically. The resulting charge level depends on the voltage and condition of the corotron wire, as well as on the voltage on the metal grid (grid voltage).

Expose: The PR then travels to an exposure station. There, a binary operated raster laser beam scans the PR line by line and shines light at locations where toner is desired or not desired. Since the PR is light sensitive, the area where light is shone is discharged. Thus, the binary modulation of the laser scan modifies the charge density profile on the PR into a profile (known as the latent image) which resembles the desired image (or its negative). The depth of the discharge is affected by the laser power according to the nonlinear Photo-Induced Discharge Curve (PIDC) of the PR.

Develop: In this step, toner particles are deposited on the PR. This is achieved by the interaction between the PR and the development station. Within the development housing, toner particles are precharged by agitation against the magnetic carrier beads via the triboelectric process. By biasing the voltage of the development housing with respect to that of the ground plate of the PR, the charged toner particles are selectively deposited in the discharged regions (the latent image) on the PR. A toner image corresponding to the desired image is thus created. The development process depends heavily on the electric field profile in the development housing, the toner charge density (tribo), and toner surface properties such as cohesivity.

Transfer: The PR is then brought into contact with an intermediate medium (a second belt or drum) or a sheet of paper so that the toner particles, which were loosely attached electrostatically to the PR, are deposited on the medium. This process requires that the intermediate medium or the paper be pre-charged to a sufficiently high voltage relative to that of the PR.

Clean: Toner particles that fail to transfer must be brushed off the PR and picked up electrostatically or mechanically. Otherwise, residual toner particles will contaminate the next image that the PR generates.

Fuse: Once the toner image is attached electrostatically to the final medium (sheet of paper), the sheet must go through a fuser, where high temperature and pressure melt the plastic coating of the toner particles, permanently attaching the toner to the paper.
Current xerographic systems are equipped with sensors that monitor the printing process. These include electrostatic voltmeters (ESVs), which measure the charge on the PR after the charging and exposure steps, and PR toner area coverage sensors (TAC / ETAC), which measure the proportion of area covered by toner after development.

Figure 2. Bayesian belief network model for a xerographic printing system. [Figure not reproduced. Recoverable labels: control variables; measured variables; Expose; Develop; Toner dynamics; Transfer and fuse; nodes T, C, Vg, ν, Vi, λ, P0, Vs, Di, Vp, Q/M, d, DMA, Vb, Q, Do.]
Several parameters in the various xerographic processes can be manipulated to control toner deposition on the PR and the transfer of the toner image to paper. A certain amount of redundancy exists in the set of actuators. For example, an increase in the corotron grid voltage in the charging station has a similar effect to an increase in laser power in the exposure station, or an increase in the bias voltage in the development station. Similarly, if the xerographic process for one primary color separation degrades, there is the opportunity to alter the levels of toner deposition for the other primary colors to achieve as close a match as possible between the ranges of the designed colors and of the output colors. Therefore, a fault in a subsystem, or a degradation in its performance, can be partially compensated by the proper actuation of another subsystem to produce similar overall effects on the output image.

3. Bayesian Belief Networks (BBN)
The xerographic process can be described using a set of system variables, such as the PR charged voltage, the scorotron grid voltage, the toner density, etc. The joint probabilities of these variables describe the interrelationships between them. A Bayesian Belief Network (BBN) is a compact representation of the joint probability distribution of the various system variables [6]. Formally, it is a directed acyclic graph (DAG) with nodes connected by arcs. The nodes are random variables whose values represent the observed or unobserved system variables. The arcs represent the causal relationships between variables. They are quantified by the conditional probabilities that a child node attains a certain value given the values of all its parent nodes. The diagnostic inference process is to determine the combination of system variables that can generate the observed values of some of the nodes. It is performed by applying Bayes rule from probability theory.

A BBN can be used to represent the generic knowledge of a domain expert, and to function as a computational architecture for storing factual knowledge and manipulating the flow of knowledge in the network structure. The graph structure (hence the assignment of the causal relationships between variables) significantly reduces the storage required for the joint probability distribution and the computational burden associated with the inference process.

A Simple Example: In the Charge process discussed above, consider the simplified case in which the exiting voltage of the PR, Vs, depends only on the input voltage Vi and the corotron grid voltage Vg.
Figure 3. Simple BBN model of a charging system. [Figure not reproduced: nodes Vi and Vg, each with an arc into Vs.]
Consider the BBN in Fig. 3 for diagnosing why the exiting voltage Vs is too high. Charging physics dictates that Vs will be high when 1) the input voltage Vi is too high; or 2) the corotron grid voltage Vg is high. To capture this, the BBN for charging consists of three nodes: Vi, Vg and Vs. The two arcs show that the state of Vs depends causally on the states of Vg and Vi. For simplicity, we assume that each of these three nodes can be in one of two discrete states: "High" or "Low". To complete the BBN, the causal relationships must be quantified by specifying the conditional probability of each child node given the values of the parent nodes. In this example, the only child is Vs. Hence we need to specify P(Vs | Vi, Vg), which can be entered as a table (Table 1).

            Vi = High                Vi = Low
Vs          Vg = High   Vg = Low     Vg = High   Vg = Low
High        0.95        0.85         0.90        0.02
Low         0.05        0.15         0.10        0.98

Table 1. P(Vs | Vi, Vg)

The prior probabilities of all the ancestor nodes (i.e. nodes with no parents) must also be specified:

P(Vi = High) = 0.1, P(Vi = Low) = 0.9;
P(Vg = High) = 0.1, P(Vg = Low) = 0.9.
Notice that the network structure in Fig. 3, the conditional probability table (Table 1), and the prior probabilities of the ancestor nodes are sufficient to completely specify the joint probability of the nodes in the BBN. For problems with many nodes, the joint probability table grows exponentially with the number of nodes. However, by storing only the conditional probabilities (each of which involves only a node and its parents), a BBN significantly reduces the storage requirement.

Consider a simple diagnostic problem. Suppose we observe that the exiting voltage is too high. We would like to infer whether the input voltage is high, whether the grid voltage is high, or whether both the input voltage and the grid voltage are high. To answer these questions, we consider the following conditional probabilities:

P(Vi = High | Vs = High)
P(Vg = High | Vs = High)
P(Vi = High and Vg = High | Vs = High)

If these conditional probabilities are sufficiently high, one can be confident that the answers are in the affirmative. They can be calculated using Bayes rule as:

P(Vi = High and Vg = High | Vs = High) = P(Vs = High | Vi = High and Vg = High) * P(Vi = High, Vg = High) / P(Vs = High) = 0.052,

where P(Vi = High, Vg = High) = 0.1 * 0.1 = 0.01 because the ancestor nodes are independent, and P(Vs = High) = 0.1832 is obtained by summing P(Vs = High | Vi, Vg) * P(Vi) * P(Vg) over the four combinations of parent states. Similarly, it can be computed that P(Vi = High | Vs = High) = 0.47 and P(Vg = High | Vs = High) = 0.49. Thus, given that the exiting voltage is high, we can conclude that either the input voltage is high or the grid voltage is high, but it is unlikely that both occur.
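The three posteriors quoted in this example can be checked by brute-force enumeration of the joint distribution. The following is a minimal sketch in Python, not the HUGIN computation; the variable and function names are illustrative, while the numbers are those of Table 1 and the stated priors.

```python
from itertools import product

# Priors of the ancestor nodes (from the text).
p_vi = {"High": 0.1, "Low": 0.9}
p_vg = {"High": 0.1, "Low": 0.9}

# Table 1: P(Vs = High | Vi, Vg), keyed by (Vi, Vg).
p_vs_high = {
    ("High", "High"): 0.95,
    ("High", "Low"): 0.85,
    ("Low", "High"): 0.90,
    ("Low", "Low"): 0.02,
}

def joint_with_vs_high(vi, vg):
    """P(Vi = vi, Vg = vg, Vs = High) from the chain rule of the network."""
    return p_vs_high[(vi, vg)] * p_vi[vi] * p_vg[vg]

states = ("High", "Low")

# P(Vs = High): sum over all four parent combinations.
p_evidence = sum(joint_with_vs_high(vi, vg) for vi, vg in product(states, states))

# Posteriors given the evidence Vs = High.
post_both = joint_with_vs_high("High", "High") / p_evidence
post_vi = sum(joint_with_vs_high("High", vg) for vg in states) / p_evidence
post_vg = sum(joint_with_vs_high(vi, "High") for vi in states) / p_evidence

print(f"P(Vs=High)                    = {p_evidence:.4f}")
print(f"P(Vi=High, Vg=High | Vs=High) = {post_both:.3f}")
print(f"P(Vi=High | Vs=High)          = {post_vi:.2f}")
print(f"P(Vg=High | Vs=High)          = {post_vg:.2f}")
```

Enumeration is only practical for very small networks such as this one; it is used here purely to make the arithmetic of the example explicit.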
In our research, the Bayesian network approach is applied to model the xerographic process because the BBN is a probabilistic framework with a firm theoretical foundation (probability theory and Bayes theorem), and because it is computationally feasible for use in real time. As pointed out earlier, a probabilistic description of the system is preferred to deterministic models in which knowledge is represented in logical form, for the following reasons: model uncertainty, sensor noise, etc. can be accounted for; rankings and trustworthiness can be assigned to the various diagnoses; conflicting observations, such as those arising during intermittent faults, can be accommodated; and the information content (or worth) of various diagnostic tests can be evaluated. Exact updating of a generic Bayesian network is NP-hard [3]. Currently, the most efficient method for exact belief updating of Bayesian networks is the junction tree method [4]. The algorithm based on this method is used in the commercially available HUGIN system, with which we implement our Bayesian network model.

4. BBN Model of a Printing System
A Bayesian network model that describes the charge, exposure and development processes and their interactions is shown in Fig. 2. The model was developed based on physical models of the processes in the literature. The transfer and fusing processes are not fully treated in our current model. This is because our ultimate objective is to design reconfigurable process control, and these processes do not currently have any sensing or actuation capabilities, so they cannot contribute. In Fig. 2, the structure of the BBN clearly shows the sequential nature of how the subprocesses interact with the PR to determine the toner density profile on the PR. It is assumed that the manipulated variables (controllable variables) used in xerographic process control are the grid voltage of the charging scorotron, the laser power, and the development bias voltage. Unconventional development processes may have other actuators and controllable variables.

The observed variables are Vs, the voltage of the PR after the charging process; Vp, the voltage after the exposure process; and DMA, the developed toner density after the development process. The first two are generally measured using electrostatic voltmeters (ESVs). Do would be the actual density output on the output medium after the transfer and fuse processes. Currently, sensors are not available to measure Do directly. Also, variables in the transfer and fuse processes are not used for xerographic process control. For these reasons, the transfer and fuse processes are not included and, for simplicity, DMA and Do are assumed to be uniquely related.

Each variable in the network is associated with a conditional probability distribution function. Randomness of these variables is due to disturbances, faults, and model uncertainties. We now describe the conditional probability distributions for each of the charging, exposure and development processes.

Charging Process
We assume that the charging is achieved by a scorotron (i.e. a gridded corotron). When a PR passes beneath the charging system, the charge that is developed on the PR,
Vs, is given by [1]:

$$V_s = V_g\left[1 - \exp\left(-\frac{I_0}{V_g C \nu}\right)\right] + V_i \exp\left(-\frac{I_0}{V_g C \nu}\right) \tag{1}$$

where Vi is the voltage of the charging wire, Vg is the applied grid voltage (a controllable variable used to regulate charging), ν is the PR speed, I0 is a parameter that describes the scorotron response, and C is the PR capacitance per unit area. Eq. (1) is a mathematical idealization of the charging system, which employs simplified system characteristic curves and neglects the uncertainty that is very common in the process. Uncertainties are captured by defining the conditional probability distribution of Vs given Vi, Vg, ν, C, I0:

$$p(V_s \mid V_i, V_g, \nu, C, I_0) = N[f(V_i, V_g, \nu, C, I_0), \sigma_s] \tag{2}$$

where

$$f(V_i, V_g, \nu, C, I_0) = V_g\left[1 - \exp\left(-\frac{I_0}{V_g C \nu}\right)\right] + V_i \exp\left(-\frac{I_0}{V_g C \nu}\right) \tag{3}$$

and N[mean, dev] denotes a Gaussian distribution with mean "mean" and standard deviation "dev". (The choice of the normal distribution is arbitrary and is for illustrative purposes only. The actual distribution could be obtained from field data in a real situation. This applies to the following sections as well.) Thus, the mean of the probability distribution is determined by the physics, while the uncertainty of the mathematical model is captured by the standard deviation σs in Eq. (2). The uncertainty factor can be constructed using field and component reliability data. In our experiments, σs is assigned as 1% of the range of Vs.

Exposure Process
The role of the exposure process is to selectively discharge the PR based on the input image by modulating a rastering laser beam. The output variable of this process is therefore the discharged voltage. The relationship between the surface potential and the light exposure is called the Photo-Induced Discharge Curve, or PIDC. Assuming that the photon flux is Ix and the absorption area within the photogeneration region is A, the rate of change of the electric field is given by [1]:

$$\frac{dE}{dt} = -\frac{e \eta_s A P_0}{\varepsilon_s C} \tag{4}$$

where ηs is the supply efficiency, e is the electron charge, and εs is the dielectric permittivity of the PR. The photon flux Ix is related to the controllable laser power P0 via

$$P_0 = C \cdot I_x \tag{5}$$

where C is a constant. Assuming that ηs is independent of the electric field, integrating Eq. (4) over the exposure time gives [7]:

$$E = E_0 - \frac{S}{s} X$$

where $S = \frac{e \eta_s A s \lambda}{\varepsilon_s h c}$ is the photosensitivity, E0 is the electric field when the PR is not exposed to light (set up by the charging process), and X is the light exposure given by:

$$X = \frac{h c}{\lambda C} \int_0^t P_0 \, dt$$

where h is Planck's constant, c is the velocity of light, and λ is the wavelength of the laser. Also, Xi = X * 10^(-Di), where Xi is the input exposure reflected from the original image and Di is the density of the input image. Since V, the measured voltage on the PR surface, is related to the electric field E in the photogeneration region by V = E s, the PIDC is given by:

$$V_p = V_s - \frac{e \eta_s A s}{\varepsilon_s C} \int_0^t 10^{-D_i} P_0 \, dt \tag{6}$$

where t is the exposure time. To construct the probabilistic model based on the deterministic form, the distribution of Vp given Vs, Ix, Di is determined by

$$p(V_p \mid V_s, I_x, D_i) = N\left[V_s - \frac{e \eta_s A s}{\varepsilon_s C}\, 10^{-D_i} \int_0^t P_0 \, dt,\ \sigma_p\right] \tag{7}$$

σp is assigned as 10% of the range of Vp, for simplicity.

Development Process
The electrostatic image is developed into a visible image with toner particles. Development occurs when the electrostatic force between the PR and the charged toner particles exceeds the total adhesion force of the toner to the carrier beads in the development housing. An estimate of the developed toner mass per unit area, DMA, can be obtained by assuming that development proceeds until the striping electric field collapses completely [8]. The developed toner density is therefore given by

$$DMA = \frac{k_d \varepsilon_0 V_p v}{(Q/M)(d - \delta)} \tag{8}$$

where ε0 is the free-space permittivity, d / kd is the ratio between the dielectric thickness of the PR and the developer rolling space, v is the velocity of the PR, Q/M is the toner charge-to-mass ratio (tribo), and δ is the gap between the development housing and the surface of the PR. Notice that the toner tribo Q/M is generally described by a statistical distribution. As for the charge and exposure processes, the uncertainty in the model is defined by the distribution function of m (the developed mass per unit area, DMA) given Vp, Q/M, d, δ and v:

$$p(m \mid V_p, Q/M, d, \delta, v) = N[f(V_p, Q/M, d, \delta, v), \sigma_m] \tag{9}$$

where

$$f(V_p, Q/M, d, \delta, v) = \frac{k_d \varepsilon_0 V_p v}{(Q/M)(d - \delta)} \tag{10}$$

σm is assigned as 5% of the range of m.
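To make the structure of these conditional distributions concrete, the sketch below evaluates the deterministic means of Eqs. (3) and (10) and draws one sample from the Gaussian form N[mean, dev]. It is only an illustration under stated assumptions: the function names are invented here, and every numerical value (voltages, speed, capacitance, I0, the assumed range of Vs) is hypothetical rather than taken from the paper.

```python
import math
import random

def charging_mean(v_i, v_g, nu, cap, i_0):
    """Mean of Vs from Eq. (3): Vg*(1 - a) + Vi*a, with a = exp(-I0 / (Vg*C*nu))."""
    a = math.exp(-i_0 / (v_g * cap * nu))
    return v_g * (1.0 - a) + v_i * a

def development_mean(v_p, q_over_m, d, delta, v, k_d, eps_0=8.854e-12):
    """Mean of the developed mass per unit area from Eq. (10)."""
    return k_d * eps_0 * v_p * v / (q_over_m * (d - delta))

def sample_node(mean, sigma, rng):
    """Draw one value from N[mean, sigma], the Gaussian form of Eqs. (2), (7) and (9)."""
    return rng.gauss(mean, sigma)

# Hypothetical, illustrative operating point (not from the paper).
v_i, v_g, nu, cap, i_0 = 0.0, 600.0, 0.25, 1.0e-6, 5.0e-4

mean_vs = charging_mean(v_i, v_g, nu, cap, i_0)
sigma_s = 0.01 * 1000.0  # 1% of an assumed 0 to 1000 V range of Vs

rng = random.Random(0)  # fixed seed so the sample is repeatable
vs_sample = sample_node(mean_vs, sigma_s, rng)
print(f"mean Vs = {mean_vs:.1f} V, one sample = {vs_sample:.1f} V")
```

Two limiting cases give a quick sanity check on Eq. (3): when I0 is very large the exponential vanishes and the mean of Vs approaches the grid voltage Vg, and when I0 is zero the mean equals Vi.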
Ancestor nodes
To complete the BBN model in Fig. 2, the prior probabilities of the ancestral variables (i.e. those variables with no parents) must be specified. The controllable variables are set deterministically based on the process control algorithm. The other ancestral variables can be treated as sources of disturbances. Their prior probabilities can be specified using field data. Currently, these are specified based on nominal operating conditions.

5. DISCRETIZATION OF BBN
The BBN model described in Section 4 is a continuous BBN, since the node variables can take on values in a continuum. Currently, our ability to implement continuous BBNs is severely limited: conditional distributions can only depend linearly on the parent nodes. For continuous BBNs with arbitrary nonlinear conditional probability distributions, the BBN must first be approximated by a discrete BBN. A discrete BBN is one in which each node can take on only a finite number of values (states). The discretization process amounts to partitioning the range of each continuous variable into intervals.

The propagation and updating of a BBN is an NP-hard problem. The computational burden grows rapidly with the number of possible states at each node, and the amount of memory needed to store the tables of conditional probabilities also increases dramatically. It is therefore imperative that, when the continuous BBN is discretized, the number of states at each node be kept to a minimum. To maximize the usefulness of the discrete states, a maximum entropy criterion [13] is adopted in determining the optimal partition of the range of each continuous variable. Suppose that the continuous range of the variable associated with a node has been partitioned into n segments, a1, ..., an. Let the prior probability of the occurrence of the i-th segment ai be pi. We can view the discretized node as an information source with entropy given by:

$$H(S) = -\sum_{i=1}^{n} p_i \log(p_i) \tag{11}$$

We wish to adjust the partition so that H(S) is maximized. It is easy to see that the optimal solution is to choose the pi such that each interval is as likely to occur as any other: p1 = p2 = ... = pn = 1/n.

To illustrate the process of discretization, consider the simplified charging subsystem as an example (Fig. 4). Assume that the scorotron and the PR property parameters, I0 and C, are constants. There are three inputs to the subsystem: the supply voltage Vi, the grid voltage Vg and the PR velocity ν. These variables can in general take on a continuum of values. The charged voltage Vs on the surface of the PR is the only output. Suppose that the prior probabilities of the ancestral variables Vi, Vg, ν are continuous normal distributions with given means and deviations, and that the range of each variable is to be partitioned into three intervals corresponding to "L", "M", and "H" respectively.

Figure 4. Simplified charging system. [Figure not reproduced: nodes Vi, Vg and ν, each with an arc into Vs.]

First, the ancestral nodes are optimally discretized by considering their prior probability distributions. Let us denote the discretized states of the variables Vi, Vg, ν as Vi^L, Vi^M, Vi^H, Vg^L, Vg^M, Vg^H, and ν^L, ν^M, ν^H. The optimal partitions are obtained by adjusting the partition so that the prior probability of each interval is 1/3. For example,

$$\int_{V \in V_i^H} P(V)\, dV = 1/3$$

for the "H" interval of Vi. Denote the centroid of each interval also by the name of the interval, e.g.

$$V_i^H = \frac{\int_{V \in V_i^H} V \, P(V)\, dV}{\int_{V \in V_i^H} P(V)\, dV} \tag{12}$$

To adjust the partition for the child node Vs so that P(Vs^L) = P(Vs^M) = P(Vs^H) = 1/3, we must first approximate its prior continuous probability distribution, P(Vs). The approximation is based on the law of total probability:

$$P(V_s) \approx P(V_s \mid V_i^L, V_g^L, \nu^L)\,P(V_i^L, V_g^L, \nu^L) + P(V_s \mid V_i^M, V_g^L, \nu^L)\,P(V_i^M, V_g^L, \nu^L) + \cdots + P(V_s \mid V_i^H, V_g^H, \nu^H)\,P(V_i^H, V_g^H, \nu^H) \tag{13}$$

where the conditional probabilities are all evaluated at the combinations of the centroids of the segments in the partitions of the parent nodes. The exact expression for P(Vs), on the other hand, requires integration over the segments of the parent partitions. Finally, the discrete conditional probabilities of Vs in each interval, conditional on the discrete states of the parent nodes Vi, Vg, ν, must be computed:

$$P(V_s^H \mid V_i^M, V_g^L, \nu^L) = \int_{V_s \in V_s^H} p(V_s \mid V_i^M, V_g^L, \nu^L)\, dV_s \tag{14}$$

A portion of the conditional probability table for Vs given Vi^M, Vg, and ν is shown in Table 2. A discretization algorithm has been written to generate the tables of prior and conditional probabilities automatically. These tables can be imported into the commercial BBN software, HUGIN, to perform probability propagation and updating when evidence is introduced into the BBN model of a printing system. Some experimental results are presented later.

Vi = Vi^M
Vg      ν      P(Vs^L | Vi, Vg, ν)   P(Vs^M | Vi, Vg, ν)   P(Vs^H | Vi, Vg, ν)
Vg^L    ν^L    0.0763                0.7635                0.1603
Vg^L    ν^M    0.6805                0.3176                0.0019
Vg^L    ν^H    0.9997                0.0002                0.0000
Vg^M    ν^L    0.0012                0.2693                0.7295
Vg^M    ν^M    0.1117                0.7745                0.1138
Vg^M    ν^H    0.9586                0.0414                0.0000
Vg^H    ν^L    0.0000                0.0011                0.9989
Vg^H    ν^M    0.0001                0.0841                0.9158
Vg^H    ν^H    0.1461                0.7687                0.0853

Table 2. P(Vs | Vi, Vg, ν) of discretized BBN (partial)

Discretization Algorithm
This algorithm follows a recursive pattern, which is suitable for directed acyclic graphs. A directed acyclic graph, as illustrated in Fig. 2, has a tree-like structure. The algorithm automatically traverses all the nodes in a Bayesian network, segments the value range of each variable, calculates the joint probability of all the adjacent parents, and generates the tables of prior and conditional probabilities of each variable.

Figure 5. An example to illustrate the discretization algorithm. [Figure not reproduced: nodes A, B, C, D, E, with arcs from A and B into C, and from C, B and E into D.]

Suppose we have a Bayesian network as in Fig. 5. Node C has two parents, A and B. Node D has three parents, C, B and E; C and D have a common parent, B. If we specify that we want to get the prior probability table for node D, the algorithm will check whether the parent nodes of node D have been segmented and whether the joint probabilities of B, C, and E are available. Since B, C, and E have not been segmented, the algorithm will first attempt the segmentation of B, C, and E. But the parents of node C have not been segmented either, so A and B will go first. In this fashion, a recursion is performed. The algorithm includes the following procedures:
Segmentation
This procedure segments the probability distribution of a variable into the desired number of intervals. The segmenting points along the value range and the centroid of each interval are calculated and recorded in a structure representing this node. Least Square Error (LSE) approximation is employed to evaluate the integrals.

Marginalization
This procedure applies the marginalization theorem to remove unnecessary variables from a joint probability. For example, if the joint probability P(A,B,C) and the prior probability of A are known, we can get P(B,C) by marginalizing P(A,B,C) over A:

$$P(B, C) = \sum_{A} P(A, B, C) \tag{15}$$

Joint Probability
In a set of parent nodes, either all the nodes have one or more common ancestors, or only some of them do. This set can be divided into several subsets of nodes with common ancestors or with no ancestors. We use "unit" to denote each of these subsets. If there is only one unit and there is only one parent node in it, the joint probability is simply the prior probability of that parent. Otherwise, the joint probability can only be acquired either by looking up the joint probability entity for this set of parent nodes or by marginalizing a joint probability entity for a set of nodes that includes all the parent nodes. If there are multiple units and they are independent of each other, their joint probabilities can be calculated separately and then combined by simple multiplication, as for independent variables. For more complicated cases, such as P(B,C) in the above example, there is a directed connection between B and C, so the prior joint probability P(B,C) cannot be acquired by simply multiplying the prior probabilities of B and C. A marginalization procedure is necessary to obtain it.

Conditional Probability
The procedure for calculating the conditional probability is the main procedure in this algorithm, and the main recursion occurs here. When the desired node in a Bayesian network is assigned to this procedure, it checks whether its parent(s) have been segmented. If not, the procedure calls itself to process those parents. A Bayesian network is a directed acyclic graph; there is no loop in the graph. Therefore, by such recursion, the ancestor nodes are eventually reached and the recursion terminates. All the nodes in the Bayesian network are processed one by one, bottom-up and then top-down. With the information on segmentation, centroids and joint probability of the parent nodes, the centroids of a node can be evaluated by applying the deterministic relationship between this node and its parents, and the continuous probability distribution of this node can be calculated. Then, the discrete conditional probability of this node conditional on its parents can be derived by integrating the continuous probability distribution over each interval of this node, for every combination of intervals of all its parents.
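The Segmentation and Conditional Probability procedures can be sketched for the simplified charging subsystem as follows. This is a hedged illustration, not the authors' program: it replaces the LSE integration mentioned above with closed-form Gaussian integrals from Python's standard library, and the prior means and deviations, the constants CAP and I_0, and SIGMA_S are all hypothetical values chosen only so that the example runs.

```python
import math
from itertools import product
from statistics import NormalDist

STD = NormalDist()  # standard normal, used for the truncated-normal centroid formula

def segment(mu, sigma, n=3):
    """Equal-probability cut points and interval centroids (Eq. 12) of N(mu, sigma)."""
    dist = NormalDist(mu, sigma)
    cuts = [-math.inf] + [dist.inv_cdf(k / n) for k in range(1, n)] + [math.inf]
    centroids = []
    for lo, hi in zip(cuts, cuts[1:]):
        pdf_lo = 0.0 if math.isinf(lo) else STD.pdf((lo - mu) / sigma)
        pdf_hi = 0.0 if math.isinf(hi) else STD.pdf((hi - mu) / sigma)
        # Mean of the normal truncated to [lo, hi]; each interval has mass 1/n.
        centroids.append(mu + sigma * (pdf_lo - pdf_hi) * n)
    return cuts, centroids

def charging_mean(v_i, v_g, nu, cap, i_0):
    """Deterministic mean of Vs, Eq. (3)."""
    a = math.exp(-i_0 / (v_g * cap * nu))
    return v_g * (1.0 - a) + v_i * a

# Hypothetical priors (mean, standard deviation) and constants; not from the paper.
PRIORS = {"Vi": (100.0, 20.0), "Vg": (600.0, 50.0), "nu": (0.25, 0.03)}
CAP, I_0, SIGMA_S = 1.0e-6, 2.0e-4, 10.0
LABELS = ("L", "M", "H")

centroids = {name: segment(mu, sd)[1] for name, (mu, sd) in PRIORS.items()}

# Mean of Vs at every combination of parent centroids (27 combinations).
means = {
    (LABELS[i], LABELS[j], LABELS[k]): charging_mean(
        centroids["Vi"][i], centroids["Vg"][j], centroids["nu"][k], CAP, I_0)
    for i, j, k in product(range(3), repeat=3)
}

def prior_cdf_vs(x):
    """CDF of the Eq. (13) mixture; independent parents give each combination weight 1/27."""
    return sum(NormalDist(m, SIGMA_S).cdf(x) for m in means.values()) / len(means)

def quantile_vs(q, tol=1e-9):
    """Solve prior_cdf_vs(x) = q by bisection."""
    lo = min(means.values()) - 10 * SIGMA_S
    hi = max(means.values()) + 10 * SIGMA_S
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prior_cdf_vs(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Equal-probability partition of the child node Vs.
vs_cuts = [-math.inf, quantile_vs(1 / 3), quantile_vs(2 / 3), math.inf]

# Eq. (14): integrate the Gaussian CPD over each Vs interval for every parent combination.
cpt = {
    combo: [NormalDist(m, SIGMA_S).cdf(hi) - NormalDist(m, SIGMA_S).cdf(lo)
            for lo, hi in zip(vs_cuts, vs_cuts[1:])]
    for combo, m in means.items()
}

# Prior of the discretized Vs and its entropy, Eq. (11).
vs_prior = [sum(row[s] for row in cpt.values()) / len(cpt) for s in range(3)]
entropy = -sum(p * math.log(p) for p in vs_prior)
print("P(Vs states):", [round(p, 4) for p in vs_prior], "entropy:", round(entropy, 4))
```

Each row of `cpt` plays the role of one column of Table 2 (one parent combination, three Vs states). Because the Vs cut points are chosen from the mixture of Eq. (13), the three Vs states come out equally likely, which is the maximum entropy condition for n = 3.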
After getting the prior probabilities of all the nodes in the Bayesian network, some conversion is necessary to transform the information into a format recognizable by the HUGIN system. A conversion procedure was written to do so. A flow chart of the discretization algorithm accompanies the original paper.

[Flow chart not reproduced. Recoverable steps: Get the node to be discretized; Has parents? (Yes / No); Parent nodes independent? (Yes / No); Marginalization; Joint probability of parents; Segmentation; Conditional probability; End.]
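The recursion and the marginalization step can be illustrated on the five-node example of Figure 5. The sketch below is an assumption-laden toy, not the authors' implementation: the function name `discretize`, the two-state tables and all probability values are hypothetical, and the actual segmentation work is reduced to recording the processing order.

```python
from itertools import product

# Parent lists for the example graph of Fig. 5 (A, B -> C; C, B, E -> D).
PARENTS = {"A": [], "B": [], "E": [], "C": ["A", "B"], "D": ["C", "B", "E"]}

def discretize(node, done, order):
    """Process `node` only after all of its parents have been processed."""
    if node in done:
        return
    for parent in PARENTS[node]:
        discretize(parent, done, order)  # the recursion ends at the ancestor nodes
    done.add(node)
    order.append(node)  # segmentation and table generation would happen here

order = []
discretize("D", set(), order)
print(order)  # ['A', 'B', 'C', 'E', 'D']

# Hypothetical two-state tables (index 0 = low, 1 = high); values are illustrative only.
P_A = [0.7, 0.3]
P_B = [0.6, 0.4]
P_C_GIVEN_AB = {
    (0, 0): [0.9, 0.1],
    (0, 1): [0.4, 0.6],
    (1, 0): [0.5, 0.5],
    (1, 1): [0.1, 0.9],
}

# Joint P(A, B, C) from the chain rule, then Eq. (15): P(B, C) = sum over A.
p_abc = {
    (a, b, c): P_A[a] * P_B[b] * P_C_GIVEN_AB[(a, b)][c]
    for a, b, c in product(range(2), repeat=3)
}
p_bc = {
    (b, c): sum(p_abc[(a, b, c)] for a in range(2))
    for b, c in product(range(2), repeat=2)
}
p_c = [sum(p_bc[(b, c)] for b in range(2)) for c in range(2)]

# B is a parent of C, so the joint differs from the product of the marginals.
print(p_bc[(1, 1)], P_B[1] * p_c[1])
```

The second printed line shows why the Joint Probability procedure needs marginalization for the parents of D: with these illustrative numbers, P(B = high, C = high) is not equal to P(B = high) P(C = high).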
[Figure 6 plot: probability of each output density level versus input density level. Panel title: Input and Output Density. Axes: input density and output density, each with levels L, ML, M, MH, H; vertical axis: probability.]
Figure 6. The tone reproduction curve (TRC) generated using the BBN

6. RESULTS
The discretized Bayesian Belief Network of the printing process has been implemented in the commercial software HUGIN, which handles the propagation and updating of the BBN. The BBN can be used in both the predictive mode and the inference mode.

To illustrate the predictive mode, the BBN is exercised to generate a probabilistic distribution of the Tone Reproduction Curve (TRC). Given a set of process parameters, the TRC is the mapping from the input image density to the output image density. In the BBN model, the TRC is generated by entering various (discretized) input densities Di and observing the probability distribution of the output image density (DMA). Such a probabilistic TRC is shown in Fig. 6. Notice the linear relationship between the input density and the most likely output density.

Next we show the ability of the BBN to perform diagnosis. Suppose an image with moderate density (Ds = "M") is to be printed, but the printed image is found to be denser than desired (DMA = "H"). To diagnose this process, we enter Ds = "M" (moderate), together with the observed DMA = "H", as evidence. After probability propagation and update, the network shows that there are two possible causes of this fault. One is that the toner charge (Q) is low, with probability 0.61; the other is that the laser power (Po) is low, with probability 0.69. If we later discover that the laser power is normal, the network asserts that the high output density is caused by low toner charge with probability 0.98.
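The way the posterior of one cause rises once the other cause is ruled out can be reproduced qualitatively with a toy model. The sketch below is not the HUGIN network: it assumes only two independent binary causes (low toner charge, low laser power), and the priors and the conditional table P_H_GIVEN are hypothetical values chosen for illustration, so the numbers it prints are not those reported above.

```python
from itertools import product

# Hypothetical priors and conditional table (assumptions, not from the paper).
P_Q_LOW = 0.1    # prior probability that toner charge Q is low
P_PO_LOW = 0.1   # prior probability that laser power Po is low
P_H_GIVEN = {    # P(DMA = "H" | q_low, po_low)
    (False, False): 0.02,
    (True, False): 0.80,
    (False, True): 0.80,
    (True, True): 0.95,
}


def posterior_q_low(po_low_known=None):
    """P(Q low | DMA = H [, Po state]) by enumerating the joint distribution."""
    num = 0.0
    den = 0.0
    for q_low, po_low in product((False, True), repeat=2):
        if po_low_known is not None and po_low != po_low_known:
            continue  # skip states inconsistent with the evidence on Po
        weight = (
            (P_Q_LOW if q_low else 1 - P_Q_LOW)
            * (P_PO_LOW if po_low else 1 - P_PO_LOW)
            * P_H_GIVEN[(q_low, po_low)]
        )
        den += weight
        if q_low:
            num += weight
    return num / den


before = posterior_q_low()
after = posterior_q_low(po_low_known=False)
print(f"P(Q low | DMA=H)            = {before:.3f}")
print(f"P(Q low | DMA=H, Po normal) = {after:.3f}")
```

With these assumed tables, observing that the laser power is normal shifts most of the posterior mass onto low toner charge, which is the same pattern as in the diagnosis example.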
Fig. 7. Screen shot of the HUGIN implementation of the BBN for the xerographic system.

7. CONCLUSIONS
A Bayesian Belief Network (BBN) model of the xerographic printing system has been developed. A continuous BBN of the process is first developed based on the physical modeling of the xerographic process. The continuous BBN is subsequently discretized based on the maximum entropy criterion. The discrete BBN is then implemented in HUGIN, a commercial software package for BBN updating and propagation. The BBN has been shown to be effective both in the prediction of print quality and in the diagnosis of faults.

REFERENCES
[1] L. B. Schein, Electrophotography and Development Physics, Springer-Verlag, 1988.
[2] D. M. Pai and B. Springett, Physics of Electrophotography, Reviews of Modern Physics, 5: pp. 163-211, 1993.
[3] G. Cooper, Computational Complexity of Probabilistic Inference using Bayesian Belief Networks, Artificial Intelligence, 42: pp. 393-405, 1990.
[4] F. Jensen, Implementation Aspects of Various Propagation Algorithms in HUGIN, Research Report R-94-2014, Department of Mathematics and Computer Science, Aalborg University, Denmark, 1994.
[5] F. V. Jensen, S. L. Lauritzen, and K. G. Olesen, Bayesian Updating in Causal Probabilistic Networks by Local Computations, Computational Statistics Quarterly, 4: pp. 269-282, 1990.
[6] F. V. Jensen, An Introduction to Bayesian Networks, Springer, New York, NY, 1996.
[7] A. R. Melnyk and D. M. Pai, Proceedings of SPIE/SPSE Symposium on Electronic Imaging, Santa Clara, p. 141, edited by J. Gaynor, SPIE, Bellingham, WA, 1990.
[8] D. A. Hayes, J. Imaging Technology, 15: p. 29, 1989.
[9] W. Hamscher, L. Console, and J. de Kleer, eds., Readings in Model-based Diagnosis, Morgan Kaufmann Publishers Inc., San Mateo, CA, 1992.
[10] G. Brewka, J. Dix, and K. Konolige, Nonmonotonic Reasoning: An Overview, No. 73 in CSLI Lecture Notes, CSLI Publications, Stanford, CA, 1997.
[11] S. Andreassen, F. V. Jensen, S. K. Andersen, B. Falck, U. Kjaerulff, M. Woldbye, A. Sorensen, A. Rosenfalck, and F. Jensen, "MUNIN - an expert EMG assistant," in Computer-Aided Electromyography and Expert Systems, pp. 255-277, J. E. Desmedt, ed., Amsterdam, 1989.
[12] D. Geiger and P. P. Shenoy, eds., Uncertainty in Artificial Intelligence: Proceedings of the Twelfth Conference 1996, Morgan Kaufmann, San Mateo, CA, 1997.
[13] S. Roman, Introduction to Coding and Information Theory, Springer, 1991.