Copyright © IFAC Artificial Intelligence in Agriculture, Wageningen, The Netherlands, 1995
AN EXPERT SYSTEM FOR WHEAT DISORDERS DIAGNOSIS AND TREATMENT USING A HIERARCHICAL CLASSIFICATION PROBLEM SOLVER
S. El-Beltagy & A. Rafea
Central Lab for Agricultural Expert Systems, Agricultural Research Center, Ministry of Agriculture and Land Reclamation, Cairo, Egypt
A. Kamel & J. Sticklen
Intelligent Systems Laboratory, Computer Science Department, Michigan State University, East Lansing, MI 48824
U. Schulthess & R. Ward
Crop and Soil Sciences Dept., Michigan State University, East Lansing, MI 48824
Abstract: This paper discusses work done on an Expert System for the diagnosis and treatment of wheat disorders using a Hierarchical Classification Problem Solver. The criteria on which the classification model was built are described. A method for promoting an intelligent dialogue with the user, initially designed to resolve a conflict that arose between different branches of the classification hierarchy, is also addressed.
Keywords: Expert Systems, Diagnosis, Knowledge-based systems, Hierarchical structures, Classification, Intelligence, Dialogue, Database.
1. INTRODUCTION

The application of classification, especially to diagnostic problems, has been gaining increasing attention in AI research (Gomez & Chandrasekaran, 1981; Weiss & Kulikowski, 1984; Clancey, 1985). In this paper, an Expert System for the Diagnosis and Treatment of Wheat Disorders using a Hierarchical Classification Problem Solver is described. This Expert System is part of a larger project for irrigated wheat management in Egypt. The aim of the project is to develop an integrated system that will address all aspects of wheat management from both a strategic and a tactical point of view. The strategic aspect of this system involves decisions to be taken with respect to varietal selection, optimal planting and harvesting dates, preplanting parameters, and irrigation/fertilization management. The tactical aspect, on the other hand, deals with insect, malnutrition, disease, and weed identification and remediation. In addition, an integration between this system and a simulation model based on the CERES wheat model (Ritchie, Godwin, & Otter-Nacke, 1985) is planned.

This paper addresses two main issues: (1) the criteria on which the system's classification model was based, and (2) a data inferencing mechanism that was designed in order to promote an intelligent dialogue with the user.

2. CONCEPTUAL MODEL REPRESENTATION

In developing the Expert System for Diagnosis and Treatment of Wheat Disorders in Egypt, the Generic Task approach to expert system development proposed by Chandrasekaran (1986) has been followed. The idea behind the Generic Task approach is that the way a problem is to be solved depends largely on its type, e.g. diagnosis, design, etc. Consequently, problems of the same type could share a generic problem solver. So, according to the Generic Task methodology, the approach to a diagnosis problem will be inherently the same regardless of the domain in which the problem is being addressed. The classical example of a problem solver that could be applied to a diagnosis problem is Hierarchical Classification (Gomez & Chandrasekaran, 1981; Chandrasekaran, 1983), and it is this problem solver that has been used in implementing the wheat disorders expert system.

2.1. Hierarchical Classification

Hierarchical Classification is a method for organizing, controlling and choosing from a set of hierarchically organized hypotheses. In this hierarchy, knowledge is organized in such a way that one moves from the more general hypotheses to the more specific as one descends the tree structure, as shown in Fig. 1. Using a control strategy known as "establish and refine", the hypotheses are explored top-down. If a hypothesis at the top level establishes, its immediate descendants are required to establish themselves one by one. This process of attempting to establish the descendants is referred to as "refining" the parent hypothesis. If any of the descendants is established, it repeats the same action, and so on until a leaf is established. If, on the other hand, a hypothesis fails to establish, then it is ruled out, and so are all the hypotheses beneath it in the hierarchy. The entire process is very similar to that employed by a depth-first search technique.

Fig. 1. A Hierarchical Tree Structure. (Diagram: the most general hypothesis at the root, less general hypotheses below it, and the most specific hypotheses at the leaves.)

2.2. Representation model

At this point, a clarification of what is meant by "wheat disorders" is in order. In this paper, wheat disorders are disorders caused by insects, nutritional deficiencies, or diseases. Although each of these disorders represents a different scope within the wheat domain, they have been integrated into one classification system. In the initial design, though, attempts were made to separate them into three classifications, thus allowing each subsystem to apply the classification that was best fitted to its nature. They were to be integrated into a hierarchical classifier where, at the top level, one or more observations would determine which of the three classifications would be pursued. However, as the process of knowledge elicitation progressed, it was discovered that such a distinction was not possible at a high level in the hierarchy. The three subsystems shared many common observations, which meant that in order to differentiate between them the system would have to ask about very specific disorder knowledge at a high level in the hierarchy. In this case, what is called a high level in the hierarchy would have contained almost all the disorders that represent the leaf nodes, and the entire concept of hierarchical classification would have been undermined. Due to these findings, it was concluded that this design was not a valid one.

In the second attempt at the design, it was decided that the commonalities between the three subsystems should be used in order to achieve a better classification. By that time, sufficient knowledge about each of the subsystems had been acquired to extract enough commonalities between them and thus create a unified classification. The most general commonality between diseases, insects and nutritional deficiencies was found to be the effects they have on different plant parts. These effects are manifested in the form of observable symptoms such as leaf discoloration, stem deformation, etc. Thus, the second classification was based on observations visible to the user on different plant parts (Fig. 2).

Fig. 2. A simplified diagram of the adjusted classification. (Diagram: root node "Wheat Disorder".)

By asking the user what growth stage the plant is in, the system is able to determine which plant parts to ask about. Further questions reveal whether a plant part is abnormal or not. Once a plant part is known to be abnormal, the system can ask about the specific abnormality that this part manifests, and so on until it reaches very specific observations on that part. At this point, if the system suspects a given disorder, it tries to establish it by asking all questions relating to that disorder, even if such observations are on another plant part.
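As an illustration only, the establish-and-refine strategy described earlier can be sketched over a small plant-part hierarchy of the kind just outlined. This is a minimal sketch, not the system's implementation: the `Hypothesis` class, the `establish_and_refine` function, the observation keys, and the simplified rule separating Leaf rust from Stem rust are all assumptions introduced here, and observations are supplied up front rather than gathered by asking questions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Observations = Dict[str, bool]


@dataclass
class Hypothesis:
    """One node of the classification hierarchy (hypothetical structure)."""
    name: str
    establishes: Callable[[Observations], bool]
    children: List["Hypothesis"] = field(default_factory=list)


def establish_and_refine(node: Hypothesis, obs: Observations) -> List[str]:
    """Return the established leaf hypotheses, exploring depth-first."""
    if not node.establishes(obs):
        return []  # ruled out, together with every hypothesis beneath it
    if not node.children:
        return [node.name]  # an established leaf, i.e. a disorder
    established: List[str] = []
    for child in node.children:  # "refine" the established parent
        established.extend(establish_and_refine(child, obs))
    return established


def observed(key: str) -> Callable[[Observations], bool]:
    """Build a test that holds when the named observation is True."""
    return lambda obs: obs.get(key, False)


# Illustrative hierarchy only; the rust rules are deliberately simplified.
wheat_disorder = Hypothesis("Wheat disorder", lambda obs: True, [
    Hypothesis("Leaves abnormal", observed("leaves_abnormal"), [
        Hypothesis("Reddish brown pustules on leaves",
                   observed("reddish_brown_pustules_on_leaves"), [
            Hypothesis("Leaf rust",
                       lambda obs: not obs.get("reddish_brown_pustules_on_stem", False)),
            Hypothesis("Stem rust", observed("reddish_brown_pustules_on_stem")),
        ]),
    ]),
])

result = establish_and_refine(wheat_disorder, {
    "leaves_abnormal": True,
    "reddish_brown_pustules_on_leaves": True,
    "reddish_brown_pustules_on_stem": True,
})
print(result)
```

Note that the Stem rust leaf in this sketch already depends on an observation about another plant part, which is the situation discussed in the implementation section.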
3. IMPLEMENTATION

This system has been implemented using a Generic Task Tool developed at Michigan State University (MSU). In this tool, a node is represented by a table matcher where each entry in the table represents either a database variable or another matcher. Each database variable is associated with a question. A user will be presented with the question only if the database variable has never been assigned a value. The combination of possible inputs for each question denotes different rules and matching patterns. If a combination of inputs results in a match value greater than a given threshold, the node is said to be established. By asking the user a series of questions, the system is able to pursue or rule out paths in the classification in which the leaves represent disorders (Fig. 2). Basically, if a path from the root to a leaf exists, then the disorder at that leaf is present.

Using the classification model detailed above, an implementation problem was faced. This problem is a general one that could be encountered by anyone using Hierarchical Classification. The problem is encountered when a leaf node needs more than the information found in its path to establish. Typically, the information it needs will be contained in another path. In our case, the problem became apparent when the system had to ask questions that were very specific to other plant parts in order to establish a given disorder. The following case is an example: the system has established that there are reddish brown pustules on the leaves, but it still needs to know whether there are similar pustules on the stem in order to determine whether the disease is Stem rust or Leaf rust (both diseases result in reddish brown pustules on the leaves). At this point, the system will ask the user whether there are reddish brown pustules on the stem. Assuming that the leaves' path is pursued before the stem's, if the user confirms the stem observation, asking him later whether there is any abnormality on the stem (a question needed to establish the stem node) or whether there are any pustules on the stem will seem illogical (Fig. 3), and might cause the user to lose confidence in the system. Since this system deals with a large number of disorders, some of which require the opposite order of inferencing, the simple solution of reversing the paths would never have worked.

Fig. 3. An illustration of the sequence of questions showing two illogical questions at the end. (Sequence shown, as database variable followed by its question:)
Reddish_Brown_Pustules_on_Leaves: Are there any reddish brown coalescing pustules on the leaves and sheath?
Reddish_Brown_Pustules_on_Stem: Are there any reddish brown coalescing pustules on the stem?
(More questions about the leaves.)
Stem_Abnormal: Is there any visible abnormality on the stem? (illogical question)
Pustules_on_Stem: Are there any pustules on the stem? (illogical question)

4. DATA INFERENCING

In the problem described in the previous section, a conflict arose as a result of asking about a specific observation in another branch of the hierarchy before asking about its more general observations. To overcome this problem, two solutions were proposed.

First Solution: Guarding variables by their path. If, before asking a specific question about a plant part in another path, the system ensured that all general observations relating to that part were asked first, this problem could be avoided. This could be done by including all the variables in the path preceding the specific observation on that other plant part in the table matcher of the disorder the system attempts to establish. This solution basically means that if an observation is needed from another branch in the hierarchy, the complete path needed to reach that observation will be pursued, and control will then return to the active branch. In the example given in Fig. 3, this means that the last two questions will precede the Reddish_Brown_Pustules_on_Stem question.

However, after implementing this solution it was found that the system became very confusing to the user, especially when the paths were long. For example, the user would be asked many questions about the leaves, then general-to-specific questions would be asked about the stem, and then all of a sudden the user would be asked about the leaves again. This solution also posed maintainability complications and, as a result, was discarded.

Second Solution: Building an inference mechanism that will cause the expert system to act in a more "intelligent" way. It could be stated that if a specific observation is true, then so is its general observation. So, it was concluded that if the system could logically infer the value of one database variable (e.g. a general observation variable) from the value of another database variable (e.g. a specific observation variable), then an optimal solution would be reached. In the example given above, given that the observation on the stem is 'True', i.e. there are reddish brown pustules on the stem, it should be deduced that the stem is abnormal even though the question of stem abnormality has never been asked; thus the database variable containing this information should be set to 'True' as well.

The method followed in order to bring that concept into effect is as follows. A framework in which relations between database variables could be represented was created. Using the concept of dependency, relations between variables were defined. For example, if variable X has as its dependents the variables V, Z, and Y, then if variable X takes on a value of 'True', all its dependents (V, Z, and Y) are set to 'True'; otherwise its dependents are left unassigned. In the example given, the variable Reddish_Brown_Pustules_on_Stem will have as its dependents Stem_Abnormal and Pustules_on_Stem. The example above represents what could be called a (True True) dependency.

After implementing this sort of dependency and observing its effect in increasing the efficiency of the expert system, it was decided to extend this concept even further in order to cover all possible Boolean and non-Boolean combinations. However, this extension was no longer meant to solve a problem, but to enforce a more intelligent behavior on the expert system and thus give its user the benefit of an "intelligent dialogue". In order to achieve this extension, a set of dependencies was identified. These dependencies include the following:

(True False): If the value of one variable is true, then the value of its dependent should be false.
(False True): If the value of one variable is false, then the value of its dependent should be true.
(False False): If the value of one variable is false, then the value of its dependent should be false.
(True Value): If the value of one variable is true, then the value of its dependent should be set to a given value.
(False Value): If the value of one variable is false, then the value of its dependent should be set to a given value.
(Value True): If the value of one variable is equal to a given value, then the value of its dependent should be true.
(Value False): If the value of one variable is equal to a given value, then the value of its dependent should be false.
(Value Value): If the value of one variable is equal to a given value, then the value of its dependent should be set to another given value.

Here, Value is any value assigned to a non-Boolean variable. The (Value Value) dependency is the most general dependency form, as it encompasses all the other dependencies. However, the other forms are presented because they are related to implementation issues.
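The dependency mechanism can be illustrated with a short sketch. It is not the code of the MSU Generic Task Tool: the `Database` class, its method names, and the choice never to overwrite a value that is already assigned are assumptions made here. Every dependency is stored in the general (Value Value) form, so a (True True) dependency is simply the case where both values are `True`.

```python
from typing import Any, Callable, Dict, List, Tuple


class Database:
    """Hypothetical store of database variables with value dependencies."""

    def __init__(self, ask: Callable[[str], Any]) -> None:
        self.ask = ask  # callback that puts a variable's question to the user
        self.values: Dict[str, Any] = {}
        self.asked: List[str] = []  # record of the questions actually asked
        # source variable -> list of (trigger value, dependent, value to set)
        self.dependencies: Dict[str, List[Tuple[Any, str, Any]]] = {}

    def add_dependency(self, source: str, when: Any,
                       dependent: str, then: Any) -> None:
        """Register: if `source` takes the value `when`, set `dependent` to `then`."""
        self.dependencies.setdefault(source, []).append((when, dependent, then))

    def assign(self, name: str, value: Any) -> None:
        """Assign a value and propagate it to dependents whose trigger matches."""
        self.values[name] = value
        for when, dependent, then in self.dependencies.get(name, []):
            # Assumption: an already assigned dependent is left untouched.
            if value == when and dependent not in self.values:
                self.assign(dependent, then)

    def get(self, name: str) -> Any:
        """Ask the question only if the variable has never been assigned."""
        if name not in self.values:
            self.asked.append(name)
            self.assign(name, self.ask(name))
        return self.values[name]


# Simulated user answers for the example of Fig. 3.
answers = {
    "Reddish_Brown_Pustules_on_Leaves": True,
    "Reddish_Brown_Pustules_on_Stem": True,
}
db = Database(ask=lambda name: answers[name])
db.add_dependency("Reddish_Brown_Pustules_on_Stem", True, "Stem_Abnormal", True)
db.add_dependency("Reddish_Brown_Pustules_on_Stem", True, "Pustules_on_Stem", True)

for variable in ("Reddish_Brown_Pustules_on_Leaves",
                 "Reddish_Brown_Pustules_on_Stem",
                 "Stem_Abnormal",
                 "Pustules_on_Stem"):
    db.get(variable)

print(db.asked)
```

In this sketch only the first two questions reach the user; the values of Stem_Abnormal and Pustules_on_Stem are inferred, so the two illogical questions of Fig. 3 are never asked.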
5. CONCLUSION

In this paper, an experience in building an expert system for wheat disorders diagnosis and treatment using a hierarchical classification problem solver was presented. Among the problems encountered in designing this system was the problem of logical inconsistencies resulting from inter-links between knowledge in the different branches of the classification hierarchy. To overcome this problem, an inferencing mechanism was devised. The wheat disorders diagnosis system was implemented using this inferencing approach. The system, which is currently undergoing field testing, has demonstrated both acceptable behavior and enhanced performance. Any design based on the Generic Task Hierarchical Classification is prone to the problem encountered in designing the expert system for wheat disorders. This is especially true when logical dependencies between variables exist without there being a way of knowing which variables will be assigned a value first. By applying the inferencing mechanism described, not only will such problems be solved, but a more intelligent behavior could be enforced. So, although this concept was essentially developed to solve the problem mentioned above, it has been found to be a powerful inferencing mechanism that could be used to enhance performance even in situations where no such problems are encountered. Basically, it could be applied in any model where there exist relations between variables such that the value of one variable could be deduced from another.

6. ACKNOWLEDGMENTS

We would like to thank Dr. Rafeh and Dr. Salah for their support, and Dr. Abd El Maboud, Dr. Abd El Ghani, Dr. El Dawoudy, Dr. Hammam, and Dr. Soliman for their knowledge. We would also like to thank engineers E. Fahmi, K. El Bahnasi, and M. Ismail for acquiring and coding that knowledge. This research is supported by joint USDA (OICD) / NARP funding.

REFERENCES

Chandrasekaran, B. (1983). Towards a Taxonomy of Problem Solving Types. AI Magazine, 4(1), 9-17.
Chandrasekaran, B. (1986). Generic Tasks in Knowledge-Based Reasoning: High-Level Building Blocks for Expert System Design. IEEE Expert, 1(3), 23-30.
Clancey, W. J. (1985). Heuristic Classification. Artificial Intelligence, 27(3), 289-350.
Gomez, F., & Chandrasekaran, B. (1981). Knowledge Organization and Distribution for Medical Diagnosis. IEEE Transactions on Systems, Man, and Cybernetics, SMC-11(1), 34-42.
Ritchie, J. T., Godwin, D. C., & Otter-Nacke, S. (Ed.) (1985). CERES Wheat. A Simulation Model of Wheat Growth and Development. College Station, Texas: Texas A&M University Press.
Weiss, S. M., & Kulikowski, C. A. (1984). A Practical Guide to Designing Expert Systems. Totowa, New Jersey: Rowman and Allanheld.