Conceptual Ontological Object Knowledge Base And Language

Marek Krótkiewicz1 and Krystian Wojtkiewicz2

1 Institute of Mathematics and Informatics, University of Opole, [email protected]
2 Institute of Mathematics and Informatics, University of Opole, [email protected]

Summary. This paper deals with AI in terms of knowledge acquisition and the structure of the ontology base. The core of the system was designed as an object model to optimize it for further processing. Direct concept linking is used to ensure fast processing of the semantic network. Predefined attributes in the core minimize the number of basic connections within the ontology and help with inference. The system is expected to generate questions in order to make its knowledge more specific. An AI system defined in this way opens up the possibility of better understanding such basic mechanisms of the human mind as learning and analyzing.

1 Ambiguity

Ambiguity is one of the main concepts in Artificial Intelligence studies. It is mainly linked with text understanding, which is identified with interpretation, i.e. with a reference to an ontology. An ambiguous term is one with many meanings: many different meanings can be assigned to one identifier, or simply to one word. For example, the word 'date' has at least three different meanings. Conversely, some concepts can be referred to by more than one word. We call these synonymic expressions, e.g. man, person, human.

Information gaps in communication, which result from the lack of determination of all the connections between concepts, make it necessary to fill them in with guesses. Take the sentence 'John is at school', which could mean 'John is at his school' and, more precisely, 'John is at the school he studies in'. To be unequivocal, we should say that this sentence refers to the following statement: the object 'John' is in the relation 'to be at' with the object 'school', with which John is in the relation 'to study in'. However, this might not be precise enough, so we should say: the object 'John' is in the relation 'to be at' with the object 'school'. As we can see, ambiguity is the main factor affecting the process of understanding.

The methods that are currently in use focus on an analysis of direct and indirect contexts. Although they are effective, they are not designed to understand the text. They merely assume that some terms are assigned to certain concepts, and this assumption is based on the topic of the whole text. A totally different approach aims at understanding the text, which is represented as the linking of concepts derived from the text with the ontology[1]. In this way, interpretation can draw on the knowledge saved in the ontology. This aspect of building the ontology refers to the process of human mind development[2]. With an AI system configured in this way, we can examine its behavior in situations where the main factors are a changeable structure of information or the inference module. Taking this as the leading point of the study, we can try to find the source of the 'stupid questions' asked by children.
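The two phenomena described in this section, one word with many meanings and many words for one concept, can be illustrated with a small lookup table. This is only a sketch under stated assumptions: the lexicon, the concept identifiers (such as C#date_appointment) and the helper names are hypothetical and do not come from the system described in the paper.

```python
# Hypothetical toy lexicon: each word maps to the concept identifiers it may denote.
# The identifiers are invented for illustration only.
LEXICON = {
    "date": ["C#date_calendar", "C#date_appointment", "C#date_romantic"],
    "man": ["C#human"],
    "person": ["C#human"],
    "human": ["C#human"],
}


def meanings(word):
    """Return all concepts a word may refer to (empty list if unknown)."""
    return LEXICON.get(word, [])


def is_ambiguous(word):
    """A term is ambiguous when more than one concept is assigned to it."""
    return len(meanings(word)) > 1


def synonyms(word):
    """Return other words that share at least one concept with the given word."""
    concepts = set(meanings(word))
    return sorted(
        other
        for other, other_concepts in LEXICON.items()
        if other != word and concepts & set(other_concepts)
    )
```

With this table, is_ambiguous("date") holds because three concepts are attached to the word, while synonyms("man") yields the other two words that point to the same concept.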

2 Active Knowledge Acquisition

Taking the theory presented above as the leading one, an obvious conclusion is that one of the main functions of an AI system is the acquisition of knowledge, which is stored in the attributes of the semantic network elements and in the relations between those elements. This activity requires high accuracy in choosing questions. The knowledge already collected should form the basis for determining an unambiguous and correct interpretation of statements. This can start further processing, e.g. deduction. Processing ambiguous information may lead to wrong conclusions. Take, for example, the sentence 'John has a date with Ms Johnson'. During the analysis, we find that 'date' has three different meanings in the ontology. However, the semantic structure of 'date' in the calendar sense excludes it from further processing. The other two are equally possible, so this is a classical problem of ambiguity. The next step is to determine which of these meanings should be used in this case. The problem of ambiguity can be solved with questions about detailed attributes of connections in the semantic network. The knowledge-gathering action itself cannot create a wrong conclusion. Children have their own mechanism of understanding ambiguous statements, which generates such questions as: Who? What? Where? When? How?[2] To imitate an evolving mind, an AI system should behave in the same way. The number of questions is expected to decline as the knowledge base grows, analogously to human behavior.

There are a few well-known methods of knowledge representation, such as rules, semantic networks and logical notations[3]. All of them have positive and negative aspects. What is essential, however, is that the chosen method should ensure, in the best possible way, the required functionality for storing ontological knowledge, facts and rules, as well as fast and easy processing for inference, question generation and complex fact matching.
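The 'date' example from this section can be sketched as a two-step procedure: first discard the meanings that are structurally excluded, then ask a question if more than one candidate remains. The sketch below is illustrative only. The Meaning record, its accepts_with_person flag (a stand-in for the semantic structure check) and the concept identifiers are assumptions, not part of the described system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meaning:
    concept: str               # hypothetical concept identifier
    accepts_with_person: bool  # hypothetical flag: can this meaning occur "with" a person?


# Three hypothetical meanings of 'date'; the calendar sense is structurally excluded.
DATE_MEANINGS = [
    Meaning("C#date_calendar", False),
    Meaning("C#date_appointment", True),
    Meaning("C#date_romantic", True),
]


def interpret(word, meanings, answers=None):
    """Return (concept, None) when the word is resolved, else (None, question)."""
    # Step 1: structural filtering removes meanings that cannot fit the statement.
    candidates = [m for m in meanings if m.accepts_with_person]
    # Step 2: knowledge acquired from an earlier answer narrows the choice further.
    if answers and word in answers:
        candidates = [m for m in candidates if m.concept == answers[word]]
    if len(candidates) == 1:
        return candidates[0].concept, None
    # Step 3: remaining ambiguity produces a question instead of a guess.
    options = ", ".join(m.concept for m in candidates)
    return None, f"Which meaning of '{word}' is intended: {options}?"
```

Calling interpret("date", DATE_MEANINGS) returns a question listing the two remaining meanings; once an answer is stored, the same call with answers={"date": "C#date_romantic"} resolves without asking, which mirrors the expected decline in the number of questions as knowledge grows.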


3 Main Features of COOS

The subject area of this study includes the structure of the ontological base shown in Fig. 1. During its construction, the main assumption was to ensure the maximal capacity of knowledge directly accessible through the connections between the specified elements: CONCEPT, ASSOCIATION, PREPOSITION, CLASS, FEATURE, VALUE, DATATYPE, each being a part of a given microtheory, which contains rules. The implemented semantic network between concepts ensures the possibility of a direct search for essential elements. This is very important for question generation. For example, CLASS has a structural connection with ASSOCIATION, FEATURE and PREPOSITION, but it does not have a direct link with VALUE. This follows from the assumption that VALUE cannot refer to CLASS. The statement 'red car' (VALUE:='red' and CLASS:='car') within the presented ontology has to be completed with the information connecting them, i.e. FEATURE. This statement should be presented as 'car of red color' or 'red color car', i.e. C#car [F#color V#red].

[Figure 1: class diagram. Classes and attributes recoverable from the diagram:
CONCEPT: ID : long; Name, Symbol, Comment, Type : AnsiString; CompleteSpecialization, Emotion : float; Abstract : bool; Generalization, Specialization, Part_of, Parts, Synonym, Antonym : List CONCEPT; Rules : List RULE
CLASS: Object : bool; Plural, Physical, Enumerable, Collective, Animated : tribool; Feature : List FEATURE; Association : List ASSOCIATION; Preposition : List PREPOSITION
FEATURE: Enumerable : float; Class : List CLASS; Feature : List FEATURE; Association : List ASSOCIATION; DataType : List DATATYPE; EnumValue : List VALUE
VALUE: Value : AnsiString; DataType : List DATATYPE; Feature : List FEATURE
DATATYPE: Symbol : AnsiString; Qualitative, Enumerable : float; Value : List VALUE; Feature : List FEATURE
RELATION {abstract}: Symmetry, Reflexivity, Transitivity : float; Inverse : RELATION*; R_Class : List CLASS
PREPOSITION: Question : AnsiString; Class : List CLASS; Association : List ASSOCIATION
ASSOCIATION: Periodicity, Permanent, Past, Present, Future, Event : float; Activity, State : tribool; Time, Space : float; L_Multiplicity, R_Multiplicity, L_Multiplicity_min, L_Multiplicity_max, R_Multiplicity_min, R_Multiplicity_max : int; L_Class : List CLASS; Exclude : List ASSOCIATION; Feature : List FEATURE; Preposition : List PREPOSITION
MODALITY: Binary : bool
RULE: ID_N : long; Rule_Contents : AnsiString; Generalization, Specialization, Part_of, Parts : List RULE
MICROTHEORY: Concepts, Rules : TStringList
Operations shown in the diagram: Save() : int, Load() : long
Link labels: Hypernym/Hyponym, Holonym/Meronym, Synonym, Antonym, Inverse, Exclude, used_by, feature_class, feature_datatype, value_datatype, class_relation, feature_relation, F-V, C-P, P-A]

Fig. 1. The object model of the ontological base structure

The studies of the ontological base structure show the following categories of information gaps:

• a lack of concepts in the ontological database,
• a lack of information, e.g. attribute values of already known concepts,
• a lack of information on concept linking in the database.
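The three categories of gaps listed above can be sketched as a check over a statement such as C#car [F#color V#red]. This is a minimal illustration, assuming a simplified storage in which concepts are kept in a dictionary of attribute values (None meaning "not yet known") and admissible connections in a set of pairs. The function name find_gaps and the gap labels are hypothetical.

```python
def find_gaps(concepts, links, statement):
    """Classify information gaps for a chain of symbols, e.g. class -> feature -> value.

    concepts:  dict mapping a symbol to its attribute dict (None = unknown value)
    links:     set of (left, right) pairs that are known to be admissible
    statement: sequence of symbols in the order they are connected
    """
    gaps = []
    # Categories 1 and 2: unknown concepts and unknown attribute values.
    for symbol in statement:
        if symbol not in concepts:
            gaps.append(("missing_concept", symbol))
            continue
        for attribute, value in concepts[symbol].items():
            if value is None:
                gaps.append(("missing_attribute", f"{symbol}.{attribute}"))
    # Category 3: known concepts that are connected in the statement
    # but not (yet) linked in the ontology.
    for left, right in zip(statement, statement[1:]):
        if left in concepts and right in concepts and (left, right) not in links:
            gaps.append(("missing_link", f"{left} -> {right}"))
    return gaps


# Hypothetical ontology state: 'car' and 'color' are known, 'red' is not.
concepts = {"C#car": {"Physical": True, "Animated": None}, "F#color": {}}
links = set()
gaps = find_gaps(concepts, links, ["C#car", "F#color", "V#red"])
```

Each entry in the result corresponds to one question the system would have to ask before the statement can be stored without guessing.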

If the system does not have a concept in its base, it will ask for it. This should be interpreted as an attempt to create a new object and to fill in its attributes with appropriate values. The lack of information about specific values of essential attributes generates a question that enables these values to be filled in. If the ontological database contains no connection between concepts that are connected in the statement, the system's question will concern the correctness of the given connection. For example, the concepts C#car and F#color may not be linked in the ontology. That means the system does not know whether the class 'car' is connected with the feature 'color'. If the system obtains the information that this connection is allowed, its future uses will be clear. Moreover, the system will use this information in any other statement that contains the class 'car' to generate questions about this feature. The same rule applies to all the other connections within the ontological database.

The presented structure is the core of the ontology. It allows for an instant verification of the admissibility of concept connections. The core contains all the information about the essential attributes of each concept. A large number of basic attributes considerably decreases the number of connections. For each concept, the attributes needed for its correct interpretation are specified. Not only do they make the system operate faster, but they are also used to distinguish between different categories. Basic connections and object model properties are implemented directly in the structure. Most of them are described below.

The class CONCEPT has a number of technical attributes, such as ID, Name, Symbol, Comment and Type, whose meaning is obvious or is not essential here, so we do not characterize them. The attribute CompleteSpecialization is bound with Specialization, which contains the list of concepts that are specializations of a given concept according to the specialization-generalization scheme.
If CompleteSpecialization is set to one, the list Specialization is complete, and no more concepts can be added to it. Generalization contains the list of all the concepts that are generalizations of a given concept. The attributes Part_of and Parts describe the 'whole-part' dependency. The lists of synonyms and antonyms of the concept are located in the attributes Synonym and Antonym. The additional attribute Rules is a list of the rules in which the concept is used. Its task is to considerably accelerate the context search for concepts, which is of high importance for the analysis of the ontological database.

The class named CLASS describes the objects representing classes and objects present in the knowledge base. The attribute Object distinguishes objects from classes. An example of a class is C#car, and of an object C#Alan_Turing. The meaning of the attributes Plural, Physical, Enumerable, Collective and Animated is compliant with their intuitive meanings. The lists FEATURE, ASSOCIATION and PREPOSITION describe the possibilities of linking the elements of those lists with a given class.

The objects of class FEATURE contain information about the features that can be related to CLASS, FEATURE, ASSOCIATION, DATATYPE or VALUE. The limitations of those links are set by the values of the attributes Class, Feature, Association, DataType and EnumValue.

VALUE is a class with only three attributes, of which Value is a string representing a value of this class, e.g. V#5 or V#red. The lists DataType and Feature are analogous to the previously described ones: they set the scope of possible connections between the mentioned concepts.

The class RELATION is an abstract class that generalizes the classes PREPOSITION and ASSOCIATION. The attributes Symmetry, Reflexivity and Transitivity are the basic properties of a relation. The attribute Inverse points to the inverse of a given relation, and R_Class is a list of admissible right operands of the relation. The class PREPOSITION contains the lists Class and Association and the attribute Question, which contains question expressions appropriate for a given preposition.

The class ASSOCIATION describes the key element of the majority of statements. The attribute Periodicity describes the periodicity of the association. Permanent carries information about its permanency. The attributes Past, Present and Future are used for positioning facts on the time axis. Event indicates whether the duration of the association is important or not. Activity and State are used to distinguish states, e.g. A#sleep, from activities, e.g. A#go. The attributes Time and Space contain information about invariability in time and space. The attributes in the Multiplicity group determine the multiplicity of a given association or the range within which the multiplicity may be set. L_Class is a list of allowed left operands. A list of concepts excluded from simultaneous use is stored in the attribute Exclude. Feature and Preposition are the lists of features and prepositions suitable for use with a given association.

The class MODALITY contains the description of the objects making up the modalities of statements.
If the attribute Binary has the value True, then the modality is bipolar.

The class RULE has the main attribute Rule_Contents, which is the text representation of a rule. The lists Generalization, Specialization, Part_of and Parts hold pointers to the rules that are in the 'generalization-specialization' or 'whole-part' relation with a given rule. MICROTHEORY is a class describing fields that are aggregations of concepts and rules. Thanks to it, it is possible to move quickly through the inheritance hierarchy, which makes the verification of related features or creatable relations efficient, along with other aspects of the object model. Logical attributes are of the type tribool to enable operation in a three-valued logic. Answering problem questions such as 'why?' or 'what for?' is delegated to the inference module.

The general scheme of the module-oriented system structure is presented in Fig. 2. It shows the transfer of information from the outside of the system to the knowledge base and in the opposite direction. One of the most important features of this construction is the possibility of multilingual text interpretation and translation.
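The three-valued logic behind the tribool attributes mentioned above can be sketched as follows. The paper does not state which three-valued logic is used, so this sketch assumes Kleene's strong logic, with Python's None standing in for the third ("unknown") value. The function names are hypothetical.

```python
from typing import Optional

# Assumption: a tribool is modelled as True, False, or None ("unknown").
Tribool = Optional[bool]


def tri_and(a: Tribool, b: Tribool) -> Tribool:
    """Kleene conjunction: False dominates, otherwise unknown propagates."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def tri_or(a: Tribool, b: Tribool) -> Tribool:
    """Kleene disjunction: True dominates, otherwise unknown propagates."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def tri_not(a: Tribool) -> Tribool:
    """Negation leaves the unknown value unknown."""
    return None if a is None else not a


def needs_question(value: Tribool) -> bool:
    """An unknown attribute value is a candidate for a generated question."""
    return value is None
```

Under this assumption, an attribute such as Animated can stay unknown without being treated as false, and the unknown state is what would trigger a question about the attribute.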


[Figure 2: block diagram with the elements: Natural Language translator (1), Natural Language translator (2), ..., Natural Language translator (n); Statement (metalanguage); Rules; Metalanguage Interpreter; Inference Module; Ontological Core.]

Fig. 2. The general diagram of the connections between the main system elements

4 Metalanguage

Metalanguage was created to make it possible to store and process knowledge. It serves as an interface between the knowledge base and natural languages[4][5]. A series of assumptions was made while developing the grammar and semantic rules. The metalanguage has to be accurate: the possibility of multiple interpretations has to be limited, and the structure has to force the use of some specifications. It is also very important for this language to be as compact as possible, but at the same time easy to interpret. The latter requirement stands in opposition to the assumption of simplicity of automatic processing for different forms of inference. The basic features of the metalanguage are presented below using examples drawn from the paper[6]. To provide a comparison with other languages of this type, the examples are given in: English (E), Formalized-English (FE), Frame Conceptual Graphs (FCG), Conceptual Graphs Linear Form (CGLF), Conceptual Graphs Interchange Format (CGIF), Knowledge Interchange Format (KIF) and Conceptual Ontological Object Language (COOL), along with its simplified grammar tree.

Example 1

E: Tom owns a dog that is not Snoopy.
FE: Tom is owner of a dog different_from Snoopy.
FCG: [Tom,owner of: (a dog != Snoopy)]
CGLF: [T:Tom] [...]

[...] [time:"2002"] ] }
CGIF: [proposition *p: (agent [liking *l] Mary) (object ?l Tom) ] (believer [situation: (time[situation: ?p] "2002")] Tom) (believer [situation: (before [situation: ~[?p]] "2002")] Tom)
KIF: (exists (?p) (and (= ?p '(exists((?x liking)) (and (agent *l Mary)(object ?l Tom))))
     (believer ^(time ,?p 2002) Tom) //',?p'->the value of ?p is quoted
     (believer ^(before (not ,?p) 2002) Tom)))
COOL: {C#Tom} A#believe ({C#Mary} A#like[Present=1][F#time D#year V#2002] {C#Tom}, {C#Mary} ~A#like[Past=1] {C#Tom})
Tree: A#believe C#Tom A#like[Present=1] C#Mary C#Tom A#believe C#Tom ~A#like[Past=1] C#Mary C#Tom


The example shows the use of the time attributes Present and Past, written as [Present=1] and [Past=1] in the COOL form. It also presents the representation of nested associations. The association A#believe has a right operand in the form of a list, because Tom's belief refers to two facts, which may be written down separately. The rule can be explained as follows: {X#a} A#r (expr_1, expr_2) means {X#a} A#r expr_1 && {X#a} A#r expr_2, and {X#a} A#r (expr_1; expr_2) is equal to {X#a} A#r expr_1 || {X#a} A#r expr_2. Here && stands for a conjunction and || for an alternative (disjunction).
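The expansion rule for list operands can be sketched as a small string transformation. This is not the interpreter described in the paper, only an illustration of the rule as stated: a comma-separated list expands into a conjunction and a semicolon-separated list into an alternative. The helper names are hypothetical, and the handling of lists that mix both separators is an assumption, since the text does not define it (the comma is tried first here).

```python
def split_top_level(text, separator):
    """Split text on a separator, ignoring separators nested in (), [] or {}."""
    parts, depth, current = [], 0, []
    for char in text:
        if char in "([{":
            depth += 1
        elif char in ")]}":
            depth -= 1
        if char == separator and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    parts.append("".join(current).strip())
    return parts


def expand(subject, association, operand):
    """Expand 'subject association (e1, e2)' or '(e1; e2)' into separate statements.

    Assumes a parenthesized operand is a single list enclosed in one pair of brackets.
    """
    operand = operand.strip()
    if not (operand.startswith("(") and operand.endswith(")")):
        return f"{subject} {association} {operand}"  # plain operand, nothing to expand
    inner = operand[1:-1]
    for separator, joiner in ((",", " && "), (";", " || ")):
        parts = split_top_level(inner, separator)
        if len(parts) > 1:
            return joiner.join(f"{subject} {association} {part}" for part in parts)
    return f"{subject} {association} {inner.strip()}"
```

Applied to the COOL form of the example, expand("{C#Tom}", "A#believe", ...) with the parenthesized operand yields two A#believe statements joined by &&, one for each of the two facts Tom believes.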

5 Conclusion

Optimal ontology development is still the main concern of studies on AI systems. Some AI systems are designed to imitate human intelligence. They often just try to fulfill the Turing criteria, but the only rational way is to build a system that can understand any given information. Understanding means the possibility of writing knowledge into the ontology base and of its further verification and use in inference. The value of the Turing criteria is disputable, and there are other conditions to fulfill, such as efficiency in searching the knowledge base, verification of the semantic correctness of statements, complete answers providing detailed features, the relations possible for concepts, generalizations or specializations, etc. All of these can be fulfilled by the structure presented in this article. The article describes mainly the core of the system and gives a short introduction to the metalanguage. A complete and compact description of the metalanguage and the realization of the semantic network will be presented in subsequent articles.

References

1. Dyachko A.G. (1997) Text Processing and Cognitive Technologies, Moscow, Pushchino, ONTI PNTS RAN
2. Włodarski Z. (1996) Psychology of learning, PWN Warszawa (in Polish)
3. Fuchs N.E., Schwertel U., Torge S. (1999) Controlled Natural Language Can Replace First-Order Logic. 14th IEEE International Conference on Automated Software Engineering, Cocoa Beach, Florida
4. Landauer T.K., Littman M.L. (1990) Fully automatic cross-language document retrieval using latent semantic indexing. Sixth Annual Conference of the UW Centre for the New Oxford English Dictionary and Text Research. Waterloo Ontario: UW Centre for the New OED and Text Research, pp. 31-38
5. Frederking R., Mitamura T., Nyberg E., Carbonell J. (1997) Translingual information access, AAAI Symposium on Cross-Language Text and Speech Retrieval. American Association for Artificial Intelligence
6. Martin P. (2002) Knowledge representation in CGLF, CGIF, KIF, Frame-CG and Formalized-English, 10th International Conference on Conceptual Structures, Borovets, Bulgaria, Springer Verlag, LNAI 2393, pp. 77-91
