A FRAME-BASED LEXICON FOR ARABIC LANGUAGE

The 24th Annual Conference on Statistics, Computer Science and Operations Research, Inst. of Statistical Studies & Research, Cairo University, Egypt, 23-25 December, 1989.

Dr. Ahmed A. Rafea
Inst. of Statistical Studies & Research, Cairo University, Tharwat St., Orman, Giza, Egypt

Khaled F. Shaalan
Inst. of Statistical Studies & Research, Cairo University, Tharwat St., Orman, Giza, Egypt

1. INTRODUCTION

The lexicon is a very important component of any NLP system. It must contain the enormous amount of lexical knowledge that robust NLP systems require, and organizing this knowledge has become one of the major problems in developing an NLP system [PUS 87]. In our case, we decided to store only the lexical knowledge related to understanding the stem. Other lexical knowledge that does not relate to understanding (e.g. language synthesis) is not considered for the time being; however, we think that the same methodology can be applied to it. One possible way to design and implement the lexicon was to store each stem and its attributes in a database file. This solution would suffer from the following disadvantages:



• No procedure can be attached to a field.
• As the number of stem attributes may be enormous, especially if all the attributes serving all purposes are to be stored, the size of the database will grow very fast.
• Any interpretation of this enormous number of stems and their attributes according to any lexical criterion will need a dedicated routine and will take a very long time to produce.

For these reasons, we decided to use a frame-based lexicon, which overcomes most of these problems. The system is based on the frame method for knowledge representation [BRO 87], [FIK 85], [WAT 86], [MIN 86]. An early attempt to study the use of frames to represent lexical knowledge was made by Rafea [RAF 86].

2. METHODOLOGY

The lexicon is represented as a generic-frame along with its instances. The generic-frame represents a class of stems having the same attributes. A stem, in our definition, is a verb, a noun, or a particle without any inflectional symbols. For a noun, it is the singular masculine indefinite form, or a broken plural (an irregular Arabic noun plural). For a verb, it is the past tense verb as it appears in standalone form. The generic-frame has two slots:

• The first slot is filled with a noun, verb, or particle frame.
• The second slot is filled with all the stems having the same attributes.

The noun, verb, and particle frames have morphological, syntactical, and semantic slots. The values of these slots are themselves frames. A slot is represented by its name, the possible values it may take, and the method by which it may be filled or retrieved. The possible values a slot may take are either enumerated values or a frame. In the case of enumerated values, the slot may be multi-valued; e.g. the possible values of the slot "type of the agent" are (inconcrete, human, animal, plant, and thing), and the slot can be filled with any combination of these values. We have provided three methods to fill a slot with its value:

• By a default value attached to the slot name in the generic-frame, e.g. the default value of the gender attribute is masculine,
• By filling it during the creation of the instance frame, e.g. overwriting the default value masculine with feminine or with their combination, or
• By attaching a procedure (a so-called demon [WIN 84]) to the slot in the generic-frame, e.g. a when-added demon is attached to the root slot to convert an input stem such as “تعمير” into its root “عمر”.
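The three slot-filling methods above can be illustrated with a minimal Python sketch. It is not the system described in this paper: the class names `Slot` and `GenericFrame`, the dictionary used for an instance, and the demon `root_from_taf3il` are hypothetical, and the demon assumes that the stem follows the template تفعيل, so that the radicals are its second, third and fifth letters.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional


@dataclass
class Slot:
    name: str
    allowed: FrozenSet[str] = frozenset()      # enumerated values; empty = unrestricted
    default: Optional[FrozenSet[str]] = None   # method 1: default kept in the generic-frame
    when_added: Optional[Callable[[str], str]] = None  # method 3: attached demon


class GenericFrame:
    def __init__(self, name, slots):
        self.name = name
        self.slots = {slot.name: slot for slot in slots}

    def instantiate(self, stem, **given):
        """Create an instance frame (a plain dict here) for one stem."""
        instance = {}
        for slot in self.slots.values():
            if slot.when_added is not None:
                # Method 3: the demon computes the value from the stem.
                instance[slot.name] = slot.when_added(stem)
            elif slot.name in given:
                # Method 2: value supplied at instance creation (may be multi-valued).
                chosen = frozenset(given[slot.name])
                if slot.allowed and not chosen <= slot.allowed:
                    raise ValueError(f"illegal value for slot {slot.name!r}")
                instance[slot.name] = chosen
            else:
                # Method 1: inherit the default from the generic-frame.
                instance[slot.name] = slot.default
        return instance


def root_from_taf3il(stem):
    """Hypothetical demon: keep letters 2, 3 and 5 of a stem on the template تفعيل."""
    return stem[1] + stem[2] + stem[4]


noun = GenericFrame("noun", [
    Slot("gender", frozenset({"masculine", "feminine"}), frozenset({"masculine"})),
    Slot("root", when_added=root_from_taf3il),
])
by_default = noun.instantiate("تعمير")
overridden = noun.instantiate("تعمير", gender={"masculine", "feminine"})
```

Here `by_default` takes the masculine default from the generic-frame, `overridden` replaces it with a combination of values, and in both cases the root slot is filled by the demon rather than by the user.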

Instances of the above generic-frame are created by a special acquisition routine that uses it to generate questions to the user. The first question asks the user to enter a stem entry of the lexicon. The system searches for it. If it is found, the system returns the attributes of its frame for verification; otherwise, a set of queries is generated to acquire the attributes of the stem. It is interesting to mention that if two stems have the same letters but differ in meaning, another entry is created. It is actually a new stem, which will be stored in another class, e.g. مِن – منّ، تاجَر – تَاجر، هُم – همّ.

All the stems having the same attributes are linked together, filling the second slot of an instance of the generic-frame. At the same time, all the stems in the lexicon are stored in a lexically ordered binary tree. In this way, the attributes of a specific stem can be accessed quickly. All the stems sharing the same orthography are linked together, e.g. هُم and همّ, so that we can access their attributes. The acquisition routine takes care of any instances that have already been created, so that no instance is duplicated. It takes into consideration the inheritance of all possible common values from the generic-frame and/or from the sibling instance frame.
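As a rough illustration of the ordered binary tree with linked homographs, the following Python sketch keeps one node per undiacritized stem and a list of entries per node. The names `StemNode` and `StemTree`, the English sense labels, and the class identifiers 7, 12 and 3 are all hypothetical, and plain string comparison is used as a stand-in for lexical ordering.

```python
class StemNode:
    def __init__(self, stem):
        self.stem = stem
        self.entries = []      # one (sense, class id) pair per homograph
        self.left = None
        self.right = None


class StemTree:
    """Binary search tree keyed on the undiacritized stem."""

    def __init__(self):
        self.root = None

    def add(self, stem, sense, class_id):
        if self.root is None:
            self.root = StemNode(stem)
        node = self.root
        while node.stem != stem:
            branch = "left" if stem < node.stem else "right"
            child = getattr(node, branch)
            if child is None:
                child = StemNode(stem)
                setattr(node, branch, child)
            node = child
        # Stems sharing the same orthography are chained on the same node.
        node.entries.append((sense, class_id))

    def find(self, stem):
        node = self.root
        while node is not None and node.stem != stem:
            node = node.left if stem < node.stem else node.right
        return [] if node is None else node.entries


tree = StemTree()
tree.add("هم", "they", 7)      # هُم
tree.add("هم", "worry", 12)    # همّ
tree.add("من", "from", 3)      # مِن
```

A lookup of "هم" then returns both entries, each pointing to its own class, while a stem that was never entered returns an empty list.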


3. IMPLEMENTATION

The lexicon is a very important component of any NLP system. In section 3.1, a detailed discussion is made concerning the reasons for which we decided to use a frame-based lexicon. It is important to note that the lexicon can be viewed from two angles. The first view treats the lexicon as a generic-frame with its instances. The generic-frame contains two slots: the first is filled by a noun, verb, or particle frame, and the second is filled with the stems sharing the same attribute values. This generic-frame is sometimes called a schema [TAN 87]; it simply specifies which slots any instance frame must have. The second view treats the lexicon as a store of stems kept in a certain order, such that all the attributes of a specific stem can be retrieved. This view is used by the lexical or morphological analyzer to recognize the stem part of an inflected word. In effect, the lexicon contains not only information about the stem but also some lexical knowledge concerning it.

The first subsection discusses in detail how the lexicon is implemented. The second subsection presents the incremental building process of the lexicon and the tool built to acquire knowledge from the user. The maintenance of the information and knowledge stored in the lexicon is a complicated task; therefore, a special tool was designed and implemented, which is introduced in the third subsection.

3.1 Lexicon Structure
The internal representation was our first problem in building the frame-based lexicon. It must take into account different factors related to:

• Storing different stems and attributes.
• Facilitating the function of an interpreter that can use this lexicon as a knowledge base and draw certain conclusions.
• Acquiring knowledge from the user.
• Maintaining the knowledge stored in the lexicon.
• Preserving a taxonomy of words according to their attributes.

As mentioned above, the lexicon consists of a generic-frame and instances. The generic-frame is conceptually a static part of the lexicon, as the attributes of any stem can be defined by linguists and Arabic specialists. The frame is represented as a linked list on a file. Each node of this list is a record holding the name of a frame, an attribute name, and a value or a demon. The frame file refers to four other files:

• STEM file: This file contains stems grouped into classes, where each class belongs to an instance of the generic-frame. Each class is represented by a linked list of stems. The lexical analyzer views this file as a lexically ordered binary tree, and the matching routine uses it to search for a given stem in the lexicon.
• WAZEN file: A record of this file contains the letters of the template (Wazen, “الوزن”) and the corresponding diacritic sign of each letter.
• ROOT file: This file is used by the when_added or when_needed procedure attached to the slot corresponding to the root, to get the root of the input stem. It is also used by the procedure attached to the slot corresponding to the singular, to convert a broken plural stem to its singular form on a when_added or when_needed basis. In effect, a record of this file contains a field describing the relation between the stem letters and the radical letters, and another field indicating the diacritic sign of each radical letter.
• NAMES file: This file contains frame, attribute, value, and demon names. Consequently, any occurrence of a name in a frame is represented by the index of that name in this file.
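The role of the NAMES file, where every name is stored once and frame records refer to it by index, can be sketched in Python as follows. The class `NamesFile`, its methods, and the dictionary layout of the records are hypothetical; the actual record layout is the one shown in Figure 2.

```python
class NamesFile:
    """In-memory stand-in for the NAMES file: each distinct name is stored once."""

    def __init__(self):
        self._names = []
        self._index = {}

    def intern(self, name):
        """Return the index of `name`, appending it on first use."""
        if name not in self._index:
            self._index[name] = len(self._names)
            self._names.append(name)
        return self._index[name]

    def name_at(self, index):
        return self._names[index]


names = NamesFile()
# Two frame records that mention the same attribute name share one NAMES entry.
record_a = {"kind": "attribute", "name_index": names.intern("الجنس")}
record_b = {"kind": "attribute", "name_index": names.intern("الجنس")}
record_c = {"kind": "value", "name_index": names.intern("مؤنث")}
```

Storing an index instead of the name itself keeps the frame records at a fixed size and avoids repeating long attribute names in every instance.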


3.2 An Example Showing the Lexicon Structure
The following example demonstrates how the generic-frame and its instances are implemented. Suppose we have the stem “أنفس” (souls). The instantiation of its frame is given in Figure 1, and the internal representation of this knowledge is given in Figure 2.

Stem: أنفس (souls)
Instance name: أسم (noun)
  Instance name: خصائص صرفية (morphological properties)
    Attribute name: الجنس (gender)
      value: مذكر (masculine)
      value: مؤنث (feminine)
    Attribute name: العدد (number)
      value: جمع (plural)
    Attribute name: التعريف (definiteness)
      value: نكرة (indefinite)
    Attribute name: جامد (concrete)
      value: مصدر (infinitive)
    Demon: الوزن (the template)
      وزن الكلمة هو أفعل (the template of the word is أفعل)
      تشكيل الحرف أ هو الفتحة (the diacritic of the letter أ is fatha)
      تشكيل الحرف ن هو السكون (the diacritic of the letter ن is sukun)
      تشكيل الحرف ف هو الفتحة (the diacritic of the letter ف is fatha)
      تشكيل الحرف س هو حسب الحالة الأعرابية (the diacritic of the letter س depends on the grammatical case)
    Demon: المفرد (the singular)
      المفرد هو نفس (the singular is نفس)
      تشكيل الحرف ن هو السكون (the diacritic of the letter ن is sukun)
      تشكيل الحرف ف هو الفتحة (the diacritic of the letter ف is fatha)
      تشكيل الحرف س هو حسب الحالة الأعرابية (the diacritic of the letter س depends on the grammatical case)
  Instance name: خصائص دلالية (semantic properties)
    Attribute name: أسم حي (animate noun)
      value: أنسان (human)
      value: حيوان (animal)
      value: نبات (plant)

Figure 1 Instantiation of the Stem Frame

The frame is an instance of the noun frame slot in the generic-frame. The instance frame of this stem is found in records # 1000 through 1011 and records # 880 through 884 in the frame file. The description of each of these records is as follows:


Rec# | Tag | Type frame link | Word class ptr. | Count | Property link | Word type link | Name
… | … | … | … | … | … | … | …
880 | 5 | 0 | 0 | 4 | 980 | 0 | خصائص دلالية
881 | 1 | 0 | 0 | 3 | 0 | 0 | أسم حي
882 | 2 | 0 | 0 | 0 | 0 | 883 | انسان
883 | 2 | 0 | 0 | 0 | 0 | 884 | حيوان
884 | 2 | 0 | 0 | 0 | 0 | 0 | نبات
… | … | … | … | … | … | … | …
1000 | 4 | 1 | 60 | 2 | 17 | 980 | أسم
1001 | 5 | 0 | 0 | 11 | 900 | 0 | خصائص صرفية
1002 | 1 | 0 | 0 | 2 | 0 | 1004 | الجنس
1003 | 2 | 0 | 0 | 0 | 0 | 0 | مؤنث
1004 | 1 | 0 | 0 | 1 | 0 | 1006 | العدد
1005 | 2 | 0 | 0 | 0 | 0 | 0 | جمع
1006 | 1 | 0 | 0 | 1 | 0 | 1007 | التعريف
1007 | 1 | 0 | 0 | 1 | 0 | 0 | جامد
1008 | 2 | 0 | 0 | 0 | 0 | 0 | مصدر
1009 | 3 | 0 | 0 | 0 | 0 | 121 | الوزن
1010 | 3 | 0 | 0 | 0 | 0 | 130 | المفرد
1011 | 5 | 0 | 0 | 4 | 980 | 880 | خصائص دلالية
… | … | … | … | … | … | … | …

Figure 2 The internal representation of the stem أنفس (souls) in the frame-based lexicon when the semantic frame already exists in another noun instance

1. Record# 1000: This is the starting record of an instance of a noun frame. It keeps the name of the noun type frame. This instance has the sibling link value 980, i.e. its sibling instance is found at record# 980, and the body (attributes, values, etc.) of this instance is 17 records long. This instance has two slots: the possible values of the first slot are found at record# 1001 through 1010, and the possible values of the second slot are found at record# 880 through 884. The stem “أنفس” (souls) is found in the STEM file at record# 60. The instance has the sub-set link value 1 in the taxonomical structure of the frame-based lexicon, which means that this instance is a subset of the generic-frame, a noun frame in this case.
2. Record# 1001: is the starting record of an instance of a morphological frame. The body of this instance is stored in the following records (record# 1002 through 1010). This instance has the sibling link value 900, i.e. the sibling instance is found at record# 900. The body of this instance is 11 records long.
3. Record# 1002: is an attribute name, which is “الجنس” (gender). It has two values.
4. Record# 1003: is the first value of the attribute “الجنس” (gender). This value is “مؤنث” (feminine). As it is a multi-valued attribute, it can have another value. This is indicated by assigning “0” to the next value link, which means that the second value is inherited from the default value of the generic-frame, “مذكر” (masculine) in this case.
5. Record# 1004: is an attribute name. This attribute is “العدد” (number), which has one value.
6. Record# 1005: is the value of an attribute. It is “جمع” (plural).
7. Record# 1006: is an attribute name. This attribute is “التعريف” (definiteness). It has one value, which is inherited from the noun generic-frame, where it is marked as the default value, “نكرة” (indefinite) in this case.
8. Record# 1007: is an attribute name. It is “جامد” (concrete), which has one value.
9. Record# 1008: is an attribute's value, which is “مصدر” (infinitive).
10. Record# 1009: is a demon. The demon name is “الوزن” (the template), and this demon uses the template pattern and diacritic signs stored at record# 121 in the WAZEN file.
11. Record# 1010: is a demon, which has the name “المفرد” (the singular). This demon applies the stored knowledge about getting the singular form and its diacritic signs, found at record# 130 in the ROOT file.
12. Record# 1011: is the starting record of an instance of a semantic frame. The body of this instance is found at record# 880 through 884. The sibling link of this instance is 980. The body of this instance is four records long.

The above example assumes that the instance of the semantic frame already exists in another noun instance. This representation shows how redundancy is avoided and how inheritance from another frame is implemented. If this instance of the semantic frame did not already exist, the noun instance of the stem “أنفس” (souls) would have been created as illustrated in Figure 3. The difference between the two situations is that the semantic instance in Figure 3 is followed by its body instead of being inherited from another instance, as depicted in Figure 2.

Rec# | Tag | Type frame link | Word class ptr. | Count | Property link | Word type link | Name
… | … | … | … | … | … | … | …
1000 | 4 | 1 | 60 | 2 | 17 | 980 | أسم
1001 | 5 | 0 | 0 | 11 | 900 | 0 | خصائص صرفية
1002 | 1 | 0 | 0 | 2 | 0 | 1004 | الجنس
1003 | 2 | 0 | 0 | 0 | 0 | 0 | مؤنث
1004 | 1 | 0 | 0 | 1 | 0 | 1006 | العدد
1005 | 2 | 0 | 0 | 0 | 0 | 0 | جمع
1006 | 1 | 0 | 0 | 1 | 0 | 1007 | التعريف
1007 | 1 | 0 | 0 | 1 | 0 | 0 | جامد
1008 | 2 | 0 | 0 | 0 | 0 | 0 | مصدر
1009 | 3 | 0 | 0 | 0 | 0 | 121 | الوزن
1010 | 3 | 0 | 0 | 0 | 0 | 130 | المفرد
1011 | 5 | 0 | 0 | 4 | 980 | 0 | خصائص دلالية
1012 | 1 | 0 | 0 | 3 | 0 | 0 | أسم حي
1013 | 2 | 0 | 0 | 0 | 0 | 1014 | انسان
1014 | 2 | 0 | 0 | 0 | 0 | 1015 | حيوان
1015 | 2 | 0 | 0 | 0 | 0 | 0 | نبات
… | … | … | … | … | … | … | …

Figure 3 The internal representation of the stem أنفس (souls) in the frame-based lexicon where the semantic frame is created

The above examples show that frames may be linked in the frame-based lexicon in various ways:

a) A slot in one frame might be filled by another frame, e.g. the morphological frame in Figure 2 and Figure 3, or by a pointer to another frame if it already exists, e.g. the semantic frame in Figure 2.
b) Frames may be linked in a taxonomical structure, e.g. the parent/child link between a generic frame and its instance represents a subset relationship. The subset frame may inherit the generic frame's default values or override them.
c) Frames which have the same parent may be linked together by a sibling link, representing subsets of the same parent. In our design, a subset frame may inherit all the values of its sibling as one package.
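The three kinds of links listed above can be sketched in Python as follows. This is only an illustration under stated assumptions: the class `Frame`, the lookup order (own slots, then sibling, then parent), and the contents of the sibling instance at record 980 are hypothetical, and the sketch assumes that sibling links do not form a cycle.

```python
class Frame:
    def __init__(self, name, parent=None, sibling=None, slots=None):
        self.name = name
        self.parent = parent      # link (b): generic frame this one is a subset of
        self.sibling = sibling    # link (c): another subset of the same parent
        self.slots = dict(slots or {})

    def get(self, slot):
        """Look up a slot locally, then in the sibling, then in the parent."""
        if slot in self.slots:
            return self.slots[slot]
        for source in (self.sibling, self.parent):
            if source is not None:
                try:
                    return source.get(slot)
                except KeyError:
                    continue
        raise KeyError(slot)


generic_noun = Frame("noun", slots={
    "gender": {"masculine"},
    "definiteness": {"indefinite"},
})
# Link (a): one semantic frame object is shared by pointer between two instances.
semantic = Frame("semantic", slots={"animate noun": {"human", "animal", "plant"}})
older = Frame("noun@980", parent=generic_noun,
              slots={"number": {"plural"}, "semantic": semantic})
anfus = Frame("noun@1000", parent=generic_noun, sibling=older,
              slots={"gender": {"masculine", "feminine"}, "semantic": semantic})
```

In this sketch `anfus` overrides the gender default, inherits definiteness from the generic frame and number from its sibling, and shares the semantic frame with the older instance instead of storing a second copy of it.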

This structure was introduced initially to store words and their attributes efficiently, using the inheritance mechanism of frames as a knowledge representation scheme. It can also speed up access to the set of attributes comprising an instance of a frame.

3.3 Creating the Frame-Based Lexicon
There are three methods [FRO 86] generally used for acquiring knowledge in a knowledge-based system:
a) Transcription from a text,
b) Elicitation from experts, and
c) Induction from examples.
We decided to use method (b) to acquire lexical knowledge for the proposed frame-based lexicon. The generic-frame is used to generate questions to the user, and instances of the generic-frame are created from the answers. The inheritance of the default value of an attribute is taken into consideration while acquiring attribute values from the user. It is worth noting that, during the elicitation of the attributes of a stem, if a when-added demon is attached to an attribute, this demon is activated to fill the value of the slot. In our case there are two demons of this type: the first deals with the diacritic signs of the stem, and the second fills the slot with a pattern, i.e. a sequence of actions, encoding the method of getting the root of the stem, or the singular form in the case of a broken (irregular) plural. The values stored in these two slots are thus methods rather than simple values, which is why two when-added demons are attached to them.
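The elicitation process just described can be sketched in Python as a routine that walks over the generic-frame, asks one question per attribute, and falls back on the default when the user gives no answer. The function `acquire`, the dictionary layouts, and the placeholder stems `stem-1` and `stem-2` are hypothetical; the sketch also leaves out demons and the handling of stems that share orthography but differ in meaning.

```python
def acquire(stem, generic_frame, classes, ask):
    """Return the attributes of `stem`, eliciting them from the user if needed.

    generic_frame: {attribute: {"values": [...], "default": value}}
    classes:       {attribute tuple: [stems sharing exactly these attributes]}
    ask:           callable taking a question string and returning the answer
    """
    for attributes, stems in classes.items():
        if stem in stems:
            return dict(attributes)            # already stored: nothing to ask
    answers = {}
    for attribute, spec in generic_frame.items():
        question = (f"{attribute} of {stem}? "
                    f"{'/'.join(spec['values'])} [default: {spec['default']}] ")
        reply = ask(question).strip() or spec["default"]   # empty reply = default
        if reply not in spec["values"]:
            raise ValueError(f"{reply!r} is not a legal value of {attribute}")
        answers[attribute] = reply
    # Stems with identical attributes end up in the same class (no duplication).
    key = tuple(sorted(answers.items()))
    classes.setdefault(key, []).append(stem)
    return answers


noun_frame = {
    "gender": {"values": ["masculine", "feminine"], "default": "masculine"},
    "definiteness": {"values": ["indefinite", "definite"], "default": "indefinite"},
}
classes = {}
replies = iter(["feminine", "", "feminine", ""])   # scripted user answers


def scripted(question):
    return next(replies)


first = acquire("stem-1", noun_frame, classes, scripted)
second = acquire("stem-2", noun_frame, classes, scripted)
again = acquire("stem-1", noun_frame, classes, scripted)   # found, no questions
```

In an interactive session the built-in `input` function could be passed as `ask`; the scripted replies are used here only to keep the example self-contained.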


3.4 Lexicon Maintenance
To make the frame-based lexicon easy to maintain, a procedure was developed that lets one add or delete any attribute value in an existing instance. The stem whose attributes are to be modified is entered. Its attribute values are then displayed, and the user is asked whether s/he wants to modify them. If a modification is to take place, the tool uses the generic-frame to generate the possible values of the slot in the same way as in the acquisition phase. When the session terminates, a new temporary instance is created. The same action taken during the elicitation process is then repeated to create a new instance, if this instance has not already been created. Finally, the stem is removed from the class slot in which it had been kept.

4. EXAMPLES OF UTILIZATION AND TESTING

To test the system on a sample of stems, we took an article chosen at random from a daily newspaper. The selected article appeared in the column “Mawakif” (مواقف) by the famous Egyptian writer Anis Mansour, published in the ALAHRAM newspaper on 18th March, 1988. The following steps were followed to test the system:

• Getting the morphological, syntactic and semantic attributes of the stems from a linguist.
• Creating the lexicon using our knowledge acquisition and maintenance procedures.
• Running a lexical analyzer [SHA 89] that takes its input from a file on which the text is stored, and outputs the analysis of the inflected word using the lexicon, or retrieves the stem's attributes.


4.1 Lexicon Creation
The lexicon was built using the knowledge acquisition tool. In the acquisition process, the user is asked to enter the stem. If the stem already exists, the tool displays its attributes and asks the user for verification. If the user wants to enter different attributes for a stem that has already been stored (same orthography but different meaning), the system proceeds as normal. To minimize acquisition time, the system asks the user whether s/he remembers a stem having the same attributes; if so, the user is asked to enter that stem, and the new stem is added to its class. For morphological attributes there are two demons (procedures attached to a slot or attribute): the Wazen (الوزن) demon and the root/singular demon. The Wazen demon is activated when the user is to fill the wazen (template) of a stem into its corresponding slot; it asks the user to enter the diacritic sign of each stem letter. The root/singular demon is activated to guide the user in entering the method of getting the root/singular from the currently entered stem.

4.2 Running the System
After the lexicon has been created, the selected text sample is entered into the system for testing. The output for three words from the selected text, a noun, a verb, and a particle, is given in Appendix (A).

4.3 Examination of the Frame-based Lexicon
The lexicon contains 170 stem entries. These are the stems extracted from the 328-word essay used as a test. Table 1 shows the distribution of these stems according to word type (i.e. noun, verb, and particle). As mentioned before, only the attributes necessary for word understanding are stored in the lexicon. As a result, for some stems a certain class of attributes is not defined at this level. Therefore, Table 1 also gives the count of stems having morphological, syntactical, and semantic attributes. This does not represent any restriction on our system, as these attributes may be added at any time. Our purpose in building the lexicon in this way is to introduce a method of constructing and representing a lexical knowledge base in an efficient manner.

Table 1 Distribution of stems according to their types and attributes

Category | Count
Total number of stems | 170
Number of Nouns | 115
Number of Nouns with Morphological Attributes | 115
Number of Nouns with Syntactical Attributes | 18
Number of Nouns with Semantic Attributes | 51
Number of Verbs | 35
Number of Verbs with Morphological Attributes | 35
Number of Verbs with Semantic Attributes | 35
Number of Particles | 20

Examining Table 2, we can see that some stems share the same attributes, since the number of classes for each word type is smaller than the number of stems of that type in the lexicon. This sharing leads to a saving in storage, which will become more apparent as the size of the lexicon increases over time.

Table 2 Distribution of stem attributes according to their classes in the frame-based lexicon

Category | Count
Number of Noun Classes | 102
Number of Noun Morphological Classes | 102
Number of Noun Syntactical Classes | 17
Number of Noun Semantic Classes | 41
Number of Verb Classes | 34
Number of Verb Morphological Classes | 34
Number of Verb Semantic Classes | 34
Number of Particle Classes | 19
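The amount of sharing can be read off by subtracting the class counts of Table 2 from the stem counts of Table 1. The short Python calculation below does this per word type; the variable names are illustrative, and it assumes that the verb entry of Table 2 (34) is the number of verb classes.

```python
stems_per_type = {"noun": 115, "verb": 35, "particle": 20}      # Table 1
classes_per_type = {"noun": 102, "verb": 34, "particle": 19}    # Table 2

# Stems that did not need a class of their own because they joined an existing one.
saved = {word_type: stems_per_type[word_type] - classes_per_type[word_type]
         for word_type in stems_per_type}
total_stems = sum(stems_per_type.values())
total_classes = sum(classes_per_type.values())
print(saved, total_stems, total_classes)
```

Under this reading, 15 of the 170 stems (13 nouns, 1 verb, and 1 particle) reuse a class created for another stem, so 155 instance frames are enough to describe the whole sample.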

In the meantime, representing the lexical knowledge in this way enables linguists to carry out further studies of the characteristics of the Arabic language. All stems having the same attributes are easily located in the frame containing them. This structure of the lexicon also permits queries about stems having the same morphological, syntactic, and/or semantic attributes. This is a very versatile area of study and research.

5. CONCLUSION AND FUTURE WORK

5.1 Conclusion
The stems and their lexical knowledge are stored in a frame-based lexicon. The frame representation was found very useful for implementing the lexicon, as stems having the same attributes share the same frame instance. Consequently, cognitive economy is achieved, and intersection search can easily be implemented later. The facility to attach methods to a frame slot was found very suitable for representing the Wazen (الوزن) of a word in a general way. It was also found very suitable for representing the rule for getting the root of a stem. We also used this method to encode the rule for getting the singular from a broken (irregular) plural. This work has succeeded in establishing a sound basis for the methodology to be used in representing an ontology of lexical knowledge of Arabic words, which is an essential element of any natural language processing system.

5.2 Future Work
We think the frame representation is a good method for storing lexical knowledge. More work is needed to use this method for storing all the lexical knowledge of the Arabic language for the sake of understanding and synthesis tasks. An interpreter can later be implemented for a structured query language to be used by linguists and computer scientists to get, for example:

• A set of stems having certain attributes scattered in different instances.
• The appropriate pronunciation of a certain stem, by displaying the stem augmented with diacritic signs.
• The root of a certain word, either a verb or a derivative.
• Derivative nouns from a certain root, on condition that the lexicon is augmented with the attributes necessary for synthesis.
• Common attributes of a set of stems.
• The method used to get the root of a certain stem.

Defining and designing such a language needs more study and investigation. The work presented here will help any intelligent system that uses the Arabic language, such as automatic translation, user interfaces to databases, or other learning-based computer systems for the Arabic language.

REFERENCES

[BRO 87] BROWN, F., “The Frame Problem in Artificial Intelligence”, Proceedings of 1987 Workshop, Morgan Kaufmann Inc., California, 1987.
[FIK 85] FIKES, R., KEHLER, T., “The Role of Frame-based Representation in Reasoning”, CACM, 28(9):904-920, 1985.
[FRO 86] FROST, R., “Introduction to Knowledge Based Systems”, Readings in Knowledge Representation, Morgan Kaufmann Publication Inc., California, 245-262, 1986.
[PUS 87] PUSTEJOVSKY, J., BERGLER, S., “The Acquisition of Conceptual Structure for the Lexicon”, Natural Language, 566-570, 1987.
[RAF 86] RAFEA, AHMED, RAFEA, AISHA, “Building a Syntactic and Semantic Knowledge Base for Arabic Words”, The Egyptian Computer Journal, 14(2):88-104, Cairo University, Egypt, 1986.
[SHA 89] SHAALAN, KHALED, “A Knowledge Base System for Understanding an Inflected Arabic Word”, MSc. Thesis, Institute of Statistical Studies and Research, Cairo University, 1989.
[TAN 87] TANIMOTO, S., “The Elements of Artificial Intelligence: An Introduction Using LISP”, Computer Science Press Inc., 1987.
[WAT 86] WATERMAN, D., “A Guide to Expert Systems”, Addison Wesley Publishing Company, 1986.
[WIN 84] WINSTON, P. H., “Artificial Intelligence”, Addison Wesley Publishing Company, 1984.

APPENDIX (A)

A.1 Noun

الكلمة العربية هي: الغربيه (the Arabic word)
عماد الكلمة هو: غرب (the stem of the word)
الخصائص الناتجة من صرف الكلمة (properties resulting from the morphological analysis of the word)
  حروف مضافة من البداية تم حذفها: ال (prefixed letters removed from the beginning)
  ضمير متصل من النهاية تم حذفه: ـه (attached pronoun removed from the end)
  حروف مضافة من النهاية تم حذفها: ي (suffixed letters removed from the end)
Word Frame:

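Instance: أسم (noun)
  Instance: خصائص صرفية (morphological properties)
    Attribute: الجنس (gender)
      Value: مذكر (masculine)
    Attribute: العدد (number)
      Value: مفرد (singular)
    Attribute: التعريف (definiteness)
      Value: نكرة (indefinite)
    Attribute: جامد (concrete)
      Value: مصدر (infinitive)
    Demons الوزن (the template)
      وزن الكلمة هي: فعل (the template of the word)
      تشكيل الكلمة هو: (the diacritization of the word)
        تشكيل الحرف غ هو الفتحة (the letter غ takes fatha)
        تشكيل الحرف ر هو السكون (the letter ر takes sukun)
        تشكيل الحرف ب هو حسب الحالة الأعرابية (the letter ب depends on the grammatical case)
    Demons الجذر (the root)
      الجذر هو غرب (the root is غرب)
      تشكيل غرب هو: (the diacritization of غرب)
        تشكيل الحرف غ هو الفتحة (the letter غ takes fatha)
        تشكيل الحرف ر هو الضمة (the letter ر takes damma)
        تشكيل الحرف ب هو الفتحة (the letter ب takes fatha)
دلالات الإضافات المتعلقة بتصريف الكلمة (meanings of the affixes related to the inflection of the word)
  العدد: مفرد (number: singular)
  الجنس: مؤنث (gender: feminine)
  التعريف: معرف بأل (definiteness: definite with ال)
مدلولات الحروف الداخلة علي الكلمة (meanings of the letters attached to the word)
  العدد: مفرد (number: singular)

A.2 Verb

الكلمة العربية هي: يحتاج (the Arabic word)
عماد الكلمة هو: احتاج (the stem of the word)
الخصائص الناتجة من صرف الكلمة (properties resulting from the morphological analysis of the word)
  حروف مضافة من البداية تم حذفها: ي (prefixed letters removed from the beginning)
  حروف حذفت من الكلمة فأضيفت: إضافة ألف في أول الكلمة بعد حذفها (letters deleted from the word and restored: an alef added at the beginning of the word after its deletion)
Word Frame:
Instance: فعل (verb)
  Instance: خصائص صرفية (morphological properties)
    Attribute: التصرف (conjugability)
      Value: متصرف (conjugable)
    Attribute: تعدي الفعل (transitivity of the verb)
      Value: متعدي (transitive)
      Value: لازم (intransitive)
    Demons الوزن (the template)
      وزن الكلمة هي: افتعل (the template of the word)
      تشكيل الكلمة هو: (the diacritization of the word)
        تشكيل الحرف ا هو الكسرة (the letter ا takes kasra)
        تشكيل الحرف ح هو السكون (the letter ح takes sukun)
        تشكيل الحرف ت هو الفتحة (the letter ت takes fatha)
        تشكيل الحرف ـا هو الفتحة (the letter ـا takes fatha)
        تشكيل الحرف ج هو الفتحة (the letter ج takes fatha)
    Demons الجذر (the root)
      الجذر هو حوج (the root is حوج)
      تشكيل حوج هو: (the diacritization of حوج)
        تشكيل الحرف ح هو الفتحة (the letter ح takes fatha)
        تشكيل الحرف و هو الفتحة (the letter و takes fatha)
        تشكيل الحرف ج هو الفتحة (the letter ج takes fatha)
  Instance: خصائص دلالية (semantic properties)
    Attribute: نوع الفاعل (type of the agent)
      Value: أسم معني (inconcrete)
      Value: انسان (human)
      Value: حيوان (animal)
      Value: نبات (plant)
      Value: شئ (thing)
    Attribute: نوع المفعول به (type of the object)
      Value: كائن حي (living being)
      Value: ليس كائن حي (not a living being)
دلالات الإضافات المتعلقة بتصريف الكلمة (meanings of the affixes related to the inflection of the word)
  الاعراب: مرفوع، منصوب، مجزوم (mood: indicative, subjunctive, jussive)
  الزمن: فعل مضارع (tense: imperfect verb)
مدلولات الحروف الداخلة علي الكلمة (meanings of the letters attached to the word)
  المضارعة (imperfect marker)

A.3 Particle

الكلمة العربية هي: عليه (the Arabic word)
عماد الكلمة هو: علي (the stem of the word)
الخصائص الناتجة من صرف الكلمة (properties resulting from the morphological analysis of the word)
  ضمير متصل من النهاية تم حذفه: ـه (attached pronoun removed from the end)
Word Frame:
Instance: آداة (particle)
  Attribute: نوع (type)
    Value: مكان (place)
    Value: استدراك (rectification)
دلالات الإضافات المتعلقة بتصريف الكلمة (meanings of the affixes related to the inflection of the word)
  العدد: مفرد (number: singular)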