a bilingual eeps ontology model

ENGINEERING SEMANTIC WEB FOR E-COMMERCE BUSINESS INTELLIGENCE: A BILINGUAL EEPS ONTOLOGY MODEL

By Animaw Kerie

A THESIS SUBMITTED TO THE SCHOOL OF GRADUATE STUDIES OF ARBA MINCH UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE IN COMPUTER SCIENCE

January 2014 ARBA MINCH


Acknowledgements

“Knowledge is in the end based on acknowledgement.” Ludwig Wittgenstein

I would like to thank my thesis advisors, Prof. DP Sharma and Mr. B. Thillaieaswaran (Asst. Prof.), for their kind supervision, productive discussions and suggestions, which guided me in my research. I also acknowledge the contributions of Dr. Million Meshehsa of Addis Ababa University and Mr. Amanuel of Vrije Universiteit Amsterdam in shaping the research topic, as well as their fruitful comments. I am grateful to all the staff of the Department of Computer Science, from whom I was fortunate to learn. My special thanks go to Mr. Nebyat Fikru (PG case handler) for his help and serenity during my work. I heartily thank my friends Amare, Binyam, Ageru, Shimelis, Sisay, Yibeltal, Kibreab and Tamirat, who supported me during my hard times. I am deeply grateful to my father for his love, his faith in me and his follow-up. Last on the page but first in my mind, I am so grateful to my wife, Selam Tigistu, who sacrificed a lot to give me my lovely baby Zamanuel (wusaye). I will always remember the words with which she strengthened me when she was in the surgery room.


Dedication

Dedicated to:

My wife Selam Tigistu


Table of Contents

Acknowledgements
Dedication
Table of Contents
List of tables
List of figures
Abstract

CHAPTER ONE
1. INTRODUCTION
1.1. Background
1.2. Motivation
1.3. Statement of the problem
1.4. Objective of the study
1.4.1. General objective
1.4.2. Specific objectives
1.5. Scope and limitation of the study
1.6. Methodology
1.6.1. Literature review
1.6.2. Data set collection
1.6.3. Implementation tools
1.6.4. Testing procedure
1.6.5. Thesis structure

CHAPTER 2
2. REVIEW OF CONCEPTS AND RELATED RESEARCHES
2.1. Semantic Web Engineering
2.1.1. Layered Architecture of the Semantic Web
2.2. Semantic web technologies
2.2.1. Resource Description Framework (RDF)
2.2.2. Web Ontology Language (OWL)
2.2.3. SPARQL Query Language for RDF
2.3. Ontology Engineering and Reengineering
2.3.1. Ontology specification
2.3.2. Ontology design
2.3.3. Ontology Building Methodologies
2.3.4. Ontology development tools and languages
2.4. Ontology localization
2.4.1. Introduction
2.4.2. Trends in modeling multilingualism in ontologies: State of the art
2.4.3. NeOn approach for ontology localization
2.4.4. Performance of Protégé to support multilingual ontologies

CHAPTER 3
CRITICAL ANALYSIS OF E-COMMERCE TECHNOLOGIES AND DOMAIN SPECIFIC ONTOLOGIES
3.1. The E-Commerce domain
3.2. Emerging supportive technologies for e-commerce
3.2.1. Knowledge representation and semantic web technologies
3.3. E-commerce Domain Ontologies
3.3.1. UNSPSC (United Nations Standard Products and Services Code)
3.3.2. unspscOWL
3.3.3. myOntology
3.3.4. eClassOWL
3.3.5. GoodRelations ontology (gr)
3.5. Pragmatic analysis of GoodRelations ontology usage on the web
3.5.1. Conceptual Schema and Pivot Concepts
3.5.2. GoodRelations on the LOD data space
3.5.3. Namespace Analysis in GRDS
3.5.4. The Concept Usage analysis

CHAPTER 4
DESIGN AND DEVELOPMENT OF ETHIOPIAN EXPORT PRODUCTS ONTOLOGY (EEPS)
4.1. Introduction
4.2. The proposed Methodology
4.2.1. Specification phase
4.2.2. Conceptualization phase
4.2.3. Formalization phase
4.3. Implementation phase or Ontology coding
4.4. Bilingual ontology development approach

CHAPTER 5
EVALUATION OF OUR WORK
5.1. Reasoning based on Description Logics
5.2. Extensibility and reusability
5.3. RDF validation
5.4. SPARQL Querying

CHAPTER 6
CONCLUSION AND FUTURE WORKS
6.1. Conclusion
6.2. Future work

Bibliography


List of tables

Table 1.1: Activities and respective tools
Table 3.1: Comparison of major standard vocabularies and ontologies developed for the e-commerce domain
Table 3.2: List of ontologies and their percentage in the GRDS
Table 3.3: Usage of the gr:BusinessEntity class in the GRDS
Table 3.4: Usage of gr:ProductOrService
Table 3.5: GoodRelations Concepts Usage Analysis and their ranks


List of figures

Fig 1.1 Linked open commerce diagram as of December, 2013
Fig 1.2 Ontology multilingualism statistics
Fig 1.3 Scope determination
Figure 2.1 The layered cake model of the semantic web
Fig 2.1 Simple RDF graph
Fig 2.2 The Structure of OWL 2
Fig 2.3 Methontology ontology life-cycle
Fig 2.4 Ontology localization process of a NeOn approach
Fig 2.5 Bilingual pizza ontology on Protégé 4.3 interface
Fig 3.1 Divergent technology entities to support e-commerce
Fig 3.2 Linked open product project derivation scenario
Fig 4.1 The Proposed methodology
Fig 4.2 Partial OntoGraphical-hierarchical view of concepts
Fig 4.3 EEPS Classes
Fig 4.4 Description of the eeps:LivestockAndMeat class
Fig 4.4 EEPS object properties
Fig 4.5 EEPS data type properties
Fig 4.6 Data type property assertion for Afework International Group
Fig 4.7 UML Class Diagram for EEPS
Fig 5.1 Classification result
Fig 5.3 RDF validation
Fig 5.4 SPARQL Querying members or elements of a class

Abstract

E-commerce is a major factor in the digital economy worldwide. Ethiopia is a country rich in exportable products and services, but it has no standardized, structured and harmonized e-commerce system. No semantic retailer website or e-business portal exists that adopts a universal, standardized vocabulary or ontology. The Semantic Web is the future version of the web, intended to amplify knowledge sharing via an open system, and ontologies are its key technology. To automate e-business with semantic web technologies, domain-specific ontologies for product offerings on the Web must be developed and used for annotation. This thesis presents a consistent ontology intended to improve the visibility of product information to the latest search engines, agents and recommender systems. The ontology was constructed by gathering and analyzing information about export commodities on the web. The Ethiopian Export Products and Services ontology (EEPS) was modeled as part of this thesis effort. It is derived from the existing heterogeneous product classifications using the GenTax algorithm and is designed as a GoodRelations-compliant ontology for the vertical industry. Since the existing standard ontology for web offers (GoodRelations) is generic in nature, this thesis shows how to improve search engine optimization (SEO), web visibility and semantic interoperability of Ethiopian export products and services on the web. EEPS combines the concepts, classes, object properties, data properties, relationships, instances and axioms of the domain; inferencing with a description logic (DL) mechanism is modeled to enable a fully fledged semantic e-commerce framework.

Keywords: RDF, bilingual ontology, GoodRelations, Semantic Web, OWL, Protégé, SPARQL


CHAPTER ONE

1. INTRODUCTION

1.1. Background

This chapter describes the background of our work: the selected case study, the current web, semantic web technologies, the history of knowledge representation, and the evolution of the e-commerce domain with respect to these technologies.

According to Sir Tim Berners-Lee, the growth of the World Wide Web (WWW) has two phases, each known by several names in the web community. The first phase is called the “web of documents”, the “human-processable web”, the “syntactic web” or “the current web”, while the second phase is called the “web of data”, the “machine-processable web”, the “semantic web” or “the future web”. The World Wide Web Consortium estimated that web reasoning, the final improvement of the second phase, will be realized in 2040 G.C.

Many scholars have scrutinized the current web and identified its limitations, which fall into three general categories: no parametric search, unlinked data, and limited ability to reuse data. The first limitation makes the current web a bottleneck when searching for a relevant document: a search returns much unrelated information. The problem is magnified when the search terms are multilingual. The reason is the weak data access methods and data description used in the traditional web, which relies on keyword-based search (Sabou, 2006). These techniques provide relatively high recall (all web sites that mention a given keyword are retrieved) but low semantic recall (pages about the desired topic that do not contain the keywords are ignored). Their precision is low because only a few of the retrieved pages contain the information that the user needs. As a result, logically complex queries become hard to answer. For example, the query “educational institutes in Ethiopia with at least two partners” will return hundreds of hits about education or about Ethiopia.
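The precision and recall behavior of keyword-based search described above can be illustrated with a minimal Python sketch. The four-document corpus, the relevance labels and the function names are assumptions made purely for illustration; they are not data or code from this thesis.

```python
import re

# Hypothetical mini-corpus: document id -> (text, relevant to the information need?)
# The relevance labels are assumptions made for this illustration.
CORPUS = {
    "d1": ("Education news from Ethiopia", False),
    "d2": ("Ethiopia education institutes and their partners", True),
    "d3": ("Arba Minch University partner institutions", True),
    "d4": ("History of education in Ethiopia", False),
}


def tokenize(text):
    """Lower-case the text and split it into a set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_search(keywords, corpus):
    """Return the ids of documents that contain every keyword as a token."""
    wanted = {k.lower() for k in keywords}
    return {doc_id for doc_id, (text, _) in corpus.items()
            if wanted <= tokenize(text)}


def precision_recall(retrieved, corpus):
    """Compute precision and recall of a retrieved set against the labels."""
    relevant = {doc_id for doc_id, (_, is_rel) in corpus.items() if is_rel}
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


retrieved = keyword_search(["education", "Ethiopia"], CORPUS)
precision, recall = precision_recall(retrieved, CORPUS)
print(sorted(retrieved), precision, recall)
```

In this toy corpus the keyword match retrieves d1, d2 and d4, of which only d2 is relevant, while the relevant document d3 is missed because it contains neither keyword. This is the pattern the text describes: pages that mention the keywords are returned whether or not they answer the question, and on-topic pages that use different words are ignored.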
For such a query, it is also unclear which keywords to use: “school”, “university” or some other term? The exact answer will not be retrieved, because answering it requires semantic search. As another example, suppose we query a search engine using the three terms “book”, “about” and “hotel”. It is clear to a human reader that we want a book about hotels, but the search engine displays results related to hotel booking. Search precision deteriorates as more natural language is used in the query. The Amharic equivalent of the first example query, “ቢያንስ ሁለት መስተጋብር ያላቸው የኢትዮጵያ ትምህርት ተቋማት”, returns fewer than 20 hits. The main reason for the inaccuracy of these searches is the inability to understand the context of words and the relationships between search terms. If the search engine understood the meaning of each word, and if the corpus were structured, the search results would be more accurate. This is one of the goals of the semantic web.

The second limitation of the existing web is that web data is not linked. More than 95% of web pages on the WWW are not linked to each other on the semantic web. If data is linked on the semantic web, organizations can work with every other business and organization in other sectors or regions, all sharing common and open data using currently available structured methods (Google Metaweb: Semantic Open Linked Data Boost, 2010). RDF provides the foundation for publishing and linking data (Semantic web, 2014). The DBpedia knowledge base (DB refers to database) is part of the linked data cloud; it is the structured-information version of Wikipedia, the largest knowledge source of mankind. Version 3.9 of DBpedia was released in September 2013. Its dataset describes four million entities, of which 3.22 million are classified in a consistent ontology, including 832,000 people, 639,000 places, 116,000 music albums, 78,000 films, 18,500 video games, 209,000 organizations, 226,000 species and 5,600 diseases. It is available in around 119 different languages.

Of specific interest to this research is the open data version of commerce, linked open commerce, whose architecture is based entirely on open linked data and which is designed to interconnect global e-commerce. Linked open commerce (LOC) is a community effort to collate, cleanse, consolidate, and augment structured e-commerce information from the World Wide Web, and to make this information available via a SPARQL endpoint (Kingsley Idehen, 2009). The objective of LOC is to encompass and integrate data about (Kingsley Idehen, 2009):

• Companies and persons ("Business Entities")
• Product features ("Datasheets")
• Offers, prices, payment options, and all other commercial aspects of supply
• Demand data, e.g. public tenders, requests for quotations, or private wish lists
• Store and service locations including their geo position and opening hours

In line with the commitment of this thesis, the second of the five objectives above, product features, receives the most attention. Since most product information is open or public, developing an ontology that joins the LOC platform as a linked element is a research gap to be filled.

Fig 1.1 Linked open commerce diagram as of December, 2013
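The kind of structured, linked e-commerce data that LOC aims to expose through a SPARQL endpoint can be illustrated with a small in-memory sketch. It is a simplification under stated assumptions: triples are plain Python tuples rather than real RDF, the `ex:` resources are hypothetical, the `gr:` names are taken from the GoodRelations vocabulary, and the `match` helper only imitates a single SPARQL triple pattern.

```python
# Toy triple store. Prefixed names are written as plain strings; the "ex:"
# resources are hypothetical, and the "gr:" names follow the GoodRelations
# vocabulary (gr:BusinessEntity, gr:Offering, gr:offers, gr:includes).
TRIPLES = [
    ("ex:ExporterA", "rdf:type", "gr:BusinessEntity"),
    ("ex:ExporterA", "gr:offers", "ex:CoffeeOffer"),
    ("ex:CoffeeOffer", "rdf:type", "gr:Offering"),
    ("ex:CoffeeOffer", "gr:includes", "ex:WashedCoffee"),
    ("ex:BloggerB", "ex:mentions", "ex:WashedCoffee"),
]


def match(pattern, triples):
    """Return the triples matching (subject, predicate, object); None is a wildcard."""
    return [t for t in triples
            if all(want is None or want == got for want, got in zip(pattern, t))]


def offers_by_business_entities(triples):
    """Imitate: SELECT ?s ?o WHERE { ?s a gr:BusinessEntity ; gr:offers ?o }"""
    entities = {s for s, _, _ in
                match((None, "rdf:type", "gr:BusinessEntity"), triples)}
    return [(s, o) for s, _, o in match((None, "gr:offers", None), triples)
            if s in entities]


result = offers_by_business_entities(TRIPLES)
print(result)
```

Because each statement names its relationship explicitly, the query can ask for "business entities that offer something" and skip the page that merely mentions the product, which a keyword match could not distinguish.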

Generally, according to (Amjad Farooq, 2010), the semantic web differs from the non-semantic (current) web in several parameters, such as content, conceptual perception, scope, environment and resource utilization.
(i) Content: The semantic web encompasses actual content along with its formal semantics. Formal semantics here means machine-understandable content expressed in logic-based languages such as the Web Ontology Language (OWL 2).
(ii) Conceptual perception: The current web is a collection of hyperlinked documents, and its keywords have no formal semantics. In the semantic web this limitation is handled through ontologies, in which data is given well-defined meaning that machines can understand.
(iii) Scope: A literature survey [4] determined that the inaccessible part of the web is about five hundred times larger than the accessible part. It is estimated that there are billions of pages of information on the web, and only a few of them can be reached via traditional search engines. In the semantic web, the formal semantics of data are available via ontologies, which are the essential component of the semantic web and are accessible to semantic search engines.

As Sir Tim Berners-Lee said, “Semantic web is not the replacement of current web but it is an extension of current web, in which information is given in a well-defined manner, so that it enable the computer and people to work in cooperation”. Interest in semantic web technologies is growing in human-computer interaction, where they enable machines to interpret data published on the web in a machine-interpretable form. Tim Berners-Lee describes the goal as follows: “The semantic web goal is to be a unifying system which will (like the web for human communication) be as un-restraining as possible so that the complexity of reality can be described” (Dekdouk, 2010).

Many researchers have worked to enhance machine comprehension of web documents (using, for example, probability statistics, vector space approaches, natural language processing (NLP) and machine learning), but the results are not yet satisfactory. With a semantic web, many tasks that cannot be done well on the current web, such as knowledge repositories, search agents and information parsers, will be handled easily (Li, 2002). Developers of end-user applications need not worry about how to parse the information contained in the raw data, because the ontologies used by the semantic web make everything explicit. To make the web semantic, new standard web ontology languages have to be worked out.
Unlike HTML, which was used to display content to human beings, XML and RDF can be used to add user-defined annotations to (parts of) web pages. However, a recent ontology language such as OWL 2 is still needed to specify the meaning (semantics) of the annotations in a machine-understandable form. We therefore believe that adopting the Semantic Web ontology language OWL, together with Description Logic technologies such as a Description Logic reasoner, provides a powerful framework for defining and declaring e-commerce annotations.

All in all, it is clear that the better a machine understands terms, the better it can operate on the data, execute instructions and respond. To make web terms more recognizable to machines, concepts should be modeled in a formal language with logical information; that is, information should be augmented semantically using ontologies. This radical idea from the field of artificial intelligence was adapted to the current web for automated knowledge representation and conceptualization (Saraf, 2008).

The term ontology has its origins in Greek philosophy, where it meant a “systematic explanation of being” (Aristotle). In philosophy, ontology is the theory of things or objects and their relationships. Recently, research on ontology has become interdisciplinary, combining philosophy, linguistics, logic (first-order logic and description logic) and computer science (Jarrar, 2005). According to Jarrar, within computer science it in turn combines artificial intelligence and databases: AI scientists use it for knowledge representation, and database experts use it for conceptual data modeling. A recent methodology called “Web-based competitive intelligence” is based on automatic gathering, filtering, search and transformation of information on the Web using a combination of crawlers, wrappers and ontologies (Grilo, 2013). Ontology thus acts as the cement for building the semantic web. It is the mechanism recommended by the W3C for formal semantics, and it is considered the backbone of every semantic web application.

Several definitions of ontology have been proposed over the past two decades. The most frequently used is the one given by Gruber in 1993: “An explicit specification of a conceptualization”. Borst enhanced this definition in 1997 as a “formal specification of a shared conceptualization”. The details are explained in the state-of-the-art review. The main aim of an ontology is to enrich data with additional meaning (semantics) so that more people, objects and machines can work with it.

Ontological engineering refers to the set of activities that concern the ontology development process, the ontology life cycle, and the methodologies, tools and languages for building ontologies (Ponsoda, 2011). Like software engineering, ontology engineering is concerned with the design, representation, architecture and management aspects of ontologies. The questions that an ontology engineering methodology should answer concern ontology mapping (reusability), scalability, distributed development and application independence. These are the expected outcomes of this thesis. No ontology can represent domain knowledge perfectly and permanently, because knowledge changes dynamically as cultures acquire additional assets over time. There is therefore no perfect ontology. However, to save time and money and to improve reliability, reusing previously developed ontologies related to the project, by mapping them deliberately, is one of the research methodologies in this thesis.

Although English native speakers represent the highest number of Internet users, with nearly half a billion users, this is only around 30% of all internet users. They are closely followed by Chinese speakers, at 23%, and at a long distance by Spanish speakers, at nearly 8% (Internet World Stats, 2010). According to OntoSelect statistics covering 170 collected multilingual ontologies, more than 80% of the ontologies on the semantic web are developed in English. Figure 1.2 illustrates this as a pie chart (Buitelaar, 2007).

Fig 1.2 Ontology multilingualism statistics
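To make the notion of a multilingual ontology concrete, the following Python sketch attaches an English and an Amharic label to the same concept, in the way language-tagged `rdfs:label` values do in RDF. The concept identifiers `eeps:Coffee` and `eeps:Honey`, the labels and the helper functions are hypothetical examples chosen for illustration, not entries taken from the EEPS ontology itself.

```python
# Hypothetical concept identifiers and labels, used only for illustration.
# "en" and "am" are the ISO 639-1 language tags for English and Amharic.
LABELS = {
    "eeps:Coffee": {"en": "Coffee", "am": "ቡና"},
    "eeps:Honey": {"en": "Honey", "am": "ማር"},
}


def label(concept, lang, fallback="en"):
    """Return the label in the requested language, or the fallback language."""
    names = LABELS[concept]
    return names.get(lang, names[fallback])


def find_concept(term):
    """Resolve a search term in either language to its concept identifier."""
    needle = term.casefold()
    for concept, names in LABELS.items():
        if any(needle == name.casefold() for name in names.values()):
            return concept
    return None


def to_turtle(concept):
    """Write the labels of one concept as a Turtle-style rdfs:label statement."""
    parts = ", ".join(f'"{text}"@{lang}'
                      for lang, text in sorted(LABELS[concept].items()))
    return f"{concept} rdfs:label {parts} ."


print(to_turtle("eeps:Coffee"))
print(find_concept("ቡና"), find_concept("coffee"))
```

The design point is that both labels hang off one identifier, so a query in either language resolves to the same concept instead of to two unrelated keyword strings.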

At present, most development work on the Semantic Web assumes the use of English-language terms to identify the relationships between web resources and ontologies (Bryan, 2001). However, researchers have shown that the growing importance of multilingual information retrieval and machine translation has made multilingual ontologies an extremely valuable resource. Besides English, ontologies are also available in Arabic, Chinese, Spanish, German, Japanese and other languages. According to the survey of (E. MONTIEL-PONSODA, 2009), however, the number of multilingual ontologies is still quite small compared to the total number of ontologies available on the Web. Since the number of web documents written in different natural languages is increasing tremendously, ontology modeling in specific natural languages has become one of the hottest research topics.

No functional ontology or knowledge conceptualization in the Amharic language is available for any domain. Tessema Mindaye et al. recommended developing an Amharic WordNet. They concluded that the WordNet would enhance the performance of many information retrieval and natural language processing tools for the language, and would give the language a chance to be integrated with other languages for cross-language processing (Tessema Mindaye, 2009). Scholars from the informatics and computer science departments of Addis Ababa University have recommended ontology-based Amharic information retrieval systems in their thesis work, but no existing research realizes them.

Recently, semantic web technology has been applied in different application domains, such as semantic search and indexing, machine learning, bioinformatics, grid computing, e-government, e-procurement and e-commerce. Among these, e-procurement (50% e-commerce) and e-commerce are emphasized in this thesis, for two significant reasons. The first is Ethiopia's transformation under the impact of the global digital market and the government's strong attention to e-commerce. The second is to contribute a research effort to the semantic web, specifically a multilingual ontology, which has not previously been done by any researcher. The Semantic Web has the potential to strongly influence the further development of the internet market, in which e-commerce plays an important role. This gives an overview of how producers describe their resources in ontology content and how consumers retrieve the relevant data.

Electronic commerce, commonly known as e-commerce, consists of buying and selling products or services over the internet. The e-commerce lifecycle is a complex process with stages of matchmaking, negotiation, contract formation, and contract fulfillment.
E-commerce has a history beginning in the late 1970s, when it was mostly seen as a way of conducting transactions electronically, through technologies such as Electronic Data Interchange (EDI) and Electronic Fund Transfer, used, for example, for exchanging purchase orders or invoices between companies and banks. Generally, EDI between companies and Automatic Teller Machines (ATMs) for banking were the first forms of electronic commerce. (Ali Ghobadi, 2011) wrote that “the Internet is the most striking example of a marketplace, full of economic opportunity, which has grown out of the open-source software community”.

There are different types of e-commerce business models, of which B2B (business to business), B2C (business to consumer) and C2C (consumer to consumer) are the most common. B2C e-commerce is the predominant commercial experience of web users, and it has evolved through several generations. The latest generation is the comparative shopping system, which connects to multiple online stores and collects the information requested by the user. Such systems must visit several shops, extract product and price information, and compile a market overview. Tools such as shop bots are available on the web for this purpose. Shop bots are software agents that automatically extract product and price information (Grigoris Antoniou, 2008). The comparative result is then displayed in tabular form in the user's browser. Their functionality is provided by wrappers, programs that extract information from an online store. The problem is that a separate wrapper must be developed for each store (Ali Ghobadi, 2011).

Current scenario of e-commerce: A search for a product or product offer is the starting point of most e-commerce transactions. E-commerce web applications are designed to return the most appropriate data to the user, but current applications fail to return relevant data to consumers. The following limitations are observed in current e-commerce models.
a. Information Asymmetry & Price Dispersion: This situation occurs where the same product with the same features is offered to consumers at different prices on different websites.
b. Semantic Description & Extension is Deficient: This situation occurs where the product's generic attributes, such as price, color, function, origin and material, are not considered.
c. Business Attributes: This situation occurs where customers choose the tax percentage, the payment type, any discount offered, etc.
d. Interoperability in an inconsistent environment: This situation occurs where the consumer faces conflicting information when choosing the best option from the available websites.
The introduction of the semantic web tries to minimize the above limitations.

Various websites provide e-tendering in Ethiopia; popular examples include ethiotender.com and 2mercato.com. Their software agents track tenders from newspapers, gazettes, websites, tender bulletins, private companies, and public sector organizations. Some are designed to provide multilingual support. Common problems in current tendering mechanisms relate to automation: difficulties in matching needs and offers between parties, automated negotiation strategies, and complicated tender assessment. Although current software agents deliver daily tenders to business firms and clients via email and SMS alerts, extracting information from large volumes of comprehensive tender documents for decision support is an untouched area in great need of research. Knowledge of the tendering process, and of the products, supplies, services and subjects of any tender, should be represented using ontologies.

The predominant semantic web technology applied in the implementation of this thesis is OWL (OWL Web Ontology Language, 2013). OWL 2 is the latest version of this ontology language. Its expressiveness, flexibility, and efficiency make it an ideal modeling language for creating web ontologies that represent exceptionally complex and refined ideas about data (David, 2013). Generally, the main intent of this thesis is first to critically analyze the semantic web technologies employed for e-commerce, and then to design a prototype e-commerce domain-specific ontology model expressed in two languages (English and Amharic).

1.2.
Motivation Advances in information technologies have strongly and consistently supported organizations to deliver the right information to the right person at the right time. E-commerce is among the applications that need high demand of information sharing and reusing. The domain itself is ambiguous and complex to humans unless its meaning is processed and supported by machines. Searching for a specific product is multi-dimensional activity which synchronizes machine to machine, human to machine and human to human communication. Searching for quality Ethiopian commodities on the web will result to unsuccessful query result. This is mainly 9|Page

because many products and services are unstructured and published in local languages like Amharic. Amharic corpora’s, words and concepts on the web are not modeled on the semantic web platform ontologically. Nowadays many businesses and companies are getting semantic web service enabled. Marketers of the future will need to know their audience so well that they are able to create valuable experiences and information before the question is asked or the need is conveyed. (microsecommerce, 2012) Creating machine readable information content and embedding it into the existing commercial websites of the country will improve the information retrieval facility by consumers in one side and will create the opportunity to join these businesses to the linked data platform. Many researchers explored text mining and information retrieval approaches on Ethiopic or Amharic corpus. But almost all of them recommended the further development of ontologies in the respected domains. The semantic web annotation is open to all languages. Amharic is the official language of Ethiopia, in which most public tenders are expressed, processed and evaluated with it. More than one hundred tenders are released every day, in which most of them are in Amharic language. Although it is believed that ecommerce is a bridge and a new way to efficient global market, its simplicity, accessibility and flexibility needs a high demand of research. To sum up the motivation of this thesis is to represent this rigorous knowledge formally in a structured manner and adapting it to the semantic web or the future web by using ontology language as a tool of gumption. 1.3. Statement of the problem According to (ZAHRA, 2000) findings the Ethiopian economy is poorly integrated in the global economy based upon some indicators such as trade to GDP ratio, share of manufactured items in total exports, and level of FDI in the country. 
The share of manufactured items in exports is highly influenced by e-commerce, through search engines and online business transactions. Currently, while conventional search engines are able to search for keywords on Web sites, only human users can read and interpret product and service information on the Web. The innovative technology used by the Semantic Web enables intelligent applications and search engines to grasp the meaning of the information, process this data and display it so that it is comprehensible and meaningful to humans. First, there must be standardized structures, so-called ontologies, which should be easy to implement in various applications such as e-commerce, e-procurement or e-tender. To achieve this aim, different development platforms are available. Next, easy-to-use tools should be made available to publish or search information over the semantic web. The final step is to interlink all the data, such as tender notices, product offers and prices, over the open linked data infrastructure. As a result of this intelligent cross-linking between data, interoperability will occur between heterogeneous domains so that knowledge sharing and diversification will be simple. Generally, automatic and intelligent integration, combination and connection of offers will be possible.

Nowadays only a few domains are modeled semantically or ontologically over the semantic web, among them the medical, food, bio-informatics and e-business domains. The problem is that most of them are conceptualized monolingually: they are annotated from the perspective of a single culture, in one natural language and in a single domain. Big semantic web initiatives in the e-commerce domain such as the GoodRelations vocabulary, ebsemantics, UNSPSC and unspscOWL are among the popular product and service ontologies worldwide that provide machine-readable descriptions of offers. The common drawback of these ontologies is their generality and monolinguality. Although much effort has gone into the GoodRelations ontology to conceptualize e-commerce on the web of data, it is still impossible to use it directly for vertical industries. Searching for a particular type of product or service on the Web using standard search engines like Google can be an exasperating experience.
This is for a number of reasons. First, the query must use exactly the same terms that the vendor uses to describe the offer. If we search for "TV", we won't find pages that use "television", for instance. (Hepp M., Good Relations, 2008) Second, setting criteria for the search term is impossible, for example restricting the search to TVs with a flat screen. In short, as the data on the web grows ever larger (more than 30 billion web pages nowadays), its structure and meaning remain unexploited.
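The vocabulary mismatch described above can be illustrated with a minimal Python sketch. The offer texts, the `CONCEPTS` mapping and the `ex:Television` identifier are hypothetical and are not taken from GoodRelations or any real data set; the sketch only contrasts literal keyword matching with matching on a shared concept.

```python
# Hypothetical offer descriptions, worded the way different vendors might.
offers = [
    "Samsung 40 inch flat screen television",
    "LG TV with flat screen, 32 inch",
    "Ethiopian coffee, washed Sidamo",
]


def keyword_search(query, texts):
    """Plain keyword matching: a text matches only if it contains the query word."""
    q = query.lower()
    return [t for t in texts if q in t.lower().split()]


# Hypothetical mapping from surface words to one shared concept identifier.
CONCEPTS = {"tv": "ex:Television", "television": "ex:Television"}


def concept_search(query, texts):
    """Match on the shared concept rather than on the literal word."""
    target = CONCEPTS.get(query.lower())
    hits = []
    for t in texts:
        words = [w.strip(",.").lower() for w in t.split()]
        if target is not None and any(CONCEPTS.get(w) == target for w in words):
            hits.append(t)
    return hits


print(keyword_search("TV", offers))
print(concept_search("TV", offers))
```

With these assumed inputs, the keyword search for "TV" returns only the offer that literally contains that word, while the concept search also returns the offer worded as "television".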


The key contribution of this work is to develop ontological means for the specification of products and services that enable compact representation of multi-attribute and multi-lingual offers. This research restricts the domain to major Ethiopian exportable products and services. Generally, in developing a semantic web enabled product ontology model the following practical problems arise.
• Existing e-commerce platforms cannot solve semantic interoperability problems between heterogeneous markets.
• Local products and services such as ‘chat’ or ‘khat’ are not conceptualized in formal, standard and global taxonomies such as the product types ontology (PTO).
• No emphasis is given to describing agricultural commodities ontologically. For example, the GoodRelations e-commerce ontology focuses only on manufactured items such as TV sets.
• No ontology annotated in the Amharic language was found.
• Searching is challenging because of the heterogeneity and dynamicity of products.
• E-commerce related documents need to be made structured and machine understandable.

1.4. Objective of the study
1.4.1. General objective
The general objective of this thesis is to model an e-commerce data structure, a product data model that advances the vision of Semantic Web-based e-commerce, and to define the EEPS ontology that covers the representational needs of Ethiopian export business scenarios for commodity products and commodity services.

1.4.2. Specific objectives
• Analyze the e-commerce domain.
• Critically analyze and evaluate emerging technologies employed for e-commerce.
• Empirically analyze, through conceptual discovery, the usage pattern of the GoodRelations e-commerce ontology.
• Design a GoodRelations-compliant bilingual ontology for the export business domain.
• Implement semantic-based queries with SPARQL.
• Evaluate the developed ontology against sound criteria.

1.5. Scope and limitation of the study
Engineering semantic web aspects fully into all e-commerce business processes is a generic topic that cannot be completed in this thesis. Engineering the web includes all phases of analysis, design and implementation. Therefore, in this thesis our focus is on the design (modeling) part only. The selected application area for the study is shown in Fig. 1.3. Given the limited time available for this work, developing an ontology for all products and services originating from Ethiopia is not possible due to the following constraints.
1. The variety and specificity of products.
2. The dynamicity of products (new products may arise daily).
3. The shortage of domain experts.

[Figure: scope narrowed from General to Specific. Labels: Semantic web, Ontology engineering, E-commerce, E-procurement, E-government, B2C, B2B, B2G, Product and service business, Ontology modeling, Export products or service, Local products, EEPS Ontology]
Fig.1.3 Scope determination

Therefore, in this research, we focus on reusing a standard ontology for semantically describing web offers, one that can be embedded into existing static and dynamic web pages. Because the structure of most local products is similar to that of other products, we adapt and reuse other ontologies such as the “GoodRelations” ontology. The proposed method is therefore applied to corpora in the product and service description domain.

1.6. Methodology
1.6.1. Literature review
The methodology of this thesis is both empirically validated and requirements driven. It is empirical in the sense that the approach is evaluated against a number of evaluation criteria to demonstrate the system's feasibility and analyze its performance. For this research, it is necessary to study existing ontology development methodologies, reflect on their advantages and disadvantages, and use the most suitable method. Data to be analyzed is mainly collected from published articles, standards, and reports; magazines and newspapers engaged in this area are also used to the extent that they present relevant information on ontology modeling, ontology multilingualism, e-commerce models, product catalogs, product taxonomies and any recent information related to the core of the thesis. These sources are mainly acquired through an extensive literature search conducted on several publication databases and on the internet, to gather as much relevant material as possible. The journey of this thesis can be described in the following phases.

First phase - Review of related research works, compiled in the direction of the proposed research study. The secondary research is cited from research journals and periodicals, published books in the relevant areas, conference proceedings, websites and discussion groups associated with relevant research, e-commerce standards, and company websites.

Second phase - Survey and study of existing e-commerce (including business logic, nationwide laws, processing technology etc.) and critical analysis of domain ontologies. This is a comprehensive observation of the existing system, including its conceptual workflow, usage pattern and execution of the overall business process.

Third phase - Design and development of the EEPS model, including ontology development and description.

Fourth phase - Evaluation of the proposed model against specified criteria.

Fifth phase - Conclusion and future recommendations are presented.

1.6.2. Data set collection
The products and services domain is captured from existing vocabularies and ontologies on the web. A dataset of thousands of RDF triples was collected from websites that have adopted the global e-commerce vocabulary. In addition, websites such as ethiomarket.com were valuable for gathering export product data.

1.6.3. Implementation tools
Activities | Tools, program or environment
Modeling tool for ontology design and construction. | Protégé 4.3
A tool used to store, access, and manage RDF data and linked data. | OpenLink Virtuoso server (open source)
Universal and free standard vocabulary for e-commerce on the web of data. It provides a machine-readable language for encoding all major aspects of offers on the web. | GoodRelations vocabulary
A tool used to visualize an ontology in a UML-like model. | OWLGrEd
A reasoner plugin for Protégé. | FaCT++
Table 1.1. Activities and respective tools needed for design, modeling, analysis and implementation

1.6.4. Testing procedure
The challenging part of any ontology construction or design is its evaluation or testing phase, because ontology-based research is not future proof by itself. There is considerable debate about the importance and effectiveness of metrics for evaluating the results of ontology research. Coverage of the domain and degree of formalization are among the factors that limit how well the capability of an ontology can be predicted. The following evaluation criteria are used to assess this thesis.
1. Consistency and extensibility testing: evaluate the extent to which the developed ontology can be used to identify inconsistent knowledge.
2. RDF validation with the W3C RDF validation service.
3. Comments gathered from domain experts.
4. Query answering: evaluate the extent to which the ontology can be used to answer questions relevant to the domain on the SPARQL endpoint.
All the above evaluation procedures for the developed ontology are compliant with semantic web standards.

1.6.5. Thesis structure
This research is organized into six chapters. Chapter one unfolds the research background, providing insight into the problem area. In addition, the research objective, research question, scope and limitation, and significance of the study are outlined. The research problem is described with respect to the drawbacks of previous studies and what should be covered in the current research.

Chapter two discusses the literature review, covering the various theories, facts, techniques, methods and algorithms from various research works, text books, published journals and the internet. It includes an overview of semantic web technologies (RDF, OWL, SPARQL), ontology tools, ontology languages, ontology localization, and ontology evaluation mechanisms. In addition, it gives insight into what has been done so far regarding ontology modeling methodologies.

The third chapter is the first part of the body of the thesis. It presents an empirical analysis of emerging e-commerce technologies and of domain ontologies related to e-commerce, such as the GoodRelations vocabulary. This chapter offers recommendations to emerging and developing e-business firms and retailers on how to annotate their websites and portals with the most popular e-commerce ontology, the GoodRelations vocabulary, based on an empirical study of usage patterns tested on more than 200 data sets worldwide.

The fourth chapter covers the design and modeling of the EEPS model. It emphasizes the methodology, system domain, system specification, system design, technical design, and implementation. The Protégé ontology editor and ontology reasoners such as HermiT and FaCT++ are used for knowledge base implementation and reasoning respectively. Evaluation of the research findings is presented in chapter five, and the conclusion and future recommendations in chapter six.

CHAPTER 2
2. REVIEW OF CONCEPTS AND RELATED RESEARCHES
2.1. Semantic Web Engineering
In this sub unit, we discuss the state of the art of the semantic web, its different lifecycles, and the technological standards and engineering aspects of the future web. As background for this thesis, more is synthesized about ontology engineering and ontology development methodologies. In this chapter, we review the latest related documents and compile them with a view to their methodological benefits for this thesis.

“The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation”. (YANG, 2006) Although much research had identified the deficiencies of the traditional web, it was the creator of the web and coiner of the term semantic web, Tim Berners-Lee, who described the approach in the words of the above quote. In order to improve automation on the Web, Berners-Lee put forward the idea of the next generation Web, “the Semantic Web”, in 1998. As he imagined it, the Semantic Web is a vision: Web information is defined and linked in a way that can be used by machines. This information is not just for display purposes, but also for automation, integration and data reuse across various applications. To make this vision a reality, relevant standards, technologies and policies need to be investigated. (YANG, 2006) Generally, “The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. It is a collaborative effort led by W3C with participation from a large number of researchers and industrial partners. It is based on the Resource Description Framework (RDF), which integrates a variety of applications using XML for syntax and URIs for naming”. (OWL Web Ontology Language, 2013)

2.1.1. Layered Architecture of the Semantic Web
In computer science, it is common to express architectures as layered models, for example the OSI and TCP/IP network models, layered object models, and layered cloud models. The Semantic Web is developed with a layered approach in which each layer addresses a different level of information, so that research can proceed separately on each layer and implementation is easier. As a result, the "Semantic Web cake", a layer stack, was formed. Figure 2.1 shows the "layer cake" of the Semantic Web, which describes the main layered architecture of the Semantic Web design and vision (Tim Berners-Lee, 2001).

Figure 2.1 The Layer Cake Model of the Semantic Web (Tim Berners-Lee, 2001)

Layer | Name | Function
1 | Unicode & URI | Foundation of the whole Semantic Web; Unicode handles the encoding of resources, URI handles resource identification.
2 | XML + NS + XML Schema | Used for presenting the content and structure of data.
3 | RDF + RDF Schema | Used for describing Web resources and their classification.
4 | Ontology vocabulary | Used for describing the relationships between kinds of resources.
5-7 | Logic, Proof, Trust | Logic reasoning process based on the lower four layers.
Layers two to four are the core layers, used for presenting the semantics of Web information.
Table 2-1 Layer architecture of the Semantic Web

In the rest of this section, we further investigate each layer of the famous "Semantic Web cake"; the major portion is adapted from Kun Yang's thesis entitled “a conceptual framework for semantic web-based ecommerce”.

2.1.2.1. URI and Unicode (Code Layer)

The first layer, URI and Unicode, follows the important features of the existing WWW. Unicode is a standard for encoding international character sets, and it allows all human languages to be used (written and read) on the web in one standardized form. From this we can understand that the Amharic language can be encoded on the web in a standardized manner. A Uniform Resource Identifier (URI) is a string in a standardized form that allows resources (e.g., documents) to be uniquely identified. A subset of URI is the Uniform Resource Locator (URL), which contains an access mechanism and a (network) location of a document, such as http://www.example.org/. Another subset of URI is the URN, which allows a resource to be identified without implying its location or a means of dereferencing it; an example is urn:ISBN:0-12345678-9. The use of URIs is important for a distributed internet system, as it provides understandable identification of all resources. An international variant of URI is the Internationalized Resource Identifier (IRI), which allows Unicode characters in the identifier and for which a mapping to URI is defined. (Semantic Web Architecture, 2007)

Evidently, different applications on the Web communicate with one another, for example through the direct or indirect transfer of information, and Web resource descriptions make up a big portion of this information, which needs an explicit means of identification. For this reason, the Semantic Web makes use of the Uniform Resource Identifier (URI) to identify resources and their related properties. As the base of the Semantic Web, the URI & Unicode layer (Code Layer) sits at the bottom of the hierarchy and solves the problem of encoding and locating Web resources.

2.1.2.2. XML, Namespace and XML Schema (Syntax Layer)

Today HTML is the standard language in which Web pages are written. However, HTML has some unavoidable shortcomings. These shortcomings have not disappeared as the Web has evolved; on the contrary, they have become more and more pronounced and now hinder the development of HTML. In summary, HTML has the following limitations:
• HTML mixes data content with data presentation. This is the principal issue of HTML. The same data value can be presented in different formats, each leading to a different HTML description. As a result, searching by concept, that is by the meaning of the data, is not possible.
• An HTML document does not contain structural information, that is, information about the pieces of the document and their relationships.
• The tags of HTML are fixed. Users cannot extend the markup tags on their own, so much specialized information cannot be represented.

Consequently, HTML cannot achieve maximal information sharing and interoperability. Having analyzed the problems of HTML, researchers decided to adopt XML as the syntax layer. Like HTML, XML was derived from the Standard Generalized Markup Language (SGML). Both HTML and XML are markup languages: they allow one to write some content and to provide information about the role that content plays. The important point of XML, however, is that it is extensible. Through XML, information can be used in various ways according to a vocabulary that users define for their application. XML is thus considered a meta-language for markup: it allows users to define their own tags [31] instead of working with a fixed set of tags, which is especially suitable for sending documents across the Web. The structure of an XML document can be defined by writing a Document Type Definition (DTD) or an XML Schema, which offers a richer language for defining the structure of the XML document. In general, an XML document may use more than one DTD (or schema). Since each of these is developed independently, name conflicts inevitably appear. To solve this problem, the XML group of W3C established the Namespace standard. For example, a user can define the tag "author" with markup that declares the tag in the namespace "http://foo.bar.com/xml/customer.dtd", which is represented by the prefix "K". No conflict can then arise as long as the namespaces differ, even if other users define the same tag. The Syntax Layer therefore achieves structured Web document description and syntactic operation through the features of XML. However, there is still a problem: a tag such as "author" is easy for humans to understand but not for machines. Computers cannot really understand the meaning of the tag, because XML is only the underlying data exchange format. XML resolves only the syntactic problem (such as content structure) and does not provide any means of expressing data semantics (meaning). The semantic definition of a tag is handled by the higher layers of the Semantic Web.
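The namespace mechanism described above can be sketched in a few lines of Python using the standard library XML parser. The first namespace URI is the one quoted in the text; the second namespace, the prefix "B", the `record` wrapper element and the element contents are assumptions introduced only for illustration.

```python
import xml.etree.ElementTree as ET

# Two vocabularies that both define an "author" tag. Only the first
# namespace URI comes from the text; everything else is illustrative.
doc = """
<record xmlns:K="http://foo.bar.com/xml/customer.dtd"
        xmlns:B="http://example.org/xml/book.dtd">
  <K:author>Customer who wrote the review</K:author>
  <B:author>Writer of the book</B:author>
</record>
"""

root = ET.fromstring(doc)

# ElementTree expands each prefix to its namespace URI, so the two
# "author" tags get different qualified names and cannot collide.
qualified_names = [child.tag for child in root]
for name in qualified_names:
    print(name)
```

The parser distinguishes the two tags purely by namespace; it still knows nothing about what "author" means, which is the gap the higher layers are meant to fill.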


In conclusion, the Extensible Markup Language (XML) layer, with XML namespace and XML Schema definitions, makes sure that a common syntax is used in the semantic web.

2.1.2.3. RDF and RDF Schema (RDFS) (the Metadata Layer)
XML provides the syntactic support for encoding Web data, and the Resource Description Framework (RDF) provides the semantic description of the relevant resources. RDF is an entity-relationship model for writing simple statements about Web objects (resources). The RDF data model does not depend on XML, but RDF has an XML-based syntax. It is therefore located on top of XML and belongs to the metadata layer. RDF has three fundamental elements:
• Resource: a resource may be anything that can be identified by a URI. Resources include Web-reachable resources (such as electronic documents, pictures, and Web services) and Web-unreachable resources (for instance concrete physical objects such as people, companies, and places, or abstract concepts such as authors).
• Property: properties are a special kind of resource that describe the relationships among resources (for example "written by", "age", "title" and so on).
• Statement: a statement is a concrete description of a resource.
Generally a statement is an object-attribute-value triple (S, P, O). "S" specifies the resource being described, "P" names a certain facet of this resource, and "O" gives the value of facet P for resource S. The value "O" can be either a literal (such as a character string) or another resource.

RDF also needs a way to define application-specific classes and properties. These must be defined using extensions to RDF, and one such extension is RDF Schema (RDFS). RDF Schema does not itself provide actual application-specific classes and properties; instead it provides the framework to describe them. Classes in RDF Schema closely resemble classes in object-oriented programming languages. This allows resources to be defined as instances of classes and as subclasses of classes. To strengthen the above discussion, consider the RDFS example from w3schools, which demonstrates some of the RDFS facilities. In that example, the resource "horse" is a subclass of the class "animal". Since an RDFS class is an RDF resource, the example can be abbreviated by using rdfs:Class instead of rdf:Description and dropping the rdf:type information.
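A minimal Python sketch of the horse and animal example in its abbreviated form follows, read out as (S, P, O) triples. The RDF/XML below is written along the lines of the tutorial example rather than copied from it, the base URI is a placeholder assumption, and the `triples` helper is a hypothetical reader that handles only this tiny subset of RDF/XML, not a general RDF parser.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
BASE = "http://www.animals.fake/animals"  # placeholder base URI (assumption)

# Abbreviated form: rdfs:Class replaces rdf:Description plus rdf:type.
doc = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdfs:Class rdf:ID="animal"/>
  <rdfs:Class rdf:ID="horse">
    <rdfs:subClassOf rdf:resource="#animal"/>
  </rdfs:Class>
</rdf:RDF>
"""


def triples(xml_text):
    """Read the class declarations as (subject, predicate, object) triples."""
    result = []
    for cls in ET.fromstring(xml_text).findall(f"{{{RDFS}}}Class"):
        subject = BASE + "#" + cls.get(f"{{{RDF}}}ID")
        # Using rdfs:Class as the element name implies this rdf:type triple.
        result.append((subject, RDF + "type", RDFS + "Class"))
        for sub in cls.findall(f"{{{RDFS}}}subClassOf"):
            parent = BASE + sub.get(f"{{{RDF}}}resource")
            result.append((subject, RDFS + "subClassOf", parent))
    return result


for s, p, o in triples(doc):
    print(s, p, o)
```

Each printed line is one statement: two of them declare "animal" and "horse" as classes, and the third states that "horse" is a subclass of "animal".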

2.1.2.4. Ontology (Vocabulary Layer)

This layer is built on top of RDF and is designed to be interpreted by computers. In generic terms, an ontology is the exact description of things and their relationships. The term is closely related to vocabularies and classes. In its lightweight meaning for the web, an ontology is the exact description of web information and of the relationships between distributed pieces of web information. The lower layer, RDF Schema, can define classes, subclasses, superclasses, properties, sub-properties, and domain and range restrictions. RDF Schema can therefore be regarded as a mid-level ontology language, but it has limited capacity to describe the relationships between concepts in a particular application domain.

The concept of ontology originally derives from philosophy, where it refers to the study of the existence of all kinds of entities (abstract or concrete) that make up the world. In recent years, with the increasing requirements of computational applications, ontology has gradually come to be used as a modeling tool for knowledge representation, sharing, reuse and other relevant purposes. In the AI (Artificial Intelligence) community, a definition of ontology was given by Gruber in 1993: "an explicit specification of a conceptualization". (YANG, 2006) Later, Studer gave a more expressive definition, concluding that an "ontology is a formal, explicit specification of a shared conceptualization". This definition contains four concepts:
• Conceptualization: an abstract model of some phenomenon in the external world, formed by identifying its relevant concepts. The conceptualization is independent of the concrete environment.
• Explicit: the types of concepts used and the constraints on their use are explicitly defined.
• Formal: the ontology should be machine-readable.
• Shared: the ontology captures consensual knowledge, that is, knowledge that is not private to some individual but accepted by a group.
Since our research relies mostly on this layer, we further investigate the engineering and construction aspects of ontology in section 2.3.

2.1.2.5. Logic, Proof and Trust (Logic Layer)

Many research papers are forwarded about the ontology layer of the semantic web standards like RDF, RDFs, OWL, OWL 2 is now becoming mature enough (i.e. and the next step is to work on a logic layer for the development of advanced reasoning capabilities for knowledge extraction and efficient decision making. Adding logic to the web means using rules to make inferences. Rules are a means of expressing business processes, policies, contracts etc. (YANG, 2006) This layer allows the use of a reasoner (like Racer and pellet of protégé) which can check whether or not all of the statements and definitions in the ontology are mutually consistent and can also recognize which concepts fit under which definitions. So the reasoner will maintain the logic and the hierarchy of classes, object and data properties on the Ethiopian exportable 23 | P a g e

products domain. Finally a reasoner can deduce implicit knowledge so that the correct query results are obtained by the search engines. 2.2. Semantic web technologies In this sub section, the thesis presents and emphasizes standardized knowledge representation languages for modeling ontologies operating at the core of the semantic web. To qualify the presentation of each language, the study explains syntax and underlying perceptions through example. The discussion covers a set of technologies and frameworks that enable the web of data including, RDF Schema, Web Ontology Language (OWL), rules, and query languages, such as SPARQL and we will finally discuss the recent developments concerning the OWL 2 revision . These technologies are created to achieve all visions of the semantic web. Semantic web technology stack includes RDF, RDFS, OWL, and SPARQL. All of these technologies are intended to provide a formal description of concepts, terms, and relationships within a given knowledge domain. 2.2.1. Resource Description Framework (RDF) A little information is about this topic is provided on 2.1.2.3.Here only further investigation is added in the form of a graph. The RDF is one of the three foundational semantic web technologies in addition to OWL and SPARQL. It is the main technology acting as a data model of the Semantic Web. This implies that all the information on the semantic web should be represented with RDF. It is equivalent technology for the semantic web as html is designed for conventional web. RDF is different from relational data model and Xml. Xml is like trees structure while RDF is a directed graph. Let us see an example of a simple RDF graph (Adida Ben, 2013) on Fig 2.1.


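[Figure: RDF graph. The resource http://www.amu.edu.et/ is connected by a foaf:member edge to the resource http://www.amu.edu.et/people/about/animaw, which is connected by a foaf:name edge to the literal "Animaw Kerie".]

Fig 2.1 Simple RDF graph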
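The graph of Fig 2.1 can also be written down as a list of (subject, predicate, object) triples. The following Python fragment is only a minimal sketch under our own assumptions: the `Literal` class and the helper names `objects` and `is_valid` are hypothetical and do not come from any RDF library. It also encodes the rule that a literal may appear only in the object position of a triple.

```python
# Illustrative sketch only: the graph of Fig 2.1 as plain Python triples.
# Literal, objects() and is_valid() are hypothetical helpers, not a library API.

FOAF = "http://xmlns.com/foaf/0.1/"
AMU = "http://www.amu.edu.et/"
ANIMAW = "http://www.amu.edu.et/people/about/animaw"


class Literal(str):
    """Marks a plain value (rectangle in the figure) as opposed to a resource URI."""


triples = [
    (AMU, FOAF + "member", ANIMAW),                     # resource -> resource
    (ANIMAW, FOAF + "name", Literal("Animaw Kerie")),   # resource -> literal
]


def objects(subject, predicate):
    """Return every object reachable from `subject` over the edge `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def is_valid(triple):
    """An edge may end at a literal but may never start from one or be one."""
    s, p, _ = triple
    return not isinstance(s, Literal) and not isinstance(p, Literal)


print(objects(ANIMAW, FOAF + "name"))
```

Each tuple corresponds to one labeled arrow in the figure, so adding a further statement about the same person only requires appending another tuple.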

Customized explanation of the graph: Note that the URIs used here are fictitious. The nodes of the graph are the ovals and rectangles (a convention explained shortly). The edges are labeled arrows that connect nodes to each other. The labels are URIs (Uniform Resource Identifiers).

• There are three kinds of nodes in an RDF directed graph (Adida Ben, 2013):
  o Resource nodes. A resource is anything that can have things said about it. It is easy to think of a resource as a thing as opposed to a value. In a visual representation, resources are represented by ovals.
  o Literal nodes. The term literal is a fancy word for value. In the example above, the resource is identified by the URI http://www.amu.edu.et/people/about/animaw, and the value of its foaf:name property is the string "Animaw Kerie". In a visual representation, literals are represented by rectangles.
  o Blank nodes. A blank node is a resource without a URI. Blank nodes are an advanced RDF topic.

Edges can go from any resource to any other resource, or to any literal, with the only restriction being that edges cannot go from a literal to anything at all. In the example above, foaf stands for the Friend of a Friend ontology. It is a machine-readable ontology describing persons, their activities and their relations to other people and objects. The concepts 'name' and 'member' are thus adopted from the foaf ontology.

2.2.1.1. The Ultimate Importance of URI in the Semantic Web

The fundamental value and distinguishing capability of the Semantic Web is the ability to connect things. In RDF, resources and edges are identified by URIs. Literals are not; they are simple values. Blank nodes are not either (this is what "blank" means in the name). Everything else is, including the edges. Several URIs can be observed in the example above:

• http://www.amu.edu.et/
• http://www.amu.edu.et/people/about/animaw
• foaf:member (this is shorthand for http://xmlns.com/foaf/0.1/member)
• foaf:name (again, shorthand for http://xmlns.com/foaf/0.1/name)

The first is the URI for Arba Minch University. The second is a URI for Animaw, the author of this thesis. The other two are URIs for the edges that connect the resources. In general, the RDF model has a great impact on our work, in which it is used to publish linked data for our domain. Like RDF, all other Semantic Web technologies were built under the assumption that different people, in different applications written for different purposes, would create related concepts at different times that overlap in any number of ways.

2.2.2. Web Ontology Language (OWL)

The W3C Web Ontology Language (OWL) is a Semantic Web language designed to represent rich and complex knowledge about things and the semantic relations between them. It is a computational logic-based language, such that knowledge expressed in OWL can be reasoned with by computer programs either to verify the consistency of that knowledge or to make implicit knowledge explicit (OWL Web Ontology Language, 2013). OWL 2 is the latest version of OWL; it became a W3C Recommendation on October 27, 2009.


Figure 2.2. The Structure of OWL 2

RDF and OWL differ in the level of expressivity of knowledge: OWL can represent more complex relationships than RDFS.

2.2.3. SPARQL Query Language for RDF

SPARQL stands for SPARQL Protocol and RDF Query Language. It is used to pose queries against RDF graphs. A SPARQL processor finds the sets of triples in the RDF graph that match the required pattern. For example, assume that we want to ask the query "Who is the spouse of William Shakespeare's child?" against the DBpedia data set. The query can be expressed in SPARQL as follows (Mohamed Morsey, 2011):

    PREFIX dbp: <http://dbpedia.org/property/>
    PREFIX dbpedia: <http://dbpedia.org/resource/>
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?spouse
    WHERE {
      dbpedia:William_Shakespeare dbo:child ?child .
      ?child dbp:spouse ?spouse .
    }
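To make the idea of matching triple patterns concrete, the following Python fragment gives a minimal sketch of how a query with two patterns and a shared variable can be evaluated. It is not how any real SPARQL engine is implemented; the function names are hypothetical, and the data set with the `ex:` names is invented purely for illustration and is not DBpedia data.

```python
# Illustrative sketch only: naive evaluation of triple patterns with variables.
# The data below is invented toy data, NOT taken from DBpedia.

TRIPLES = [
    ("ex:Alice", "ex:child", "ex:Bob"),
    ("ex:Bob", "ex:spouse", "ex:Carol"),
    ("ex:Alice", "ex:child", "ex:Dave"),
]


def is_var(term):
    """Variables are written with a leading question mark, as in SPARQL."""
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`; return None if impossible."""
    new_binding = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or check that an earlier binding agrees.
            if new_binding.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return new_binding


def query(patterns, triples):
    """Evaluate the patterns one after another, joining on shared variables."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


solutions = query(
    [("ex:Alice", "ex:child", "?child"), ("?child", "ex:spouse", "?spouse")],
    TRIPLES,
)
print([s["?spouse"] for s in solutions])
```

The two patterns have the same shape as the two lines in the WHERE clause above: the variable ?child produced by the first pattern restricts which triples can satisfy the second.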

In addition to the three basic technologies above, there are also supporting technologies and software such as the triple store. A triple store is a software program capable of storing and indexing RDF data in order to enable efficient querying of this data. Most triple stores support the SPARQL query language for querying RDF data.
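The role of indexing in a triple store can be sketched in a few lines of Python. The class below is a hypothetical toy, not the design of any actual triple store product: it keeps two dictionary-based indexes so that the two lookups shown do not have to scan every stored triple. The `ex:` names are invented for illustration.

```python
# Illustrative sketch only: a toy in-memory triple store with two indexes.
from collections import defaultdict


class TripleStore:
    def __init__(self):
        # subject -> predicate -> set of objects
        self._spo = defaultdict(lambda: defaultdict(set))
        # predicate -> object -> set of subjects
        self._pos = defaultdict(lambda: defaultdict(set))

    def add(self, s, p, o):
        """Store one triple in both indexes."""
        self._spo[s][p].add(o)
        self._pos[p][o].add(s)

    def objects(self, s, p):
        """Answer (s, p, ?o) without scanning all triples."""
        return set(self._spo.get(s, {}).get(p, set()))

    def subjects(self, p, o):
        """Answer (?s, p, o) without scanning all triples."""
        return set(self._pos.get(p, {}).get(o, set()))


store = TripleStore()
store.add("ex:ExporterA", "ex:exports", "ex:Coffee")
store.add("ex:ExporterB", "ex:exports", "ex:Coffee")
store.add("ex:ExporterA", "ex:exports", "ex:Sesame")
print(sorted(store.subjects("ex:exports", "ex:Coffee")))
```

The trade-off is the usual one for indexes: each triple is stored more than once, in exchange for faster answers to the query shapes the indexes were built for.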


2.3. Ontology Engineering and Reengineering

Ontologies are data models that represent a domain and are used to reason about the objects in that domain and the relations between them. They are developed to facilitate knowledge sharing among heterogeneous disciplines. Ontologies are an enabling technology for the Semantic Web and can be used as a building block of the next-generation web. The ontology building blocks that we implemented in our research are:

• Classes: sets, collections, concepts, types of objects, or kinds of things involved in export business.
• Attributes: aspects, properties, features, characteristics, or parameters that objects (and classes) can have.
• Individuals: instances or objects (the basic or "ground level" objects).
• Relations: ways in which classes and individuals can be related to one another.
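The four building blocks listed above can be sketched as a small Python data structure. This is only an illustration under our own assumptions: the class `Ontology`, its field names, and the example entries (Product, Coffee, SidamoCoffee, ExporterA, exports) are hypothetical and are not the EEPS model itself. The subclass check is a simple walk up the hierarchy, far simpler than what a description logic reasoner does.

```python
# Illustrative sketch only: classes, attributes, individuals and relations
# held in one hypothetical container. All example names are invented.
from dataclasses import dataclass, field


@dataclass
class Ontology:
    subclass_of: dict = field(default_factory=dict)   # class -> parent class
    attributes: dict = field(default_factory=dict)    # class -> set of attribute names
    instance_of: dict = field(default_factory=dict)   # individual -> class
    relations: set = field(default_factory=set)       # (subject, relation, object)

    def is_subclass_of(self, cls, ancestor):
        """Walk up the class hierarchy; `seen` guards against accidental cycles."""
        seen = set()
        while cls is not None and cls not in seen:
            if cls == ancestor:
                return True
            seen.add(cls)
            cls = self.subclass_of.get(cls)
        return False

    def individuals_of(self, cls):
        """Individuals whose class is `cls` or one of its subclasses."""
        return sorted(
            ind for ind, c in self.instance_of.items() if self.is_subclass_of(c, cls)
        )


onto = Ontology()
onto.subclass_of.update({"AgriculturalProduct": "Product", "Coffee": "AgriculturalProduct"})
onto.attributes["Product"] = {"name", "unitPrice"}
onto.instance_of.update({"SidamoCoffee": "Coffee", "ExporterA": "Exporter"})
onto.relations.add(("ExporterA", "exports", "SidamoCoffee"))

print(onto.individuals_of("Product"))
```

Even this toy shows why a class hierarchy is useful: SidamoCoffee is never declared to be a Product, yet it is found when asking for all products.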

According to the definition given by Wikipedia, ontology engineering in computer science and information science is a new field that studies the methods and methodologies for building ontologies. Ontology originated as a discipline of philosophy and evolved over time into an engineering discipline. This is due to the new methods invented for tackling philosophical problems, the increasing common ground with other disciplines, and the need for cross-disciplinary collaboration. In philosophy, "ontology" is a synonym for "metaphysics" and means "the theory of being as being", while in an engineering context ontologies are standardized classification systems that enable data from different sources to be combined. The major activities involved in ontology engineering are ontology specification, ontology design, development with tools, ontology evaluation (checking reasoning capability, consistency and others), ontology localization, ontology reuse, and ontology mapping.

2.3.1. Ontology specification

This step is the equivalent of the requirements specification document in software engineering. At the beginning of every ontology development effort, an ontology specification document should be prepared. Its content includes the goal of the ontology, the scope of the ontology, the ontology formalism, and the knowledge sources used to build the ontology.


2.3.2. Ontology design

Ontology design is the major knowledge management tool. Its difficulty is the absence of an agreed-upon methodology for building the ontology. The heart of ontology design is the identification of the core concepts in the domain (in our case, commodities and export business as part of e-commerce). In this phase, ontology modeling activities are performed, including the identification of all terminologies, properties, relationships, and axioms. In general, this is the domain conceptualization step. Below we review the major well-known ontology building methodologies, although there is still no universal agreement on any one of them. If we wait until the end of ontology development to measure the quality of the developed ontology, the cost of correcting any errors is likely to be high. Therefore, using a methodology that builds quality into the ontology can have significant benefits.

2.3.3. Ontology Building Methodologies

Two groups of ontology building methodologies are surveyed: methodologies for building single ontologies and methodologies for building networked ontologies. Discussing all methodologies is beyond the scope of this document; only the approach most suitable for this thesis is discussed.

2.3.3.1. METHONTOLOGY

Methontology is a methodology created for building ontologies either from scratch, by reusing other ontologies as they are, or by a process of reengineering them. The Methontology framework enables the construction of ontologies at the knowledge level. It includes the identification of the ontology development process, a life cycle based on evolving prototypes (shown in Fig. 2.3), and particular techniques to carry out each activity in the management, development-oriented, and support activities. For the ontology specification activity, Methontology proposes the use of competency questions or intermediate representations to describe the requirements that the ontology should fulfill. The Methontology-based ontology development process identifies which tasks should be performed when building ontologies (scheduling, control, quality assurance, specification, knowledge acquisition, conceptualization, integration, formalization, implementation, evaluation, maintenance, documentation and configuration management). The life cycle identifies the stages through which the ontology passes during its lifetime, as well as the interdependencies with the life cycles of other ontologies. Finally, the methodology specifies the techniques used in each activity, the products that each activity outputs, and how they have to be evaluated. The main phase in the ontology development process using the Methontology approach is the conceptualization phase.

Fig.2.3 Methontology ontology life-cycle

The main contributions of Methontology to our development approach are the identification of the life cycle and its detailed guidelines for building ontologies from scratch.

2.3.4. Ontology development tools and languages

The primary goal of this survey is to introduce several tools and languages through their use in an ontology implementation project, including the pros and cons of each, and to decide on the best solution for our research. The W3C recommendation languages (RDF, SPARQL, RDFS and OWL) are partially discussed in Section 2.2 of this chapter.

2.3.4.1. Ontology development tools

The first step towards ontology development is always finding an appropriate tool. Many ontology development tools are available; some are free and some are commercial. Based on an online survey, we selected Protégé as the best tool using the following competency and comparison questions:

• Which tools are mostly used by users?
• Is the tool extensible?
• What reasoning algorithm is applied in the tool?
• Is it possible to get the tool freely?
• Are there any drawbacks to using the tool?

A. Protégé 4.3

Protégé is a free, open source ontology editor and knowledge-based framework. Protégé ontologies can be developed in a variety of formats including OWL, RDF(S), and XML Schema. It is based on Java, is highly extensible, and provides a plug-and-play environment that makes it a flexible base for rapid prototyping and application development (protege, 2013). Different reasoner plugins are available for Protégé. The efficiency of the reasoning or inference algorithms implemented in those plugins, and the selection criteria for this project, are discussed after this topic. The Protégé platform supports two main ways of modeling ontologies: frame-based and OWL (protege, 2013). The Protégé-Frames editor enables users to build and populate ontologies that are frame-based, in accordance with the Open Knowledge Base Connectivity protocol (OKBC). In the frame-based model, an ontology consists of a set of classes organized in a subsumption hierarchy to represent a domain's significant concepts, a set of slots associated with classes to describe their properties and relationships, and a set of instances of those classes, i.e. individual exemplars of the concepts that hold specific values for their properties. An OWL ontology may include descriptions of classes, properties and their instances. Our emphasis is on the second model (Protégé-OWL): since OWL is the W3C recommendation, our approach follows the recommended track. Ontology development is mainly an ad hoc activity. Among several viable alternatives, we find that Protégé would work best for our project due to its graphical user interface, its lightweight operation and its capability to generate RDF/OWL code. The challenge may be the loss of methodological benefit, since Protégé does not support any of the ontology development methodologies. We use the latest version of the Protégé OWL 2 editor. A screenshot of the software can be seen in the design chapter.

B. Protégé OWL DL Reasoner


The logic we use in our project is OWL DL (the description logic sublanguage of OWL), which is a decidable fragment of first-order logic. OWL DL supports reasoning tasks that draw new conclusions about the knowledge base or check its consistency. Description logic can be applied in e-business to reason about product and service descriptions. A logic is decidable if it is possible to design an algorithm that answers its reasoning problems and is guaranteed to terminate in a finite number of steps. For example, in description logic it is possible to write an algorithm that calculates whether or not one concept is a subclass of another concept in the procurement domain and that is guaranteed to terminate after a finite number of steps. An OWL DL ontology can be translated into a description logic representation, so that automated reasoning can be performed over the ontology using a description logic reasoner. A description logic reasoner performs various inferencing services, such as computing the inferred superclasses of a class, determining whether or not a class is consistent (a class is inconsistent if it cannot possibly have any instances), and deciding whether or not one class is subsumed by another. Currently, different reasoner plugins are available for Protégé, such as RACER, FaCT, FaCT++ and Pellet (protege, 2013). Our benchmarks for selecting the best reasoner have three dimensions. First, the reasoner should support the OWL 2 format (the latest OWL version). Second is the response time, i.e. the time needed to solve a given reasoning task on the loaded ontology. It can be broken down into load time (the time to load the ontology before a query) and query response time (the duration between query start and query end). Third, the tool should provide an implementation of the Semantic Web Rule Language (SWRL). Another popular reasoner is Pellet. Pellet provides standard and cutting-edge reasoning services for OWL ontologies and has unique capabilities, i.e. data type reasoning and SWRL support. Although RacerPro and Pellet are excellent reasoner plugins, we preferred HermiT 1.3.7, which is the built-in reasoner of Protégé 4.3.

2.4. Ontology localization

2.4.1. Introduction

Ontology localization is a reengineering activity. It is an interesting topic in our research because it weaves Amharic content into linked open data by developing an Amharic-labeled ontology in the e-commerce or e-business domain. Ontology localization is the process of adapting a given ontology to the needs of a certain community, which can be characterized by a common language, a common culture or a certain geopolitical environment. This means translating the conceptual structures of the original ontology into another natural language in order to obtain a multilingual ontology on the Semantic Web (Philipp Cimiano, 2010). One of the specific objectives of this thesis is to develop a formal bilingual ontology model in both English and Amharic. The motivation for ontology localization is directly tied to the success of linked open data technology and the Semantic Web. The success of the Semantic Web will be determined by whether its large and diversified content can be accessed easily, and the success of linked open data is determined by how heterogeneous its links are in both culture and language. In this respect, research has to be done on the multilingualism problem. Most of the ontologies built so far are mainly in English. The use of multilingual ontologies spans many disciplines. For example, the Food and Agriculture Organization (FAO) of the United Nations has translated and mapped its official documents into Arabic, Chinese, French, Russian, English and Spanish in thesaurus format, with versions under development for Marathi, Polish, Korean, Farsi, Malay, Amharic, and Catalan (PONSODA, 2009). Different approaches have been proposed to solve ontology localization problems, including heterogeneity, distribution and cultural specificities. This section discusses a selection of the multilingual representation modalities, which include the OWL and RDF(S) labeling functionality, the LIR model proposed by (PONSODA, 2009), the NeOn model for ontology localization, and the lexicon model for ontologies.

2.4.2. Trends in modeling multilingualism in ontologies: State of the art

As the title of this thesis indicates, the author sets out to construct a bilingual ontology. Since most of our literature is based on multilingual localization approaches, we assume that these methods are also appropriate for the bilingual case, which is a special case of the multilingual one. In this section we discuss different models for representing linguistic information relative to ontologies on the Semantic Web and, finally, the challenges of ontology tools such as Protégé in representing conceptual hierarchies in the Amharic lexicon. Coming back to the ontology field, multiple authors have tackled this topic and have basically distinguished six layers in any ontology (PONSODA, 2009). These are:


a. Lexical layer: characters and symbols that make up the syntax (ASCII encoding, UNICODE, etc.)
b. Syntactic layer: structure of characters and symbols, i.e., the grammar. It embraces different representation languages (e.g. RDF(S), OWL, etc.)
c. Representation paradigm layer: paradigm followed in the representation of the ontology (frames, semantic networks, Description Logics, etc.) that allows for certain ways of expressing and structuring knowledge.
d. Terminological layer: terms or labels selected to name ontology elements
e. Conceptual layer: related to conceptualization decisions, such as granularity, expressiveness, perspective, etc.
f. Pragmatic layer: final layout of the model according to the user's needs

According to Ogden and Richards' triangle, the first three layers are not affected by the ontology localization activity; only the terminological, conceptual, and pragmatic layers are involved. Ontology labels are expressed in more than one natural language at the terminological layer. Depending on which of the above layers are involved in the localization, there are three main ways of obtaining a multilingual ontology-based system, as discussed by (PONSODA, 2009):

1. Including multilingual data in the ontology meta-model. This implies localization at the terminological layer, since the ontology conceptualization remains unmodified.
2. Combining the ontology meta-model with a mapping model. This allows localization at the conceptual layer, since conceptualizations in different languages are mapped to each other.
3. Associating the ontology meta-model with a multilingual linguistic model. Localization is performed at the terminological layer, although conceptual layer adaptations are also foreseen.

A more recent, merged form of (Ponsoda, 2011) was proposed by the authors of (Philipp Cimiano, 2010). They classified the layers affected by ontology localization into only two, i.e. the lexical layer and the conceptualization layer. The lexical layer of an ontology comprises: i) the labels of the concepts, properties and individuals defined in the ontology, ii) natural language definitions of these entities, and iii) the documentation accompanying the ontology, which describes its scope and purpose, its usage, etc. They concluded that ontology translation affects the lexical layer of the ontology. This layer includes all of the natural language description, including labels, comments, definitions, and associated documentation, that makes the ontology understandable for humans. In general, ontology labels should be short. Otherwise, translation using standard machine translation techniques becomes complicated. Reviewing the different machine translation techniques is beyond the scope of this document.

2.4.3. NeOn approach for ontology localization

To enable the smooth creation of multilingual ontologies, the authors of the NeOn project have developed a method for guiding users in the localization of available ontologies (Philipp Cimiano, 2010). Additionally, they provide a supporting tool, the Label Translator NeOn plug-in, which enables semi-automatic localization of ontologies. The whole process is illustrated in the diagram below.

Fig 2.4 Ontology localization process of a NeOn approach. (Elena Montiel-Ponsoda)

2.4.4. Performance of Protégé in supporting multilingual ontologies

In Protégé, translation can be done either manually or automatically. For earlier Protégé versions (3.x), the OntoLing plugin was developed to enrich ontologies linguistically on the Semantic Web. However, OntoLing is not compatible with the current version of Protégé, and no newer linguistic plugin has been developed. Instead, the current version 4.x uses the rdfs:label annotation property on classes to give different translations of the name of the class (protege multilingual ontologies, 2012).
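The label-based approach described above can be sketched in Python as a table of language-tagged labels per class. This is a minimal illustration under our own assumptions, not Protégé's internal representation: the class identifiers, the `ex:` prefix and the helper names are hypothetical. The last function prints the labels as an rdfs:label statement in Turtle syntax.

```python
# Illustrative sketch only: language-tagged labels for ontology classes.
# Class identifiers, the ex: prefix and helper names are hypothetical.

LABELS = {
    "Coffee": {"en": "Coffee", "am": "ቡና"},
    "Sesame": {"en": "Sesame"},  # no Amharic label attached yet
}


def label_for(class_id, lang, fallback="en"):
    """Return the label in `lang`, else the fallback language, else the identifier."""
    by_lang = LABELS.get(class_id, {})
    return by_lang.get(lang) or by_lang.get(fallback) or class_id


def to_turtle(class_id):
    """Serialize all labels of one class as a single Turtle statement."""
    literals = [f'"{text}"@{lang}' for lang, text in sorted(LABELS[class_id].items())]
    return f"ex:{class_id} rdfs:label " + ", ".join(literals) + " ."


print(label_for("Coffee", "am"))
print(to_turtle("Coffee"))
```

Because the class identifier stays the same and only the labels vary, adding a further language does not change the conceptual structure of the ontology.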

Fig 2.5. Bilingual pizza ontology on the Protégé 4.3 interface

As observed in the screenshot above, multiple labels can be attached to represent the name of a class in multiple languages. PizzaComQueijo is the Portuguese equivalent of the class name CheeseyPizza and is attached as a label with the language tag "pt"; note that "pt" is the ISO code for the Portuguese language. This label functionality is open to all languages that can be encoded in UTF-8, including Amharic (am).

At the beginning of this chapter we explored the layered cake architectural model of the Semantic Web. We then discussed the technologies that affect the implementation of the Semantic Web, including URI, RDF, OWL, and SPARQL. The major portion of the state of the art concerned ontology engineering, especially explicit ontology development methodologies and the identification of ontology localization approaches, with a specific check of the ontology editor as a multilingual ontology constructor, which is helpful to our project.


CHAPTER 3
CRITICAL ANALYSIS OF E-COMMERCE TECHNOLOGIES AND DOMAIN SPECIFIC ONTOLOGIES

"Make everything as simple as possible, but not simpler." Albert Einstein

The second chapter focused on the state of the art in the engineering and reengineering aspects of the Semantic Web and ontologies. This chapter extends that discussion with a related analysis of e-commerce product and service domain ontologies. At the end of the chapter, the author decides to model EEPS (Ethiopian export products) on a pattern for GoodRelations-compliant ontologies. The aims of this chapter are:

• To scan the recent trends in e-commerce supportive technologies
• To analyze perspectives of the Semantic Web for e-commerce
• To present a brief survey and usage analysis of GoodRelations (the standard and best-known e-commerce ontology).

3.1. The E-Commerce domain

As noted in the first chapter, the simplest definition of e-commerce is "buying and selling goods and services over the internet". Under this topic our focus is on the business-to-consumer (B2C) model. B2C e-commerce takes place between companies and consumers and involves customers gathering information or purchasing physical goods or information goods. E-commerce is highly subject to government regulation. For instance, in the United States of America, activities such as the use of commercial e-mail, online advertising and consumer privacy are regulated by law. Similarly, in Ethiopia a draft e-commerce law was prepared in 2012 and made ready for comment by the House of Peoples' Representatives. According to the B2C E-Commerce Worldwide Trends Report 2013 (Global B2C E-Commerce Trends Report 2013, 2013), one of the trends influencing worldwide growth in online sales is the concept of group buying. In the Middle East particularly, group buying and daily deals websites have boosted B2C e-commerce. Another factor, social media, is estimated to play an increasing role in the travel segment of the global B2C e-commerce market. According to the same report, online shopping is likely to become more personalized, with retailers customizing their services and integrating online sales channels such as websites and social networks on any device that can connect to the Internet. M-commerce is expected to play an ever larger role in the future, with over half a billion customers following the trend to shop via mobile devices by 2016. Moreover, throughout the world, online shoppers are forecast to increasingly prefer to pay online when buying over the Internet, causing the online and mobile payment markets to grow strongly, especially in Asia. 360buy.com, Amazon.co.uk, Amazon.com, Hp.com, Netflix.com, Rakuten.com, Samsung.com, Tmall.com and Walmart.com are the companies mentioned as actively participating websites in global B2C e-commerce in 2013 (Global B2C E-Commerce Trends Report 2013, 2013). According to the Africa Internet & B2C E-Commerce Report 2012, Ethiopia is not listed among the top 16 African countries rated by number of internet users, online shopping practice, internet market growth, or growth in B2C e-commerce revenue. Although modern e-commerce may encompass a wide range of technologies such as e-mail, mobile devices, social media, and telephones, it typically uses the World Wide Web at at least one point in the transaction's life cycle. As a result, electronic commerce, i.e. the ability to buy, sell, and advertise goods and services to customers and consumers, has become one of the major factors in the evolution of the Web. One component of B2C e-commerce, online shopping, has continued to boom, and e-commerce technology is vital for any organization that wants to use the web to sell products or services or simply to engage better with customers. In the next part of this topic we analyze and highlight the latest developments in B2C e-commerce.


3.2. Emerging supportive technologies for e-commerce

E-commerce must be supported by technologies in order to be more efficient, effective, secure and trusted. According to (Cândea, 2011), e-commerce supporting technologies can be divided into three major categories:

• Information preview or display technologies, such as the web, HTML, XML, and other information publishing and software development technologies.
• Information transmission technologies, which consist of EDI, TCP/IP, WAP, WLAN and Bluetooth technologies.
• Information processing technologies, which comprise commonly used technologies such as GPS, GIS, DSS, GDSS and IDSS.

Based on the observation of the author of this thesis, two additional categories, "information or knowledge representation" and "information storage" technologies, are added to the e-commerce technology stack. An e-commerce system is not a simple information processing activity. Rather, it is a large and complicated process executed in various steps in a life-cycle manner. Its knowledge must be represented in a structured way so that it is easily searchable and understood by a machine, a human expert or agents.

• Information or knowledge representation technologies, which include the Semantic Web and linked data technologies.
• Information storage technologies, comprising cloud computing technology and Big Data systems.

[Figure: e-commerce at the centre of a technology space made up of five categories: information display; information transmission (EDI, TCP/IP, WLAN, WAP, Bluetooth); information processing (GPS, GIS, DSS, GDSS, IDSS); knowledge representation (Semantic Web, linked data); information storage (cloud computing, Big Data analytics).]

Fig.3.1 Divergent technology entities to support ecommerce

The reader of this part of the thesis is assumed to be familiar with the state of the art of cloud computing, big data systems, TCP/IP, credit cards and other major global payment technologies. The discussion concentrates on the benefit of implementing each technology rather than on definitions, design issues and development history. E-commerce practice in Ethiopia is at an infant stage. Newly emerging e-retailers need clear guidelines and knowledge of these new technology trends, which will help businesses make strategic decisions and thus increase their sustainability and competitiveness. Under this topic (3.2), a study of our add-on knowledge representation technologies is presented. Research on and implementation of the other entities (except our add-ons) is beyond the scope of this document, so we have analyzed only Semantic Web technology, which belongs to the knowledge representation category. In addition, the concentration on knowledge representation for the e-commerce domain comes from the fact that the B2C e-commerce model acts as an interface between the customer or consumer and the business entities, so e-commerce knowledge should be clearly structured, represented, mapped, interlinked and modeled in a machine-understandable form.

3.2.1. Knowledge representation and semantic web technologies

The core of this thesis derives from one of the application areas of knowledge representation and Semantic Web technology in e-commerce. We view this e-commerce knowledge representation technology in two dimensions: semantic annotation of knowledge and linked data technology.

3.2.1.1. Semantic web technology for e commerce

The second chapter presented a review of what the Semantic Web is, its aim and architecture, the major Semantic Web technologies such as RDF, RDFS, OWL and SPARQL, and its engineering aspects in general. In this topic we explore how these technologies are implemented for e-commerce applications, either individually or in an integrated manner, to bring about a paradigm shift in e-commerce. The Semantic Web, or web of data, has been applied in different applications. One of the beneficiaries of this technology, next to the medical and e-government domains, is electronic commerce or e-business. Our discussion in this topic is based on answering questions such as:

• Which e-commerce activities can be automated by the Semantic Web?
• What benefits have been recorded by previous implementers, business firms, search engine companies, content management developers, etc.?
• What major improvements have been made in the research subject "Semantic web for e commerce"?
• How effective is it, and how far does it resolve the current research gap?

Semantic Web Support for both Business-to-Business and Business-to-consumer E-Commerce Lifecycles has been studied by different practitioners and academicians for almost a decade. Under this topic we traverse through the points how both consumers and venders can be affected by the semantics of ecommerce content. The evolution of the semantic web improvement on e41 | P a g e

commerce is highly related with creating web visibility of offers (products). When companies mark their products semantically it will be possible to deliver their offers to their clients in a transparent way, in a structured or clear form, with accessibility of top rated searches among heterogeneous information damped on the internet. Different product ontologies are created for a decade. Over 300 million searches are performed daily. Among these the abundant search queries are performed by the customers or consumers to buy, to view product items, price, brand etc. this is how e commerce is becoming a tremendous reason for growth of the internet. So Semantic web technologies give search engines the ability to search the web for the product that corresponds best to the specific needs of a certain user. The second importance of semantic web on ecommerce next to searching is its light weight reasoning capability through ontologies. This reasoning capability will allow search engines to display the aggregated information in a much more user-friendly way. Customers or consumers are highly beneficial entities of semantic web ecommerce. Previously only central portals like Amazon(by the way amazon did not use semantic annotation still) suggest offers and prices of some portion among the total percentage of product sell available online. So when customers view these portals, the product comparison irrespective of the model, the price and other features is limited from some companies only. To the contrary customers can access all product specification available on the semantic web if and only if the product metadata is available via RDF annotation. GoodRelations vocabulary and Ebsemantics, are among the indispensible ontology projects which created a paradigm shift towards e commerce on the semantic web and even on the traditional web by annotating them in HTML and XHTML web pages. 
GoodRelations, as its name indicates, means “a strong relationship between a business entity and a product or service”; it is variously called the GoodRelations vocabulary, GoodRelations schema or GoodRelations ontology. We have selected GoodRelations as the case study for the empirical analysis undertaken to discover its usage pattern, which forms the first part of the body of this thesis. GoodRelations is a popular ontology created by Prof. Martin Hepp (Hepp M. , Good Relations, 2008) that is expressive, simple and compatible with Semantic Web technologies. It has recently been recognized by Google, Yahoo, Bing, Yandex and other search engines and integrated into schema.org.

3.2.4.2. Linked data technology for ecommerce

Linked data and the Semantic Web are closely related technologies created in the same generation. Their relationship is that of part and whole: the former is part of the latter (linked data (LD) ⊂ Semantic Web (SW)). Facebook users may have observed the link saying “friends you may know”. Linked data is similar: it links pieces of knowledge that are similar in nature based on some metrics, so that a specific search space can be explored at a time. Linked Open Commerce (LOC) is the only existing linked open data version of the e-commerce data space available on the Semantic Web (refer to the first chapter to view the diagram). It is based on the GoodRelations vocabulary and is designed to capture or augment structured e-commerce information (company name, product information, price information, etc.) from the web and to make it available at a SPARQL endpoint. In linked data systems each object (a product or business name, for example) is identified by a single URI. This follows from the fact that linked data is built on RDF triple data sets. As a result, product-to-product, product-to-business, business-to-business and product-to-store connections become available over a search space. In our case, a framework for the publication of linked open data, covering primary product data and the business entities involved in the export business, is built on linked data principles. The domain-specific ontology model named “EEPS” is modeled in the fourth chapter with the Protégé OWL 2 editor.
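The idea that every product, business and store is a URI connected by RDF triples can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in this thesis: the example.org URIs and the link names soldBy, partnerOf and stockedAt are hypothetical placeholders.

```python
# Hypothetical namespace and link names, used only for illustration.
EX = "http://example.org/eeps/"

# Each statement is a (subject, predicate, object) triple of URIs.
triples = [
    (EX + "product/coffee", EX + "soldBy", EX + "business/exporter-a"),
    (EX + "product/sesame", EX + "soldBy", EX + "business/exporter-a"),
    (EX + "business/exporter-a", EX + "partnerOf", EX + "business/exporter-b"),
    (EX + "product/coffee", EX + "stockedAt", EX + "store/addis-1"),
]


def neighbours(uri, data):
    """Return every resource directly linked to `uri`, in either direction."""
    linked = set()
    for s, _, o in data:
        if s == uri:
            linked.add(o)
        elif o == uri:
            linked.add(s)
    return linked


def products_of(business_uri, data):
    """Return the products that point to a business through the soldBy link."""
    return sorted(
        s for s, p, o in data if p == EX + "soldBy" and o == business_uri
    )
```

Starting from one business URI, `neighbours` reaches its products and its partner business, which is the product-to-business and business-to-business connectivity described above.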
The objective of developing the linked data application framework, from the point of view of our project, is to make available all the metadata of export product data published bilingually (Amharic and English) in different domains by different organizations in different regions of Ethiopia, in a machine-readable format and in an interconnected way, so that this data can be further linked to Linked Open Commerce and to other public procurement and e-commerce related ontologies attached to open linked data such as DBpedia. As shown in the figure below, the linked open product domain is derived by following the linked data principles. Linked data can be either open or closed. Our focus is the adaptation and publication of open or public information within the e-commerce arena. Because e-commerce by nature needs security and a closed environment, enough time should be taken to decide on an appropriate application area that should be open to the community and linked to other open systems. In the end the author decided that the right orientation is the product domain, because the more the media (the internet in this case) talk about our products, the larger the export market will be. Our research contributes towards the delivery of full information about Ethiopian commodities everywhere, across every related domain, with an equal opportunity of being found by search.

[Figure: “Linked open EEPS” (our research penetration) derived from the linked open data (LOD) domains: media content (e.g. BBC news ontology), medical domain, Linked Open Commerce, e-government domain ontology and many other domains. Appropriate application needs: highly expressive, high semantics, highly interoperable, global interlinkage.]

Fig 3.2. Linked open product project derivation scenario

3.3. Ecommerce Domain Ontologies

We have reviewed a two-step evolution of ontologies designed for the automation of e-commerce and business processes. Before the invention of real ontologies, product and service categorization standards were developed and used across government and business organizations. After some years, the vision of knowledge conceptualization came to be supported by ontologies built on the promising foundation of these standards. We reviewed only two standards and three basic ontologies. The evolution can be summarized in two phases:

Phase I - product and service categorization standards (such as UNSPSC and eOTD), and

Phase II - product and services ontologies (unspscOWL, myOntology, eClassOWL, GoodRelations).

In simple terms, ontologies are vocabulary models that represent domain knowledge in a semi-structured or structured way using languages such as OWL. Although a vast number of conceptual models, vocabularies, schemas and ontologies are freely available, we have investigated only five selected e-commerce related standards or ontologies, built either fully or partially by different research groups, product companies, practitioners and individuals. These are: UNSPSC, unspscOWL, myOntology, eClassOWL and the GoodRelations ontology. Research on usage analysis and comparative study of e-commerce service ontologies is almost nonexistent. After evaluating all of the above projects, we selected one of the most widely accepted and popular e-commerce ontologies for the further empirical analysis carried out at the end of this chapter. The usage analysis (Ashraf J. , 2013) investigates how the conceptual model and the ontology components, such as classes, relationships, attributes and axioms, are being used to annotate data.

3.3.1. UNSPSC (United Nations Standard Products and Services Code)

...inadequate spending analysis capabilities are costing businesses $260 billion in missed savings opportunities annually. Aberdeen Group, “The Spending Analysis Benchmark Report: Dissecting a Corporate Epidemic”, January 2003

UNSPSC was created when the United Nations Development Programme and Dun and Bradstreet merged their separate commodity classification codes into a single open system. It is a practical business tool: an open standard taxonomy of products and services with a hierarchical tree structure, which enables “drill down” and “roll up” analysis of products (unspsc, 2013). The general purpose of UNSPSC is to enable collaborative commerce and responsiveness to the marketplace.

• It is a multilingual classification system translated into more than 10 languages.

• Its segments cover raw materials, industrial equipment, components and supplies, end-use products and services.

Design issues

The design of UNSPSC is hierarchical, composed of a four-level tree structure: Segment, Family, Class and Commodity. All category titles are unambiguous and mutually exclusive, and each category has a single parent. Each UNSPSC entry has code, title, definition and business function components. The coding system looks like the following example for printer toner.

Segment 44000000 Office Equipment and Accessories and Supplies
Family 44100000 Office machines and their supplies and accessories
Class 44103100 Printer and facsimile and photocopier supplies
Commodity 44103103 Toner
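The positional structure of the code in the toner example can be made explicit with a short Python sketch. The function names are our own illustrative choices; the sketch assumes the plain 8-digit form shown above, with two digits per level and zero padding for the higher levels.

```python
def parse_unspsc(code: str) -> dict:
    """Split an 8-digit UNSPSC code into its four two-digit levels."""
    if len(code) != 8 or not (code.isascii() and code.isdigit()):
        raise ValueError("expected an 8-digit UNSPSC code")
    return {
        "segment": code[0:2],
        "family": code[2:4],
        "class": code[4:6],
        "commodity": code[6:8],
    }


def parent_codes(code: str) -> list:
    """Return the segment, family and class codes above a commodity code."""
    parse_unspsc(code)  # validates the input
    return [code[:2] + "000000", code[:4] + "0000", code[:6] + "00"]
```

Calling `parent_codes("44103103")` walks from the Toner commodity up to its class, family and segment codes, which is the “roll up” direction of the hierarchy.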

So Office Equipment and Accessories and Supplies is the 44th segment among a total of 55 segments; Office machines and their supplies and accessories is the 10th family in the 44th segment; Printer and facsimile and photocopier supplies is the 31st class of the 10th family of the 44th segment; and Toner is a commodity under that 31st class. In terms of applicability, different organizations have implemented UNSPSC for strategic sourcing, purchase volume leverage, supplier performance tracking, budgeting, planning, contract compliance, inventory management and e-procurement (e-tendering). Many e-commerce and electronic tender websites have adopted the UNSPSC approach; the only difference is the number of hierarchy levels they use. The more hierarchy levels there are, the more specific the product retrieval is. For instance, amazon.com applies a hierarchy of four or more levels, while the biggest Ethiopian business portal (2merkato.com) uses a single-level product list.

3.3.2. unspscOWL

The RDF-S version of UNSPSC was released in 2002, but it neither properly reflected the specific semantics nor provided support for dealing with versioning dynamics. Both shortcomings in combination limit the usefulness of the RDF-S ontology version. This motivated the creation of an OWL Lite version of UNSPSC as a fully fledged products and services ontology. Due to licensing we could not get access to the full details of unspscOWL. A consistent unspscOWL ontology was derived from the UNSPSC standard products and services classification by applying the GenTax algorithm.

The GenTax algorithm is an approach for deriving consistent RDF-S and OWL ontologies from hierarchical classifications. It allows for the script-based creation of meaningful ontology classes for a particular context while preserving the original hierarchy, even if the latter is not a real subsumption hierarchy in this particular context (Hepp Martin, 2007). One key aspect of the approach is that it suggests using representative random samples of the overall classification for choosing appropriate modeling alternatives.
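The sampling idea can be illustrated with a small Python sketch. It is not the GenTax algorithm itself: it only holds a classification as child-to-parent edges and draws a reproducible random sample of those edges, so that a modeller could inspect by hand whether each sampled link is a true subsumption in the chosen context. The edge data reuse the UNSPSC toner branch; the function names, the fixed seed and the use of a simple random sample are assumptions made for illustration.

```python
import random

# Child code -> parent code, taken from the UNSPSC toner example.
SUPER_CATEGORY = {
    "44103103": "44103100",  # Toner -> Printer and facsimile and photocopier supplies
    "44103100": "44100000",  # -> Office machines and their supplies and accessories
    "44100000": "44000000",  # -> Office Equipment and Accessories and Supplies
}


def broader_terms(category, edges=SUPER_CATEGORY):
    """Follow the parent links from a category up to the root."""
    chain = []
    while category in edges:
        category = edges[category]
        chain.append(category)
    return chain


def sample_edges(edges, k, seed=0):
    """Draw k (child, parent) edges at random for manual inspection."""
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    pairs = sorted(edges.items())
    return rng.sample(pairs, min(k, len(pairs)))
```

Because the seed is fixed, two modellers who run `sample_edges` on the same classification inspect the same links.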

As a first assumption, they represented a hierarchical categorization schema as a directed graph where nodes represent categories and edges represent the “broader term” or “has super-category” relation. Depending on the context, a set is related to each category. This set represents the items associated with the category in a particular context. This holds for many hierarchical Knowledge Organization Systems, e.g. the directory structures on our computers or standardized products and services classifications like the UNSPSC.

3.3.3. myOntology

Although a vast number of ontologies have been developed on the Semantic Web platform, only a few are accepted by the wider community or are important. This bottleneck arises because the community has not been involved in supplying localized knowledge to the web in a way that needs little or no technical skill; only some interested groups were trying to conceptualize the world. Wiki technology (as in Wikipedia) was created to fill this gap. myOntology is a community-driven and collaborative vocabulary design and maintenance project for the e-commerce domain, based on wiki technology, which aims to make the vision of a Semantic Web for e-commerce and other application domains a reality. In general, myOntology contributed research to the field of collaborative ontology engineering and semantic wikis.

Design principles

The design principles of myOntology focus on large community grounding, ease of accessibility and the lightweightness of the ontology.

3.3.4. eClassOWL

eClassOWL was developed by the Digital Enterprise Research Institute (DERI), University of Innsbruck, and was originally initiated by Martin Hepp in 2003. A highly modified latest version, 6.1, now exists, but due to license issues only version 5.1.4 is publicly available for research purposes, so our discussion is based entirely on version 5.1.4. It is an OWL ontology for describing the types and properties of products and services on the Semantic Web (also known as the "Web of Linked Data"). eClassOWL is fully compliant with GoodRelations and is meant to be used in combination with the GoodRelations ontology for e-commerce (Hepp M. , Good Relations, 2008). It is based on eCl@ss, a sophisticated product and services categorization standard with more than 30,000 product or service types and more than 5,000 properties. eCl@ss includes the following main building blocks (Hepp, 2006):

1. products and services concepts (e.g. “TV Set”),
2. product properties (e.g. “screen size”),
3. values for enumerated data types,
4. a hierarchy of the product concepts reflecting the perspective of a buying organization,
5. recommendations on which properties should be used for which type of products, and
6. recommendations on which values are allowed for which (object) property.

Every eCl@ss standard data number is translated into three classes in the ontology (eClassOWL). For example, “AKK255002” is translated into one class for the generic concept “Agricultural Machine”, one for the taxonomy concept “Agricultural Machine as an eCl@ss Category”, and one class that is a subclass of both and can be used for the annotation of instances (Hepp Martin, 2006).

3.3.5. GoodRelations ontology (gr)

This ontology was created by Prof. Martin Hepp of the E-Business and Web Science Research Group in Germany and released in 2008 under the Creative Commons Attribution-No Derivative Works 3.0 Germany license. GoodRelations is a lightweight, generic vocabulary for the Semantic Web that allows expressing all typical aspects of offers for goods and services on the Web (Hepp M. , Good Relations, 2008). Through this vocabulary, manufacturers or digital webshop assistants can describe the exact meaning of their offers. It is designed as a complement to eClassOWL.
eClassOWL is the biggest product or service ontology; it provides classes, attributes and values for describing what a product or service is, while GoodRelations provides everything needed for describing the actual offer and its details, i.e., the relationship between a business entity and a product or service (Hepp M. , Good Relations, 2008). The GoodRelations ontology describes common business terms in the form of web resources, legal entities, service offerings, prices, and terms and conditions. It was developed by answering competency questions related to the location of service offers on the web, the availability of services in spatial and temporal dimensions, the eligibility of customers, payment options, delivery methods and tax calculations. It also includes concepts such as ProductOrServiceClass and QuantitativeValue and properties such as hasWarrantyScope and hasUnitOfMeasurement to match real-world business requirements. Hepp expressed the competency questions as semantic queries to be executed at a SPARQL endpoint to search the knowledge base. Since the GoodRelations ontology itself describes only a limited set of products, the Product Types Ontology (PTO) was created by the same research group to supply a large number of product descriptions. The Product Types Ontology provides high-precision identifiers for product types based on Wikipedia. It offers 300,000 precise definitions for types of products or services that extend the schema.org and GoodRelations standards for e-commerce markup. PTO supports reasoning based on OWL DL (description logic) (Hepp M. , product ontology).
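How an offer and one competency question fit together can be sketched in Python without any RDF library. The GoodRelations terms used here (gr:BusinessEntity, gr:Offering, gr:offers, gr:hasPriceSpecification, gr:UnitPriceSpecification, gr:hasCurrency, gr:hasCurrencyValue, gr:hasUnitOfMeasurement) come from the vocabulary; the example.org resources, the price figures and the helper functions are hypothetical, and a real deployment would run a SPARQL query against an endpoint instead.

```python
GR = "http://purl.org/goodrelations/v1#"
EX = "http://example.org/eeps/"  # hypothetical namespace for the sample data

triples = [
    (EX + "exporter-a", "rdf:type", GR + "BusinessEntity"),
    (EX + "exporter-a", GR + "offers", EX + "offer-1"),
    (EX + "offer-1", "rdf:type", GR + "Offering"),
    (EX + "offer-1", GR + "hasPriceSpecification", EX + "price-1"),
    (EX + "price-1", "rdf:type", GR + "UnitPriceSpecification"),
    (EX + "price-1", GR + "hasCurrency", "USD"),
    (EX + "price-1", GR + "hasCurrencyValue", 4.5),
    (EX + "price-1", GR + "hasUnitOfMeasurement", "KGM"),  # UN/CEFACT code for kilogram
]


def objects(subject, predicate, data):
    """Return all objects of the triples that match subject and predicate."""
    return [o for s, p, o in data if s == subject and p == predicate]


def offers_priced_at_most(limit, currency, data):
    """Competency question: which offerings cost at most `limit` in `currency`?"""
    hits = []
    for s, p, o in data:
        if p == "rdf:type" and o == GR + "Offering":
            for spec in objects(s, GR + "hasPriceSpecification", data):
                same_currency = currency in objects(spec, GR + "hasCurrency", data)
                values = objects(spec, GR + "hasCurrencyValue", data)
                if same_currency and any(v <= limit for v in values):
                    hits.append(s)
    return hits
```

The query walks from the offering to its price specification, which mirrors the separation GoodRelations makes between the offer and its price details.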


Ontology name | Goal | Modeling focus | Design principle or method | Language | No. of classes | No. of properties | Base technology or complement ontology
UNSPSC | Standard for product and services domain | Taxonomy of products and services | Hierarchical approach | Coding with numbers | 20,792 products | - | Classification
eCl@ss | Product classification and description standard | Products and services concepts, data types and properties | Mono tree structure | Coding with numbers + characters | 30,000 | 5,000 | -
unspscOWL | Product and service classification | - | GenTax algorithm | OWL Lite | 20,792 | - | UNSPSC
myOntology | Processing community-engineered ontologies | Lightweight community-driven ontology | Collaboration | - | - | - | Based on wiki technology
Product types (pto) | Taxonomy of products | Product types based on Wikipedia | An extension to GoodRelations | OWL DL | 300,000 product or service types | 15 | -
GoodRelations | Data structure definition for e-commerce | Description of business terms, i.e. legal entities, service offerings, prices, terms and conditions | Generic ecommerce vocabulary | OWL DL | 27 concepts (classes) | 92 | Complement to eClassOWL
EEPS | Semantic annotation of Ethiopian export products | Domain-specific products and services in export business | Bilingual ontology (a customized Methontology approach) | OWL2 DL | - | - | GoodRelations compliant

Table 3.1: Comparison of major standard vocabularies and ontologies developed for ecommerce domain

3.5. Pragmatic analysis of GoodRelations ontology usage on the web

The following keywords are prerequisites that the reader should understand before proceeding to this part.

• Yahoo SearchMonkey is a Yahoo! service which allows developers and site owners to use structured data to make Yahoo! Search results more useful and visually appealing, and drive more relevant traffic to their sites.
• Classes in an ontology are abstract groups, sets or collections of objects; equivalently, they are abstract objects defined by the values of aspects that act as constraints for membership of the class. Some examples of GoodRelations ontology classes include gr:Brand, gr:BusinessEntity, gr:BusinessEntityType, gr:License, gr:Location, gr:Offering and many more.
• Object properties are relations between any two classes (the domain and range classes), for example gr:height.
• Data properties are datatype properties of a class which give values for instances of that class. A data property is a relation between a class and a data type value (String, Integer, Float, etc.).
• A namespace is the full URI under which an ontology's terms are located on the Semantic Web.
• A prefix is a short abbreviation that stands for an ontology's namespace, keeping term names compact and unambiguous. For example, gr refers to GoodRelations and foaf refers to the Friend of a Friend ontology.
• Instantiation means that a term defined in an ontology is being used in different usage scenarios (e.g. semantic annotation, knowledge representation, Semantic Web applications) (Ashraf J. , 2013).

Search engines such as Google and Yahoo, and sites such as BestBuy, have adopted the GoodRelations ontology for search engine optimization, efficiency and accuracy. Google uses rich snippets, which embed more structured data in websites, to make pages more searchable and more easily accessible among all the information surrounding them. Snippets are the few lines of text that appear under every search result; they are designed to give users a sense of what is on the page and why it is relevant to their query. For example, if we search for vegetarian pizza on Google, the following result appears. The snippet for a recipe page shows the total preparation time, a photo, the recipe’s review rating and the number of calories in the recipe.


Fig 3.3 Google RDFs snippets example
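The namespace and prefix notions listed among the keywords of this section can be illustrated with a short Python sketch that expands a prefixed name such as gr:Offering into its full URI and compacts it again. The two namespace URIs are the published ones for GoodRelations and FOAF; the function names are our own.

```python
PREFIXES = {
    "gr": "http://purl.org/goodrelations/v1#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand(name, prefixes=PREFIXES):
    """Turn a prefixed name such as 'gr:Offering' into a full URI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {name!r}")
    return prefixes[prefix] + local


def compact(uri, prefixes=PREFIXES):
    """Turn a full URI back into a prefixed name when the namespace is known."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return uri
```

The prefixed form is what appears throughout this chapter, while the expanded URI is what is actually stored in RDF data.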

So marking up a web page with structured data in microdata or RDFa format will attract potential buyers while they are searching for items to buy on Google, and allows site owners to submit product listings for free and to control their product information. The accuracy and freshness of the product information can also be maintained (Ashraf J. a., 2011). In addition, Google has recently used gr:validThrough for price information, along with other properties. Not only Google but also Yahoo SearchMonkey already understands the GoodRelations vocabulary and displays price and offering details and other metadata of any e-commerce web page whose owner uses the free GoodRelations vocabulary. Our study found that Yahoo and Google currently include price, availability (Google only), description and product pictures drawn from GRO-annotated structured data as part of their enhanced search results. The goal of any website owner is to be the first result in a search engine. In e-business, if a website is not visible in search engines, the business loses many customers. To reach this goal, a website owner has to make the site more visible on the web. Research has shown that most searchers use the results that appear on the first results page without going on to the second or third page. Another factor that affects web visibility is giving a meaningful appearance to the results. In addition to adoption by the well-known search engines, more than 10,000 web pages now mark up their content with the GoodRelations ontology. The objective of citing the empirical analysis of GR is to investigate the community implementation of this ontology using different metrics and to present its current usage pattern on the web. (Ashraf J. a., 2011) studied GR usage in a paper entitled “Open eBusiness Ontology Usage: Investigating Community Implementation of GoodRelations” in 2010.
The paper was based on 105 publicly available data sets and recommended enlarging the dataset and repeating the experiment regularly because of the dynamic nature of e-commerce; for example, new products that need high expressivity may arise at different times. Later, in 2012, Jamshaid Ashraf studied the same topic more broadly as partial fulfillment of his PhD work (Ashraf J. , 2013). He extended the dataset to 211 for better research results. This also shows how much the usage of GR accelerated within a gap of a few years. Given the several business ontology choices, well-studied analysis results will help young Ethiopian retailers to understand the adoption level and to decide on:

• Which ontology should they use for an adequate description of their data?

• Which ontology or term is suitable for reuse?

• To what extent (general or specific) can they structure their web content?

Here we present the main empirical results of his investigation in order to understand the community implementation of the GoodRelations ontology, and finally we present our exemplary implementation of the GoodRelations vocabulary to describe major Ethiopian exportable products and services.

3.5.1. Conceptual Schema and Pivot Concepts

The latest version of the GRO ontology comprises 31 concepts (classes), 50 object properties, 44 data properties and 48 named individuals. From a high-level view, the GR model is based on three main concepts (Hepp M. , Good Relations, 2008), each focusing on a separate important aspect of the e-commerce domain. These three main concepts are Business Entity, Offering, and Product or Service, and each is discussed in detail in the following sections.

3.5.1.1. Business Entity

Hepp intended an instance of this concept to represent either a single business organization or a multi-branch organization. The gr:BusinessEntity concept represents a business organization (or any individual) which intends to offer or seek products on the Web. The main purpose of this concept is to provide the attributes needed to describe any business, such as the name of the company, its address, location, the vertical industry in which it operates and any other identifier which makes it uniquely identifiable on the Web. For large organizations that have multiple outlets or shop locations, GRO provides concepts (gr:Location and the deprecated gr:LocationOfSalesOrServiceProvisioning) to describe the shops or service centers through which products or services are provided. Each shop location has its own operating hours, which are described using the opening hours specification (gr:OpeningHoursSpecification).
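A business entity with its shop locations and opening hours can be sketched as a small triple generator in Python. The GoodRelations terms (gr:legalName, gr:hasPOS, gr:hasOpeningHoursSpecification, gr:hasOpeningHoursDayOfWeek, gr:opens, gr:closes) are part of the vocabulary; the example.org identifiers, the company name and the function itself are hypothetical and only illustrate the structure described above.

```python
GR = "http://purl.org/goodrelations/v1#"
EX = "http://example.org/eeps/"  # hypothetical namespace for the sample data


def describe_business(entity_id, legal_name, shops):
    """Return triples for one gr:BusinessEntity and its shop locations.

    `shops` maps a shop identifier to a (day, opens, closes) tuple.
    """
    entity = EX + entity_id
    triples = [
        (entity, "rdf:type", GR + "BusinessEntity"),
        (entity, GR + "legalName", legal_name),
    ]
    for shop_id, (day, opens, closes) in shops.items():
        shop = EX + shop_id
        hours = shop + "/hours"
        triples += [
            (entity, GR + "hasPOS", shop),
            (shop, "rdf:type", GR + "Location"),
            (shop, GR + "hasOpeningHoursSpecification", hours),
            (hours, "rdf:type", GR + "OpeningHoursSpecification"),
            (hours, GR + "hasOpeningHoursDayOfWeek", GR + day),
            (hours, GR + "opens", opens),
            (hours, GR + "closes", closes),
        ]
    return triples
```

Each additional outlet adds one gr:Location with its own opening hours, while the business entity itself is described only once.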


3.5.1.2. Offering

gr:Offering allows the description of a particular offering that a business entity makes or seeks on the Web. In the latest version, there are 15 datatype properties (all optional) available to describe offer details such as availability, validity, name and description of the offering. The name and description properties were added recently to let users know more about the offer itself. An offering can include one or more products, with a price specification expressible in any currency. It is also possible to attach supplementary details such as warranty promises, the customers who are eligible for the offer, shipment options and charges, and acceptable methods of payment.

3.5.1.3. Product or Service

GRO's main focus is to cover the conceptual model of an offering rather than to be a product ontology; other product ontologies are available independently. However, gr:ProductOrService and its sub-concepts can be used to describe a product and its qualitative and quantitative properties, which amounts to a lightweight product ontology.

3.5.2. GoodRelations on the LOD data space

According to Ping the Semantic Web, the GoodRelations namespace is used in more than 651,045 documents. Only 4 (1.36%) out of 295 on the LOD cloud are reported to have used GRO; on the other hand, PingTheSemanticWeb.com ranks GRO as the third most used ontology after FOAF and OWL. From the original results of the Sindice crawl, the following figures are observed for the GoodRelations vocabulary:

• 172,295,168 RDF triples that use the GR vocabulary

• 5,955,196 available graphs that use the GR vocabulary

• 399 available domains that use the GR vocabulary

By observing the structured e-commerce data landscape (while building the GRDS), Jamshaid Ashraf categorized GoodRelations data publishers into three groups, based on their publishing approach, usage pattern and data volume. These are:

A. Large Size Retailers
This group includes large online eRetailers and retailers who are traditionally premises-based and have only recently entered the eRetailing business. Such data sources provide more detailed (rich) offerings and product descriptions, which are useful for entity consolidation and interlinking with other datasets. Such companies include Volkswagen.com.uk, BestBuy.com, Overstock.com, Oreilly.com, and Suitcase.com, to name a few.

B. Web shops
A large number of semantic e-commerce adopters are small to medium web shops which offer their products and services mainly through web channels. They rely on plugins, modules or components designed for content management systems such as Joomla, Drupal, WordPress and Liferay. Most of these web shops use web content management packages such as Magento, Oxid-eSales, eCommerce13, osCommerce and Joomla VirtueMart, which add RDFa data to offer-related web pages.

C. Data Service providers (Data spaces)
Data space providers publish RDF e-commerce data as linked open data, for example Linked Open Commerce (LOD cloud).

3.5.3. Namespace Analysis in GRDS

Namespace analysis refers to the percentage usage of different namespaces in the GoodRelations data set (RDF data) collected from 211 websites. In total 48 namespaces were found in the dataset, but only the 22 most relevant namespaces are presented as follows (Ashraf J. , 2013). From this we can conclude that the most used vocabulary in the LOC data space is GoodRelations, which covers 97.16 percent.


Table 3.2: list of ontologies and their percentage in the GRDS.
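A percentage of this kind can be computed as the share of data sources in which a namespace occurs at least once. The Python sketch below shows that calculation on a toy input; the site names and prefix sets are invented for illustration and are not the GRDS data, and the exact counting rule used by (Ashraf J. , 2013) may differ.

```python
def namespace_usage(datasets):
    """Map each namespace prefix to the percentage of sources that use it.

    `datasets` maps a source name to the set of prefixes found in its data.
    """
    total = len(datasets)
    counts = {}
    for prefixes in datasets.values():
        for prefix in set(prefixes):
            counts[prefix] = counts.get(prefix, 0) + 1
    return {p: round(100.0 * c / total, 2) for p, c in counts.items()}


# Toy input, invented for illustration only.
toy = {
    "site-a": {"gr", "foaf"},
    "site-b": {"gr", "vcard"},
    "site-c": {"gr"},
    "site-d": {"foaf"},
}
usage = namespace_usage(toy)
```

Under this counting rule, a namespace found in 205 of 211 sources would come out at 97.16 percent.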

3.5.4. The Concept Usage analysis

Here we present the GoodRelations usage analysis made by recent research (Ashraf J. , 2013) based on conceptual usage metrics. The usage of the three pivotal concepts of GR (gr:BusinessEntity, gr:Offering and gr:ProductOrService) in the data set is discussed using summary tables.

i. gr:BusinessEntity analysis

In GRO, gr:BusinessEntity represents a business organization (or any individual) which intends to offer or seek products on the web. Of the 789,440 entities in their dataset, 54,542 are of the type gr:BusinessEntity. This means that 6.9% of the entities in the GRDS are of this type.

Table 3.3: Usage of gr:BusinessEntity class in the GRDS

ii. gr:Offering analysis

gr:Offering is the concept which enables business entities to publish their offers on the Web, either for selling or for buying products. An interesting finding was the use of different but related vocabularies to semantically describe offering-related information. Three vocabularies that supplement offering information, namely media, rev and comm, are included; however, two vocabularies that appear with the gr:BusinessEntity concept, vCard and schema, have been excluded. Another interesting finding is the use of product vocabularies to describe the products being offered; therefore, the use of different concepts defined in product ontologies can also be seen as part of the class usage. Next, the use of label properties by the entities of the gr:Offering type was analyzed. Of 61,330 entities, 11% used labeling properties, with the following distribution: Entity_fl = 4,171 (62% of these entities used formal labels) and Entity_dl = 2,610 (38% used domain labels) (Ashraf J. , 2013).

iii. gr:ProductOrService analysis

In GRO, a lightweight description of the products being offered is given through gr:ProductOrService and three of its sub-classes. In total, there are roughly 38,000 entities defined as ‘type of product’ (Ashraf J. , 2013). Since product-related concepts in GRO are arranged in a taxonomical hierarchy to allow users to specify the exact nature of the product being offered, the subsumption axiom is used to include all the instances belonging to the super concept.

Table 3.4: Usage of gr: ProductOrService (Ashraf J. , 2013)

Finally, in the observation part of the analysis, (Ashraf J. , 2013) ranked the usage of GoodRelations concepts as shown in the following table. The rank of each term is calculated by combining three aspects: the richness of the term in the ontology, the use of the term in the dataset, and the incentives based on the term’s acceptance by traditional search engines.


Table 3.5: GoodRelations Concepts Usage Analysis and their ranks (Ashraf J. , 2013) (CR = richness, CU = usage, and incentive measures)

The computation of the respective numbers for each measure is performed as follows.

• For the computation of incentives, three search engines, S = {Google, Yahoo, Bing}, are considered as the sources which recognize structured data on web pages, and particularly metadata using Web ontologies. Weights of 0.5, 0.3 and 0.2 are given to them based on their popularity.

• The weights for richness, usage and incentives are 0.3, 0.5 and 0.2, respectively, based on their relative importance.

• The numeric values of each measurement are calculated by accessing the knowledge base containing both the terminological statements (T-Box), to measure richness, and the assertion statements (A-Box), to measure the usage of a given ontology.



• For example, the CR value of gr:BusinessEntity in the table is calculated by querying the ontology graph, which returns 5 relationships and 9 attributes, giving 14 as the raw value of CR. For CU, the following SPARQL query returns 62,347 instances of the business entity type. Given that t = gr:BusinessEntity is a concept and its incentive value is 0.433, the normalized values are combined with their respective weights in the following equation.

Fig 3.4: SPARQL query to compute CU metrics value

Rank_t = (0.3 * 14/31) + (0.5 * 62,347/989,638) + (0.2 * 0.433) = 0.254

where Rank_t is the rank value for t, i.e. gr:BusinessEntity. Finally, as observed in Table 3.5, among the top-rated GRO concepts the 1st (gr:Offering), the 3rd (gr:UnitPriceSpecification) and the 5th (gr:ProductOrService) are well suited to, and reused in, our EEPS ontology model.
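The rank computation can be checked with a few lines of Python. The weights and the input figures are those quoted above; the parameter names richness_norm and usage_norm are our own labels for the two denominators (31 and 989,638), whose exact definition is given in (Ashraf J. , 2013) and is not restated here.

```python
# Weights for richness, usage and incentive as quoted in the text.
WEIGHTS = {"richness": 0.3, "usage": 0.5, "incentive": 0.2}


def rank(richness, richness_norm, usage, usage_norm, incentive, weights=WEIGHTS):
    """Weighted sum of normalized richness, normalized usage and incentive."""
    return (
        weights["richness"] * richness / richness_norm
        + weights["usage"] * usage / usage_norm
        + weights["incentive"] * incentive
    )


# Figures quoted for gr:BusinessEntity.
rank_business_entity = rank(14, 31, 62_347, 989_638, 0.433)
```

Rounded to three decimal places, `rank_business_entity` reproduces the value 0.254 given in the equation.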


CHAPTER 4
DESIGN and DEVELOPMENT of ETHIOPIAN EXPORT PRODUCTS ONTOLOGY (EEPS)

4.1. Introduction

Product information is an essential component of e-commerce. It contains information such as pricing, features and terms about the products traded between partners. Clearly defined product information in a structured form is a necessary foundation for business intelligence. Furthermore, semantically enriched product information may enhance the quality and effectiveness of business transactions. Ontology plays a vital role in this. In the preceding chapter we discussed the important aspects of the GoodRelations ontology and its usage analysis. Considering its acceptance and relevance to our research, our ontology is designed to be compatible with the GoodRelations ontology, which is a standard ontology for e-commerce. Additionally, we have reused the following ontologies, thesauri, classifications and taxonomies to construct our own EEPS (Ethiopian export products ontology). The related ontologies are used as they are, with their ontology namespace prefixes (e.g. gr:ProductOrService), whereas the related thesauri, classifications and taxonomies are used after being converted into RDF and OWL ontologies using the GenTax algorithm (Hepp Martin, 2007).

• GoodRelations ontology (gr): global e-commerce ontology
• vCard ontology: ontology of people and organizations
• Product Types Ontology (pto)
• Food ontology (food)
• eClassOWL
• Google products taxonomy

In this context eeps refers to the ontology of export products from Ethiopia only, as part of Ethiopian commodity exchanges. In total there are around nine specific export products selected for top-level modeling. These are Coffee, Livestock Products, Live Animals and Meat, Sesame (Oilseeds and Pulses), Chat, Vegetables and Fruits, and Flowers.
EEPS (Exported Ethiopian Products) is a product ontology for a vertical industry that extends and complies with GoodRelations. It is developed following the guidelines suggested in (The GoodRelations Wiki), namely: (1) defining all classes of products and services needed; (2) collecting all relevant properties for each class; (3) for every property, determining which type of property it is (either object or data type property); (4) collecting all relevant value types or predefined values. The development of the EEPS ontology adds value to the semantics of products and to search engine optimization. Our ontology model includes classes, definitions, properties, relationships, and axioms of the concepts that are fundamental to describing the product ontology (eeps). We have followed the METHONTOLOGY development approach (see Chapter 2). For ease of work we have customized METHONTOLOGY into four main phases, namely Specification, Conceptualization, Formalization and Implementation. The proposed localized methodology of this research is discussed as follows.

4.2. The proposed Methodology

The following diagram depicts the proposed methodology to achieve the EEPS ontology model. The journey is composed of three phases of development, i.e. the pre-development, development and post-development phases. The first phase is a requirements elicitation phase in which the need for the development of EEPS is clearly stated, while the second phase is the body of our work. During formalization, the conceptual structure pattern for GoodRelations-compliant ontologies (The GoodRelations Wiki) is reused. At the same stage, appropriate and related product classifications, product catalogues, and taxonomies are reused after they are converted into formal ontologies using the GenTax algorithm. In the final phase the developed system is evaluated with mechanized techniques: SPARQL querying to confirm whether the competency questions are answered, reasoning with the HermiT reasoner to check the consistency of the ontology, and finally the incorporation of comments offered by commodity exchange marketing domain experts.


(Diagram: EEPS Ontology. Pre-development stage: Specification, comprising scenarios, functional requirements, competency questions and scope determination. Development stage: Conceptualization (domain capture with UML class diagram); Formalization (apply GenTax algorithm; reuse of product classifications and taxonomies); Localization (word by word translation into Amharic concepts); Implementation (OWL 2). Post-development stage: Evaluation, comprising SPARQL query, domain expert evaluation, extensibility check, RDF data model validation, and reasoning or consistency check.)

Fig 4.1 The Proposed methodology

4.2.1. Specification phase

In this section, we develop the requirements for the EEPS ontology. The goal of this ontology requirements specification activity is to state why the ontology is being built, what its intended uses are, who the end users are, and which requirements the ontology should fulfill. We identified the intended users, motivating scenarios and competency questions. Throughout the development of the EEPS ontology, the activities of the specification phase are therefore:

• Capturing the scope of the ontology.
• Identifying the key users.
• Describing some motivating scenarios to spell out the requirements in a more descriptive way.
• Extracting, for each identified scenario, the key requirements in the form of competency questions (CQs) to develop the detailed ontology requirements specification.

4.2.1.1. Scope of the ontology

The emphasis of the EEPS ontology is to represent the selected export product items in a structured and machine-understandable manner, and in addition to depict some business entity information, such as the description of exporter associations and their respective activities.

4.2.1.2. Users and Motivating Scenarios

In the following, we identify the users of the EEPS ontology and describe a few typical examples of product offerings made on the Web.

User 1: Web shop application developers and publishers
Application developers or programmers want to explore available product ontologies that are closely related to their domain-specific products for shopping, so that they are encouraged to build a semantic website. They are interested in the following queries.
1. Which ontologies are frequently used, and what is their level of adoption? This information will help them in semantically annotating product information on their Websites.
2. Which ontologies are standard and which are not?
3. Is there an extension (module, component, plugin) for CMS software available for the standard product description ontology? This information will help them to develop their shops with CMS software in a simple manner without manually coding the ontology. For instance, "Rich Snippets and Semantic SEO with GoodRelations" is available as part of a VirtueMart extension.

User 2: Semantic Web researchers
Ontology engineers and semantic web practitioners are interested in the following points.
1. Which ontologies are used for product categorization and description, for the purposes of literature review and possible reuse?
2. How is the domain captured, and what methodology is used to construct EEPS?

User 3: End users or customers
End users want to use search engines to retrieve product information with their own query expressions, and they want it to be visible on the first pages of the search engine they use, complete with snippets (image, price information, quality level, etc.).

Motivating scenarios
In the following, we describe a few typical examples of Ethiopian Commodity Exchange (ECX) and export product offerings made on the Web, collected after a survey of different web shops.

Scenario 1: A Web resource represents (1) the product list, (2) a description of product characteristics and properties, and (3) the processing methods of the product.
Example: http://www.aletalandcoffee.com/products/index.html describes Lekempti Coffee Grade 4&5. This product is characterized by a fruity flavor and bright acidity. Beans are processed by the dry and wet methods.

Scenario 2: A Web resource (1) represents up-to-date market information for a particular commodity and its properties, including Symbol, High price, Low price, Delivery Center, stock position, production Year, Prvs Close and Volume (Ton), and (2) provides a complete or partial directory of business activities in the country.


Example: http://www.ecx.com.et/Home.aspx describes ECX coffee contracts whose symbol is WYCA3, delivered from DL (Dilla), produced in 2002, with a maximum price of 1000 birr per Feresula (17 kg) and a volume of 12.60 ton.

Scenario 3: A particular Web resource represents (1) the description of a particular range of products, determined either by product classes, quality standards, or property ranges, and (2) a concrete offer to provide a certain type of service for this range of products (e.g. washing, quality testing, packaging, coffee-processing and warehousing services).

Scenario 4: A particular Web resource represents the exporter's information, the location of its clients (importers) and the export commodity.
Example: http://www.dayebensacoffee.org/index.html describes Daye Bensa Coffee Export Plc., an exporter of Arabica coffee whose clients are found in the USA, Europe and Asia.

4.2.1.3. Functional requirements

Guided by the motivating scenarios listed above, we derived the following functional requirements for the EEPS ontology.
R1: The ontology must differentiate between offerings, business entities, and actual product or service instances.
R2: The ontology must give explicit knowledge of each business entity involved in the export procedure and of each export commodity, including type, color, characteristics, processing, market, service, delivery information and others.
R3: The ontology should be compatible with existing products and services ontologies, namely GoodRelations.
R4: The ontology should differentiate between companies which are involved in import, export or general business.
R5: For quantitative properties of product models and actual products, it must be possible to represent units of measurement (e.g. kg, ton, meter, etc.).
R6: Different prices for varieties of a similar product should be specifiable.
R7: The ontology should be based on current Semantic Web standards with mature tooling support, so that it can be implemented on current SW technology pillars.
R8: The ontology should support the annotation of a product in the local language, Amharic.

4.2.1.4. Competency questions

The developed ontology should answer the following questions, among others (the competency questions are not limited to these).
CQ1. List all coffee contracts of export_speciality_washed coffee.
CQ2. Which market imports which type | grade | or standard of a product via which export organization | association | person | private limited company?
CQ3. Which region's coffee has a taste of cocoa?
CQ4. List the names of all coffee | Livestock | Live Animals and Meat | Sesame | chat exporters.
CQ5. What are the addresses of each individual exporter?
CQ6. List exporting companies arranged by their major exported products.

4.2.2. Conceptualization Phase

The goal of this phase is to structure the partial Ethiopian commodity exchange and export trade domain knowledge in the form of a conceptual model. The necessary vocabularies and conceptual entities which represent the domain are identified. The steps followed in this phase are:
1) First, the relevant types of conceptual entities and relationships in the domain are outlined.
2) Second, we search for similar terminology in existing ontologies and identify similar concepts to include in our terminology set.
3) Finally, the hierarchical subsumption relationship between classes is presented and visualized in the form of an onto graph.

4.2.2.1. Domain Capture



eeps:Exporter

The seller of goods and services, referred to as an "exporter", ships the goods and services out of a port of the country to one or more clients outside Ethiopia. An exporter can be a private exporter or an association. An association has a similar meaning to vcard:Organization.

eeps:Importer

The foreign country or organization which purchases the Ethiopian commodity. This class has a gr:legalName and a vcard:Address.

eeps:ExportProduct

Primary goods produced within the country and shipped to another country for future sale or trade. The major export products, which are subclasses of this class and are classes of actual products that are similar in function or nature, are listed below.

eeps:Coffee

In this document it is the same concept as Coffea arabica, because Ethiopia is the producer and origin of Arabica coffee. It is the roasted and ground seeds of a coffee plant, with different types and characteristics. The classification of coffee is based on its origin: Sidamo coffee, Illubabor coffee, Bebeka coffee, Djimmah coffee, Yirgacheffe coffee, Limu coffee, Harer coffee and Lekempti coffee.

eeps:LivestockAndMeat

Domestic animals, such as sheep, hens, cattle or horses, raised for home use or for profit, especially on a farm.

eeps:HideSkinAndLeather

As used in the leather industry, "hide" refers to a whole pelt from one of the larger animals (cattle, horse, etc.), in contrast to "skin", which is the pelt of a young or small animal (sheep, calf, goat, etc.).

eeps:SesameAndPulse

Sesame is a flowering plant in the genus Sesamum which is considered to be the oldest oilseed crop known to humanity, while Pulse is a general category of commodities such as beans, peas, lentils, etc. This class has two subclasses named sesame and pulse. Sesame has a variety of types such as the Humera type, Gonder type, Wellega type and Reddish type.

eeps:FruitAndVegetable

This class includes fruits and vegetables exported to the European market, such as Strawberry, Fresh green beans, Cherry Tomato and Capsicum, and fruits exported to neighboring and Middle East markets, such as Banana, Orange, Avocado, Mandarin, Papaya, Pineapple, Guava, Grapevine Grape Fruit, Lime, Lemon, Prim, and Apple.

eeps:Flower

Roses are the most widely produced variety of flowers. Other types of flowers currently in production include gypsophila, hypericum, limonium, chrysanthemum, carnations, statice and pot plants. Some export flower classifications include Lovely Jewel, Candid prophyta, carnation, Lilium, Solidago, gladiolus, sonrisa, fresh jasmine flower buds, California Flowers, and Floral Images.

eeps:Chat

Chat (Catha edulis, Khat or Qat) is an evergreen plant commonly used for mastication and for its sympathomimetic effect.

gr:Offering

An Offering represents the announcement by a gr:BusinessEntity to provide a certain Business Function for a certain Product or Service Instance to a specified target audience. This concept is reused from the global GoodRelations vocabulary.

gr:BusinessFunction

The Business Function specifies the type of activity or access offered by the Business Entity on the Product or Services through the Offering. Business functions can be obtaining Export Authorization Certificate, make Payment, Provide Service, customs formalities such as VAT registration, sales, marketing, customer service, sales promotion etc. 

eeps:MakePayment is any payment made between the exporter and the importer before or after product delivery.



eeps:ExportService

This concept indicates that the Business Entity offers to provide the type of Service. For example, some of the services given by exporters include maintenance, transportation (sea, air, land, internal waterway, space, and pipeline), packing and paperwork, branding and packaging, merchandise, freight, insurance, travel, royalties, license fees, and other services, such as communication, construction, financial, information, business, personal, and government services. All the above mentioned services are sub classes of eeps:ExportService.



eeps:contract

A contract is a standard agreement to buy or sell a specified commodity, detailing the amount, grade and price of the commodity and the date on which the contract will mature and become deliverable for purposes of trading on the exchange. Future contracts and spot contracts are types of contract. The export contract payment method (gr:PaymentMethod) can be hand delivery, fax, telex or post.

gr:DeliveryMethod

A delivery method is a standardized procedure for transferring the product or service to the destination of fulfillment chosen by the customer. Delivery methods are characterized by the means of transportation used, and by the organization or group that is the contracting party for the sending gr:BusinessEntity (this is important, since the contracted party may subcontract the fulfillment to smaller, regional businesses).

vcard:Organization- is an object representing an organization. An organization is a single entity, and might represent a business or government, a department or division within a business or government, a club, an association, or the like.



vcard:Group - is an Object representing a group of persons or entities. A group object will usually contain hasMember properties to specify the members of the group (exporters). (vCard Ontology, 2013)



eeps:GrowingRegion

The location where a member of eeps:ExportProduct is grown. Its instances are the primary production weredas or zones found in Ethiopia.

gr:Brand

A brand is the identity of a specific product, service, or business. foaf:logo is used for attaching a brand logo, and gr:name or rdfs:label for attaching the brand name, e.g. Ethiopian coffee brands such as "Ethiopian Fine Coffee", "Harar Ethiopian Fine Coffee", "Yirgacheffe Ethiopian Fine Coffee", and "Sidamo Ethiopian Fine Coffee".

gr:PriceSpecification

The superclass of all price specifications, such as the gr:DeliveryChargeSpecification, gr:PaymentChargeSpecification and gr:UnitPriceSpecification conceptual entities already defined in the GoodRelations ontology.

The words and phrases set in Batang font above are the major concepts of the EEPS ontology. Below, the major relationships between concepts or individuals are defined (in the next phase they are further split into object and data properties).

gr:qualitativeProductOrServiceProperty: This is the super property of all qualitative properties for products and services. All properties in product or service ontologies for which gr:QualitativeValue instances are specified are sub properties of this property.



gr:quantitativeProductOrServiceProperty: This is the super property of all quantitative properties for products and services.

eeps:hasProductionArea: is a binary relation between eeps:ExportProduct and eeps:GrowingRegion.

eeps:hasCollectionCenter: is a binary relation between eeps:Exporter and eeps:CollectionCenter, where eeps:CollectionCenter is a subclass of vcard:Organization.

gr:acceptedPaymentMethods: is the gr:PaymentMethod or methods accepted by the gr:BusinessEntity for the given gr:Offering.

gr:appliesToDeliveryMethod: This property specifies the gr:DeliveryMethod to which the gr:DeliveryChargeSpecification applies.



eeps:hasGrade: This property is a binary relation between subclasses of eeps:ExportProduct and different grade instances. For example, eeps:Coffee has grade values numbered 1-9, based on the combined results of physical and taste values. This property shows the quality of a product.

eeps:hasMarketCountry: is a property which relates the eeps:Exporter with the eeps:Importer. This property specifies the geo-political region or regions for which the offer is valid, using the two-character version of ISO 3166-1 (ISO 3166-1 alpha-2) for countries or ISO 3166-2, which breaks down the countries from ISO 3166-1 into administrative subdivisions (Hepp M., Good Relations, 2008). It is the same as gr:eligibleRegions.



eeps:hasVolume is the same as gr:eligibleTransactionVolume. This property can be used to indicate the transaction volume, in a monetary unit, for which the gr:Offering or gr:PriceSpecification is valid.

gr:weight is the weight of the gr:ProductOrService. Typical unit code(s): GRM for gram, KGM for kilogram, LBR for pound.



eeps:hasIngredient: This represents the composition elements of a gr:ProductOrService. It is a sub property of gr:qualitativeProductOrServiceProperty.



vcard:Address: specifies the address of an eeps:Exporter, including phone number, email, fax and others. For this, typical address standards are taken from the vCard ontology.



vcard:Email: is used to specify the electronic mail address for communication with the gr:BusinessEntity object. We use the vcard:hasEmail object property. It is the subclass of the vcard:Address class.

gr:isSimilarTo: This states that a given gr:ProductOrService is similar to another product or service.

gr:hasBrand: This specifies the brand or brands (gr:Brand) associated with a gr:ProductOrService, or the brand or brands maintained by a gr:BusinessEntity.



gr:hasCurrencyValue: This property specifies the amount of money for a price per unit, shipping charges, or payment charges. The currency and other relevant details are attached to the respective gr:PriceSpecification etc.



gr:hasMaxCurrencyValue: This property specifies the upper bound of the amount of money for a price range per unit, shipping charges, or payment charges.

eeps:hasScientificName: This property specifies the scientific name of instances of eeps:ExportProduct sub classes.



vcard:hasTelephone - Specifies the telephone number for telephony communication with the object. It is a sub class of vcard:Address.

eeps:hasHarvestperiod: This property specifies the period of production for an eeps:ExportProduct.

gr:datatypeProductOrServiceProperty: This property is the super property for all pure data type properties that can be used to describe a product or service instance.

gr:color: is the sub property of gr:datatypeProductOrServiceProperty which describes the color of a product.

vcard:hasCountryName - Used to support property parameters for the country name data property.



eeps:hasFlavour: specifies the distinctive taste of a product



eeps:hasBodyLevel: This property specifies the size of the commodity instance which is used as grading criteria.



gr:legalName: The legal name of the gr:BusinessEntity.



eeps:hasAltitude: Specifies the altitude of the eeps:GrowingRegion where the product is grown.



gr:hasCurrency- The currency for all prices in the gr:PriceSpecification given using the ISO 4217 standard (3 characters).



gr:vatID - The Value-added Tax ID of the gr:BusinessEntity.



eeps:Shape: This property specifies the shape of an individual product.



eeps:hasShipmentPeriod- Period taken by a product delivery to reach its intended destination. Together with order time, it constitutes the elapsed time between a requisition and the item's availability.



vcard:CountryName - is the country name associated with the address of the object
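Requirement R5 and the gr:weight property rely on unit codes so that quantities stated in different units remain comparable. The following is a minimal sketch of that idea, restricted to the three unit codes named above (GRM, KGM, LBR). The function names `to_kilograms` and `same_weight` are hypothetical helpers introduced for illustration and are not part of GoodRelations or EEPS.

```python
# Conversion factors to kilograms for the unit codes listed under gr:weight.
KILOGRAMS_PER_UNIT = {
    "GRM": 0.001,       # gram
    "KGM": 1.0,         # kilogram
    "LBR": 0.45359237,  # pound (international avoirdupois pound)
}


def to_kilograms(value, unit_code):
    """Convert a weight given with a unit code into kilograms."""
    try:
        factor = KILOGRAMS_PER_UNIT[unit_code]
    except KeyError:
        raise ValueError(f"unsupported unit code: {unit_code!r}") from None
    return value * factor


def same_weight(first, second, tolerance=1e-9):
    """Compare two (value, unit_code) pairs after normalising to kilograms."""
    return abs(to_kilograms(*first) - to_kilograms(*second)) <= tolerance


# A 17 kg lot stated once in kilograms and once in grams.
print(same_weight((17, "KGM"), (17_000, "GRM")))
```

Normalising to a single base unit before comparing keeps the ontology free to store whatever unit the data source used.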

To sum up, in the conceptualization model the concepts and entities within the export business product and service domain were identified after a thorough study of different information sources such as websites, and finally the relationships between individuals were identified. Instances and some sub classes are omitted to limit the size of the document. All concepts have a prefix such as eeps, gr, vcard or foaf. Here eeps refers to the Ethiopian Export Product and Services ontology, gr refers to the GoodRelations ontology, vcard refers to the vCard ontology and foaf refers to the Friend of a Friend ontology. The namespaces of each vocabulary are cited in the implementation part. In a sense, the combined concepts are derived and reused from those ontologies. The expanded hierarchical (subsumption) view of the major concepts and individuals is visualized with the OntoGraf view (the graph visualization component of Protégé OWL) in Fig 4.2.

Fig 4.2 Partial OntoGraphical- hierarchical view of concepts

Note: In the graph above only classes (main concepts) and individuals or instances are visualized.

4.2.3. Formalization phase

In the conceptualization phase the important elements of OWL, i.e. classes, object properties, data properties, annotation properties and individuals, were not identified explicitly. In this phase, the conceptual model presented in the previous phase is formalized independently of the underlying ontology language and platform.

4.2.3.1. Classes

Classes are the main building blocks of an ontology. High-level product categories are mapped into classes (not anything too specific, for example a product name). Most proposed classes are categories of products or services that provide the same functionality or benefit, or that are of the same nature. gr:ProductOrService is taken as the super class of all available and future products, following the guidance of the pattern for GoodRelations-compliant ontologies. All qualitative and quantitative values of products are modeled as sub classes of gr:QualitativeValue and gr:QuantitativeValue respectively.


Fig 4.3 EEPS Classes


In OWL the class 'Thing' represents the set containing all individuals. Because of this, all classes are subclasses of Thing. In the first version of the EEPS ontology 49 named classes are captured. Some of them are disjoint classes, so that an individual (or object) cannot be an instance of more than one of these disjoint classes.
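The effect of a disjointness axiom can be illustrated with a small closed-world check. This is only a sketch of the idea, not how Protégé or HermiT implement it. It uses the four product classes that this chapter describes as mutually disjoint; the individual `eeps:mislabelledItem` and the function name `disjointness_violations` are hypothetical.

```python
from itertools import combinations

# Product classes treated as pairwise disjoint in this sketch.
DISJOINT_CLASSES = {
    "eeps:Coffee",
    "eeps:Chat",
    "eeps:Flower",
    "eeps:LivestockAndMeat",
}


def disjointness_violations(assertions):
    """Return (individual, class_a, class_b) for every clash.

    `assertions` maps an individual to the set of classes it is asserted
    to belong to.
    """
    violations = []
    for individual, classes in sorted(assertions.items()):
        clashing = sorted(classes & DISJOINT_CLASSES)
        for class_a, class_b in combinations(clashing, 2):
            violations.append((individual, class_a, class_b))
    return violations


assertions = {
    "eeps:freshJasmineFlower": {"eeps:Flower"},
    # Hypothetical modelling mistake: one individual typed with two
    # disjoint classes.
    "eeps:mislabelledItem": {"eeps:Flower", "eeps:Coffee"},
}

violations = disjointness_violations(assertions)
print(violations)
```

A reasoner would report the same situation as an inconsistency in the ontology rather than as a list of tuples.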

Fig 4.4.Description of the eeps:LivestockAndMeat class

As observed in Fig 4.4, eeps:LivestockAndMeat is a sub class of eeps:ExportProduct. eeps:Coffee, eeps:Chat and eeps:Flower are disjoint with eeps:LivestockAndMeat. Therefore an individual cannot belong to more than one of these classes. For example, a flower type "freshJasmineFlower" cannot also be a chat type or a coffee type, and vice versa. Instances listed under the members icon are individuals of the said class.

4.2.3.2. Object Properties

Object properties in the EEPS ontology describe relationships between two individuals. Proper object property names begin with the prefix "has" or "is", so that an instance of a product class can be described in more detail using properties. E.g. a coffee product can be described by its taste or color. As per the guide explaining how ontologies describing product and service categories plus their attributes must be designed in order to be compatible with the GoodRelations ontology (The GoodRelations Wiki), all properties are grouped under three basic properties known as gr:datatypeProductOrServiceProperty, gr:qualitativeProductOrServiceProperty, or gr:quantitativeProductOrServiceProperty.

Fig 4.4 EEPS object properties

Every property has a specified domain and range. A property links individuals from the domain to individuals from the range. For example, the object property hasMarketCountry has domain eeps:Exporter and range eeps:Importer. This means individuals from the class eeps:Exporter are linked to individuals in the class eeps:Importer. Once the necessary domain and range classes are identified, property characteristics such as functional, inverse functional, transitive, symmetric, asymmetric, reflexive and irreflexive are applied to some properties.

4.2.3.3. Data type properties

Data type properties are properties that link an individual to an XML Schema data type value or an RDF literal. In other words, they describe relationships between an individual and data values. The 14 data type properties listed below are identified in the first version of the EEPS ontology.

Fig 4.5.EEPS data type properties
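The domain and range declared for eeps:hasMarketCountry in Section 4.2.3.2 can be illustrated with a small check. Note the simplification: in OWL, domain and range are axioms used for inference (asserting the property lets a reasoner infer the types), whereas this sketch treats them as closed-world validation rules. The individuals and the function name `check_assertion` are hypothetical.

```python
# Domain and range of the object property, as stated in Section 4.2.3.2.
PROPERTY_SIGNATURES = {
    "eeps:hasMarketCountry": ("eeps:Exporter", "eeps:Importer"),
}


def check_assertion(subject, prop, obj, types):
    """Return a list of problems for one object property assertion.

    `types` maps each individual to the set of classes asserted for it.
    """
    domain, range_ = PROPERTY_SIGNATURES[prop]
    problems = []
    if domain not in types.get(subject, set()):
        problems.append(f"{subject} is not typed as {domain}")
    if range_ not in types.get(obj, set()):
        problems.append(f"{obj} is not typed as {range_}")
    return problems


# Hypothetical individuals used only for this illustration.
types = {
    "eeps:exporterA": {"eeps:Exporter"},
    "eeps:importerB": {"eeps:Importer"},
    "eeps:regionC": {"eeps:GrowingRegion"},
}

ok = check_assertion("eeps:exporterA", "eeps:hasMarketCountry", "eeps:importerB", types)
bad = check_assertion("eeps:exporterA", "eeps:hasMarketCountry", "eeps:regionC", types)
print(ok)
print(bad)
```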


Fig 4.6 Data type property assertion for Afework International Group

In Fig 4.6 the individual Afework International Group is a member of the class "PrivateCompany". This company has a P.O. Box, 101531, which is a value of the XML Schema integer data type. In the left panel of the diagram, the statement "IsExporterOf some eeps:Coffee" is called an existential restriction, meaning that this company must export at least one type of coffee.

All in all, a UML diagram is used as a domain capturing tool to understand the domain, illustrating the interplay between the EEPS languages. OWLGrEd, a UML-style graphical editor for OWL ontologies that interoperates with Protégé, is used for this. With this tool the whole ontology is visualized at a glance instead of scrolling through the text code. Since the model is big (especially the large number of individuals), it is not legible at A4 paper size. In this diagram, classes, ontology fragments, objects, equivalent classes, disjoint classes, datatypes, annotation properties, data restrictions, and different types of connections such as 'instance of' are clearly visualized.


Fig 4.7. UML Class Diagram for EEPS


4.3. Implementation phase or Ontology coding

In this phase the formalized model presented in the preceding formalization phase is coded in a formal web ontology language (OWL 2). Protégé version 4.3 is used to edit the ontology and to generate the RDF, RDFS and OWL 2 code. During implementation all the necessary ontology prefixes and URIs are predefined. The full RDF/OWL code is too long to attach to this document. Here is a sample fragment of RDF/OWL 2 code.

]>



...


4.4. Bilingual ontology development approach

Bilingual ontology construction can mean two things. (1) We may want an ontology in which the concepts can be presented to both an English reader and an Amharic reader, but with only one base set of concepts. In this case the identity of the concept is expressed as a URI, for example:. There is only one concept denoting coffee, and the way to present it in either language is to add labels and comments through the annotation properties of OWL, since adding comments and labels is a basic facility in RDF.

rdfs:label "coffee"@en ;
rdfs:label "ቡና"@am ;
rdfs:comment "Denotes the class of all coffees"@en ;
rdfs:comment "ሁሉንም የቡና አይነቶች ያጠቃልላል"@am .

(2) Alternatively, bilingualism can mean having one ontology that contains concepts drawn from both languages, which is harder to achieve in this research, because defining a single ontology that merges concepts from both world-views can be difficult. Our focus and development approach is the first one. Amharic is registered in the language subtag registry as:

Type: language
Subtag: am
Description: Amharic
Added: 2005-10-16
Suppress-Script: Ethi

Therefore, to construct the EEPS Amharic ontology in Protégé:
1. We enter all the classes, relationships, individuals and values based on the model.
2. We choose the annotations property, use the label attribute, and fill in the ISO symbol of Amharic, i.e. 'am', in the language text box. The letters can also be observed in the OWL code.
3. We do this for all concepts manually.
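The first approach (one concept, several language-tagged labels) can be sketched as a simple lookup. This is an illustration under stated assumptions, not the Protégé or OWL API mechanism: the labels for eeps:Coffee are taken from the example above, the eeps:Flower entry with only an English label is a hypothetical case of an untranslated concept, and the function names are invented for this sketch.

```python
# Language-tagged labels keyed by concept, mirroring rdfs:label "..."@lang.
LABELS = {
    "eeps:Coffee": {"en": "coffee", "am": "ቡና"},
    "eeps:Flower": {"en": "flower"},  # hypothetical: Amharic label not yet added
}


def label_for(concept, language, labels=LABELS, fallback="en"):
    """Return the label in `language`, else the fallback label, else the name."""
    by_language = labels.get(concept, {})
    if language in by_language:
        return by_language[language]
    if fallback in by_language:
        return by_language[fallback]
    return concept


def missing_translations(labels, required=("en", "am")):
    """List, per concept, the required languages that have no label yet."""
    report = {}
    for concept, by_language in labels.items():
        missing = [lang for lang in required if lang not in by_language]
        if missing:
            report[concept] = missing
    return report


print(label_for("eeps:Coffee", "am"))
print(missing_translations(LABELS))
```

Because step 3 above is manual, a coverage report of this kind is one way to spot concepts that still lack an Amharic label.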


CHAPTER 5 EVALUATION OF OUR WORK

The developed ontology data model is tested and validated based on four parameters.

5.1. Reasoning based on Description Logics

The HermiT reasoner is an OWL reasoner based on a novel "hypertableau" calculus which provides much more efficient reasoning than any previously known algorithm (HermiT OWL Reasoner). It supports SWRL rules, and it supports TBox (terminology box) and ABox (assertion box) reasoning over classes and individuals respectively. Since the developed ontology is based on OWL API 3.4.2, HermiT 1.3.7 can handle DL Safe rules, and the rules can be added directly to the input ontology in the OWL syntaxes supported by the OWL API. During reasoning, we checked two things via the reasoner service.
1. Subsumption checking tests whether or not one class is a subclass of another class. By performing such tests on the classes in an ontology, a reasoner can compute the inferred ontology class hierarchy. During testing, the bilingual nature of the ontology does not affect our result.
2. Consistency checking: based on the description (conditions) of a class, the reasoner checks whether or not it is possible for the class to have any instances. A class is deemed to be inconsistent if it cannot possibly have any instances.
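The two checks can be illustrated on a toy class hierarchy. This is a deliberately simplified sketch: it follows only asserted subclass links and declared disjoint pairs, and it is not the hypertableau procedure HermiT uses. The class eeps:CoffeeFlower is a hypothetical faulty class added to show an unsatisfiable case; the function names are assumptions of this sketch.

```python
def inferred_superclasses(cls, subclass_of):
    """Collect every ancestor reachable through asserted subclass links."""
    seen = set()
    stack = list(subclass_of.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(subclass_of.get(parent, ()))
    return seen


def is_subclass(sub, sup, subclass_of):
    """Subsumption check: is `sub` the same as, or below, `sup`?"""
    return sub == sup or sup in inferred_superclasses(sub, subclass_of)


def unsatisfiable_classes(subclass_of, disjoint_pairs):
    """Classes that fall under both members of some disjoint pair."""
    result = set()
    for cls in subclass_of:
        lineage = inferred_superclasses(cls, subclass_of) | {cls}
        if any(a in lineage and b in lineage for a, b in disjoint_pairs):
            result.add(cls)
    return result


SUBCLASS_OF = {
    "eeps:Coffee": ["eeps:ExportProduct"],
    "eeps:Flower": ["eeps:ExportProduct"],
    "eeps:ExportProduct": ["gr:ProductOrService"],
    # Hypothetical faulty class: placed under two disjoint classes.
    "eeps:CoffeeFlower": ["eeps:Coffee", "eeps:Flower"],
}
DISJOINT_PAIRS = [("eeps:Coffee", "eeps:Flower")]

print(is_subclass("eeps:Coffee", "gr:ProductOrService", SUBCLASS_OF))
print(unsatisfiable_classes(SUBCLASS_OF, DISJOINT_PAIRS))
```

In Protégé such a class would be shown under owl:Nothing after classification.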

Fig 5.1 Classification result

As can be observed from the screenshot in Fig 5.1, no inconsistency is observed in the ontology. HermiT classified the EEPS ontology in 260 ms, which is very fast relative to the size of the ontology. This indicates that the ontology is efficient to reason over. As a result, the inferred class hierarchy is generated as shown in the screenshot in Fig 5.2.

Figure 5.2 The Inferred Hierarchy Pane alongside the Asserted Hierarchy Pane after classification has taken place

5.2. Extensibility and reusability

The extensibility of an ontology is checked by looking at the generality of the classes defined in the ontology, while reusability is checked by looking at the applicability of these classes to other ontologies. Using these criteria, this ontology is found to be extensible, as it has general classes like gr:ProductOrService that can be extended to newly generated commodities, products, and services. It is also found to be reusable, as it has common classes like eeps:Coffee whose terms are well known and common all over the world, so that it can be used in other future ontologies.


5.3. RDF validation

The RDF data model is validated using the W3C RDF Validation Service. Around 700 RDF triples are generated, and no errors (such as missing properties, missing or improper type arcs, missing referents, inconsistent data, or values outside what is prescribed) are observed, as shown in the screenshot below. Only 7 triples are shown on the first page. A successfully validated RDF document is ready to be queried and to be reused in linked data systems.

Fig 5.3 RDF validation
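Two of the error types listed above, missing referents and missing type arcs, can be sketched as simple checks over a list of triples. This is only an illustration under stated assumptions: it does not reproduce how the W3C RDF Validation Service works, the triples are hypothetical sample data, and the rule that every subject must carry an rdf:type arc is an assumption introduced here.

```python
RDF_TYPE = "rdf:type"

# Hypothetical sample triples. The gr: terms are GoodRelations vocabulary
# terms; the eeps:Offer1 and eeps:Offer2 resources are invented.
triples = [
    ("eeps:SIDAMA_E", RDF_TYPE, "eeps:Coffee"),
    ("eeps:Offer1", RDF_TYPE, "gr:Offering"),
    ("eeps:Offer1", "gr:includes", "eeps:SIDAMA_E"),
    ("eeps:Offer1", "gr:hasBusinessFunction", ""),    # missing referent
    ("eeps:Offer2", "gr:includes", "eeps:SIDAMA_E"),  # subject has no type arc
]


def validate(triples):
    """Return a list of (location, message) pairs for the problems found."""
    errors = []
    # Subjects that have at least one well-formed rdf:type arc.
    typed = {s for s, p, o in triples if p == RDF_TYPE and o}
    for index, (s, p, o) in enumerate(triples):
        if not (s and p and o):
            errors.append((index, "missing subject, property, or referent"))
    # Assumed rule: every subject should be typed.
    for subject in sorted({s for s, _, _ in triples if s} - typed):
        errors.append((subject, "no rdf:type arc"))
    return errors


errors = validate(triples)
for location, message in errors:
    print(location, message)
```

A document that passes checks of this kind would return an empty error list, which corresponds to the "no errors observed" outcome reported for the EEPS data.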

5.4. SPARQL Querying

After validating the RDF document, it is possible to execute SPARQL queries to filter out individuals with specific characteristics. The queries we made are direct translations of the competency questions identified in the design phase. We used the SPARQL tab of Protégé 4.3 to do this. Only a simple example is presented here; many more queries can be created, and writing all of them is out of the scope of this document. The OWL API successfully returns the query results in the form of predicates. For example, we may want to get all instances (members) of certain classes.

It is observed that some types of EXPORT_SPECIALITY_WASHED COFFEE are listed at the bottom right of the screenshot (YIRGACHEFE_A*, SIDAMA_E, GELENA_ABAYA_B** and others).

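PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX gr: <http://purl.org/goodrelations/v1#>
# PREFIX eeps: (the EEPS namespace IRI is missing from the source text)
SELECT *
WHERE {
  ?subject rdfs:subClassOf ?object .
  ?instance a ?subject .
}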

Fig 5.4 SPARQL Querying members or elements of a class.
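The way the two triple patterns of this query are joined can be sketched in plain Python over a small in-memory triple list. This is an illustration only, not the Protégé SPARQL tab or the OWL API. The individuals are named in the text above (identifiers are simplified here, without the space and asterisks), while the class hierarchy used below is an assumption.

```python
RDF_TYPE = "rdf:type"            # what the SPARQL keyword 'a' abbreviates
SUBCLASS_OF = "rdfs:subClassOf"

# The individuals are mentioned in the thesis; the subclass axioms
# connecting the classes are assumed for this sketch.
triples = [
    ("eeps:EXPORT_SPECIALITY_WASHED_COFFEE", SUBCLASS_OF, "eeps:Coffee"),
    ("eeps:Coffee", SUBCLASS_OF, "gr:ProductOrService"),
    ("eeps:YIRGACHEFE_A", RDF_TYPE, "eeps:EXPORT_SPECIALITY_WASHED_COFFEE"),
    ("eeps:SIDAMA_E", RDF_TYPE, "eeps:EXPORT_SPECIALITY_WASHED_COFFEE"),
]


def members_of_subclasses(triples):
    """Join '?subject rdfs:subClassOf ?object' with '?instance a ?subject'."""
    rows = []
    for subject, p1, obj in triples:
        if p1 != SUBCLASS_OF:
            continue
        for instance, p2, cls in triples:
            # The shared variable ?subject must bind to the same class.
            if p2 == RDF_TYPE and cls == subject:
                rows.append(
                    {"subject": subject, "object": obj, "instance": instance}
                )
    return rows


rows = members_of_subclasses(triples)
for row in rows:
    print(row)
```

Because only asserted triples are matched, eeps:Coffee produces no rows in this sketch: it has no directly typed individuals, and no inference is applied before the patterns are evaluated.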


CHAPTER 6 CONCLUSION AND FUTURE WORKS

6.1. Conclusion

In this research, we have shown the limitations of the current standards that are used to describe product information on the web, especially for vertical industries. These limitations arise mainly because commodities are described using syntactic and semi-semantic descriptions. Some years ago, semantic description was used mainly for academic purposes in universities. Nowadays, the paradigm has shifted to industrial applications such as e-commerce. Since 2008, this paradigm shift has reached the e-commerce domain through the introduction of the GoodRelations vocabulary, one of the most powerful vocabularies for publishing all of the details of products and services in a way that is friendly to search engines, mobile applications, and browser extensions. In this research we have studied pragmatically how this vocabulary or ontology is conceptualized, designed, distributed, adopted and implemented on top of the semantic web technology pillars. In different countries, different product types may arise at different times. Because of this dynamic nature of products, GoodRelations was designed to be compatible with specialized vertical industries. Based on our research findings, it is suitable as a basis for designing our own ontology of the Ethiopian export products and services (EEPS) domain. The EEPS ontology is flexible and consistent, while moderate in size. The ontology is built from 49 named classes, 18 object properties, and 14 data type properties, while the number of individuals is more than a hundred. As more new products are added to the model, the number of ontology elements will grow. Our ontology can be reasoned over with any OWL reasoner; however, as the number of instances increases, classification will slow down. In addition to this, EEPS should be compatible with some pragmatic reasoning support for SPARQL.

EEPS is not only a product or service ontology; it also describes the business entities involved, the price specification of products and some procedures such as contracts, the taxation (VAT) system and other related processes within the export business. Our attempt to publish a bilingual ontology is satisfactory. It is known that Ethiopia has more than 30 types of exchange commodities suitable for the export business. In this research a few of them, such as coffee, chat, pulses, fruits and vegetables, are described. The ontology is designed to be scalable and extensible for future research.

The semantic web language used for modeling is OWL 2. OWL uses the Open World Assumption (OWA): if a statement cannot be proven true using current knowledge, we cannot conclude that the statement is false. In other words, we assume that we do not have all of the information, and that information which makes something true may or may not exist. For this reason, it is impossible to say with confidence that our ontology perfectly represents or semantically describes all export product items.
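The difference between the open and closed world assumptions can be made concrete with a minimal sketch. It is not how an OWL reasoner is implemented; the facts are hypothetical, and the explicit set of negative facts stands in for what a reasoner would derive from axioms such as disjointness.

```python
# Hypothetical knowledge base: facts known to hold and facts known not to hold.
asserted_true = {("eeps:SIDAMA_E", "rdf:type", "eeps:Coffee")}
asserted_false = {("eeps:SIDAMA_E", "rdf:type", "eeps:Pulse")}


def ask_closed_world(triple):
    """Closed world: anything that is not known to be true counts as false."""
    return triple in asserted_true


def ask_open_world(triple):
    """Open world: True, False, or None when the answer is unknown."""
    if triple in asserted_true:
        return True
    if triple in asserted_false:
        return False
    return None  # cannot be proven either way with current knowledge


# A statement about which the knowledge base says nothing.
unknown = ("eeps:SIDAMA_E", "eeps:isExportedTo", "eeps:Germany")
print(ask_closed_world(unknown))  # False
print(ask_open_world(unknown))    # None
```

Under the open world reading, the absence of a product or property from the ontology therefore says nothing about whether it exists in the export domain.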

6.2. Future work

In this work we proposed our idea of how to develop an ontology for the e-commerce domain that is compliant with the GoodRelations vocabulary. However, the major source of knowledge was websites. The involvement of domain experts is recommended instead, in order to get a better perception of the domain, to understand the relationships between entities, to capture the properties of each product or service and to explicitly include all instances. The developed ontology can be publicly acceptable if and only if the domain is captured as completely as possible, so that concepts, object properties, data type properties, annotation properties, axioms, individuals, and rules can be explicitly defined. The creation of special rules with the Semantic Web Rule Language (SWRL) is another necessary modeling element of a better ontology. We recommend that future research include SWRL and advanced SPARQL query formulation to show the semantic responses of the knowledge base. Given the time available for this research, only the ontology is modeled. To observe its effect on search engine optimization, a semantic web application for web shops should be developed in the future. At the beginning of the research it was planned to visualize the effect of our ontology in increasing product visibility on popular search engines such as Google. In doing this we faced two challenges. First, the ontology (RDF structured data) should be embedded into the source code of a commercial website, but we were unable to get the source code of a published online shop portal or website because of the owners' security concerns. Second, after the inclusion of the structured data, the Google indexer takes more than one month to publish and display snippets, and the time restriction of this research did not allow for this. We recommend that future scholars address these implementation issues.


Automatic mapping between English and Amharic terms should be constructed in order to publish a bilingual ontology; this is not explored in this research. A word sense disambiguation technique, or any other suitable technique, should be applied to localize concepts. Organizations have begun to publish Linked Data, so why not the Ethiopian government? Our final vision and recommendation is to publish linked open EEPS as a subset of linked open commerce (LOC), with a thorough implementation of the linked data principles.


Bibliography

Semantic Web Architecture. (2007). Retrieved August 2013, from Ontologies and Semantic Web: http://obitko.com
Google Metaweb: Semantic Open Linked Data Boost. (2010, July). Retrieved November 2013, from Mutual advantage: http://www.mutualadvantageupdates.com
Internet World Stats. (2010, June 30). Retrieved December 2013, from http://www.internetworldstats.com/stats7.htm
protege multilingual ontologies. (2012, September). Retrieved December 2013, from protege project: http://protege-project.136.n4.nabble.com
(2013). Retrieved from unspsc: http://www.unspsc.org/
Global B2C E-Commerce Trends Report 2013. (2013, October 2). Retrieved December 2013, from CNBC: http://www.cnbc.com/id/101080838
OWL Web Ontology Language. (2013, October). Retrieved January 2013, from w3c: http://www.w3.org/
protege. (2013, April 15). Retrieved December 2013, from protege: http://protege.stanford.edu/
vCard Ontology. (2013, September 24). Retrieved January 2013, from w3c: http://www.w3.org/TR/vcard-rdf/#classes
Semantic web. (2014, January). Retrieved January 2014, from W3C: http://www.w3.org/standards/semanticweb/data
Adida Ben, M. B. (2013). RDFa Core 1.1. MIT, ERCIM, Keio, Beihang.
Ali Ghobadi, M. R. (2011, April). An Ontology-based Semantic Extraction. The International Arab Journal of Information Technology, 8(2), 1-2.
Amjad Farooq, S. A. (2010). Engineering Semantic Web Applications by Using Object-Oriented Paradigm. Journal of Computing.
Ashraf, J. (2013). A semantic framework for ontology usage analysis. Curtin University, School of Information Systems, Curtin Business School., Curtin.
Ashraf, J. a. (2011, March). Open ebusiness ontology: investigating community implementation of goodrelations. CEUR Workshop Proceedings, 1-12.
Bryan, M. (2001). MULECO – Multilingual Upper-Level. The SGML Centre.

Buitelaar, P. (2007, August). Ontologies and Lexical Semantics in. 19th European Summer School on, (pp. 5-6). Ireland.
Cândea, C. (2011). SISTEME PENTRU COMERTUL ELECTRONIC. BUCUREŞTI.
David, T. (2013). OWL 101. Retrieved November 2013, from cambridge semantics: http://www.cambridgesemantics.com/semantic-university/owl-101
Dekdouk, A. (2010, January). Ontology-Based Intelligent Mobile Search. The International Arab Journal of Information Technology, 2010, 1-2.
E. MONTIEL-PONSODA, G. A.-P. (2009, August). Enriching Ontologies with Multilingual. Natural Language Engineering, 1(1), 1-2.
Elena Montiel-Ponsoda, M. E. (n.d.). Ontology Localization.
Exported Ethiopian Products. (n.d.). Retrieved January 2014, from Ethiopian exporters institute: http://www.ethiopianexporters.com/products.html
Grigoris Antoniou, F. v. (2008). A Semantic Web Primer (2nd ed.). London, England: The MIT Press.
Grilo, A. (2013, June 28). Intelligent Decision Making in the Era of. KES IDT-IMSS.
Hepp. (2006). eClassOWL 5.1 Products and Services Ontology for e-Business. 1-18.
Hepp Martin. (2006). Products and Services Ontologies: A Methodology for Deriving OWL Ontologies from Industrial. Int'l Journal on Semantic Web & Information Systems (IJSWIS), 2(1).
Hepp Martin, J. d. (2007). GenTax: A Generic Methodology for Deriving OWL and RDF-S Ontologies from Hierarchical Classifications, Thesauri, and Inconsistent Taxonomies. 115.
Hepp, M. (2008). GoodRelations: An Ontology for Describing Web Offerings. München.
Hepp, M. (n.d.). product ontology. Retrieved October 2013, from productontology: http://www.productontology.org/
HermiT OWL Reasoner. (n.d.). Retrieved January 2013, from HermiT OWL Reasoner: http://hermit-reasoner.com/
Jarrar, M. (2005). Towards methodological principles for ontology Engineering. Vrije universiteit, Faculty of science, Brussel.


Kingsley Idehen, M. H. (2009). Linked open commerce. Retrieved December 2013, from Linked open commerce: http://linkedopencommerce.com/
Li, L. (2002). A SOFTWARE FRAMEWORK FOR. University of Manchester, Computer Science, Manchester.
microsecommerce. (2012, April). Semantic Search: A Primer. Retrieved November 2013, from microsecommerce: http://blog.microsecommerce.com
Mohamed Morsey, J. L. (2011). DBpedia and the Live Extraction of Structured Data from Wikipedia. 4-6.
Philipp Cimiano, E. M.-P. (2010). A Note on Ontology Localization. Cognitive Interaction Technology Excellence Center, 1(0), 7-9.
PONSODA, E. M. (2009, August). Enriching Ontologies with Multilingual Information. Natural Language Engineering, 1-2.
Ponsoda, E. M. (2011). Multilingualism in Ontologies. Universidad Politécnica de Madrid E.T.S.I. Montes, Madrid.
Sabou, M. (2006). Building Web Service Ontologies. Dutch Graduate School for Information and Knowledge Systems.
Saraf, G. (2008, February 10). SEMANTIC WEB UNFOLDING THE UNDERLYING TECHNOLOGY. SEMANTIC WEB.
Tessema Mindaye, M. S. (2009). The Need for Amharic WordNet. 3-4.
The GoodRelations Wiki. (n.d.). Retrieved January 2014, from Good relations: http://wiki.goodrelations-vocabulary.org/Main_Page
YANG, K. (2006). A CONCEPTUAL FRAMEWORK FOR SEMANTIC WEB-BASED ECOMMERCE.
ZAHRA, D. (2000). THE STRATEGIC USE OF ICTs IN THE ETHIOPIAN TRADE SECTOR FOR IMPROVED INTEGRATION IN THE GLOBAL ECONOMY. Addis Ababa University, SCHOOL OF INFORMATION STUDIES FOR AFRICA, Addis Ababa.


Declaration

I, the undersigned, declare that this thesis is my original work and has not been presented for a degree in any other university, and that all sources of material used for the thesis have been duly acknowledged.

Declared by:
Name: Animaw kerie
Signature: ___________________________________________
Date: January 30th, 2014

Confirmed by advisor:
Name: DR. DURGA PRASAD SHARMA (DR.D.P.SHARMA) (Prof)
Signature: _________________________________________
Date: January 30th, 2014

Place and date of submission: Arba Minch University, January 2014.
