A multi-sources knowledge management system

Inaya Lahoud*, Davy Monticolo**, Vincent Hilaire*, Samuel Gomes***, Eric Bonjour**

* SET Laboratory, University of Technology of Belfort-Montbeliard, Belfort, France (e-mail: [email protected], [email protected])
** ERPI Laboratory, Polytechnic National Institute of Lorraine, Nancy, France (e-mail: [email protected], [email protected])
*** M3M Laboratory, University of Technology of Belfort-Montbeliard, Belfort, France (e-mail: [email protected])

Abstract: Nowadays the development of a product involves different types of actors (employees, managers, board of directors) who must be able to share knowledge and experience and to work together efficiently. Each actor has a professional specialty and uses one or several software tools (CAD, project management, PLM tools, etc.) dedicated to his specific skills. Each of these software tools produces different information sources (databases, XML files, text files) which are distributed through the enterprise network. We present in this paper a Knowledge Management System (KMS) which capitalizes the distributed and heterogeneous knowledge produced throughout mechanical product development projects, on the basis of these heterogeneous and distributed information sources.

Keywords: multi-sources knowledge management, ontology, semantic web service.

1. INTRODUCTION

During engineering activities, business actors use their own knowledge and expertise to develop new products. This business knowledge comes from information that is used in, and derived from, the data the actors create with their tools. The information is stored in different sources and distributed over different sites across the whole network of the extended enterprise, because the development of a mechanical product involves multidisciplinary teams (mechanical engineers, automation engineers, designers, methods engineers and technicians, etc.) (Chella et al., 2004). This information is heterogeneous, since it comes from different sources. It is also distributed throughout the enterprise network, since each business actor uses his own software tools on a workstation connected to the corporate network.

The problem of multi-source knowledge often arises today in companies that want to share parameters between different business applications or extract knowledge from different databases. Such companies need a KMS that can be applied to mechanical product development projects to free up time spent on routine engineering, encourage innovative engineering, and so increase productivity.

The approach proposed in this paper addresses this problem and is based on semantic web services, a technology that allows the manipulation of heterogeneous and distributed information. Web services facilitate interoperability among heterogeneous systems in distributed environments. However, although web services support the publication and discovery of applications, they do not allow these tasks to be automated. That is why they require a semantic description based on an ontology. In this context, an ontology, understood as a conceptualization of a given domain, provides a way to define the semantics of the elements of a web service description and of the way they are accessed. The combination of these two technologies constitutes the Semantic Web Service (SWS). We use one SWS in our KMS.

The knowledge life cycle in our system has four phases. In the first, business experts create ontologies. In the second, these ontologies are received and transformed into queries by our own transformation rules, which are based on model-driven engineering transformation techniques; the data returned by these queries is annotated and stored in RDF files. The third phase consists in evaluating the information through a semantic wiki, which allows knowledge to be structured and diffused and feedback to be obtained from business actors. In the last phase, business actors create a custom ontology.

The rest of the paper is structured as follows. Section 2 presents some existing knowledge management systems, multi-source knowledge management, and the application of the SWS approach to knowledge management. Section 3 presents our overall approach to knowledge management, based on the model of Maier, and explains our architecture for multi-source knowledge management using an SWS. Section 4 details the approach for defining the knowledge to be extracted through the OCEAN software (Lahoud et al., 2010). Section 5 presents an example of an ontology, its transformation into an SQL query, and the knowledge resulting from this query, which is stored in RDF files. Section 6 concludes the article.

2. BACKGROUND

2.1. Knowledge Management Systems

The introduction outlined why knowledge management matters to companies: it lets them identify, capitalize, organize and distribute their know-how. Alavi and Leidner (2001) explain that KMS are “IT (Information Technology)-based systems developed to support and enhance the organizational processes of knowledge creation, storage/retrieval, transfer, and application”. Maier (2002) expanded on the IT concept for the KMS by calling it an ICT (Information and Communication Technology) system that supports the functions of knowledge creation, construction, identification, capturing, acquisition, selection, valuation, organization, linking, structuring, formalization, visualization, distribution, retention, maintenance, refinement, evolution, accessing, search, and application. KMS use a variety of technologies designed to enhance knowledge storage and knowledge communication/transfer. Grundstein also proposed a knowledge management life cycle, divided into four facets: identify, formalize, value and update (Grundstein & Barthès, 1996). Section 3 presents our knowledge management system, which is based on the system of Maier because we found it to be the most complete system that fits our objective. Ultimately, we can consider our KM successful if knowledge is reused to improve organizational effectiveness by providing the appropriate knowledge to those who need it when it is needed (Jennex et al., 2007).

2.2. Multi-sources Information management

We have already explained that knowledge is derived from information stored in different databases, one for each business application in the various information systems (Monticolo et al., 2007), and that these databases are distributed over different sites across the whole network of the extended enterprise, which makes it very difficult to search for knowledge manually.

The management of multi-source information has been the objective of many research projects, such as TAMBIS (Stevens et al., 2000), which provides an integrated solution to the problem of disparate biological databases and analysis tools. TAMBIS relies on a common schema (the Biological Knowledge Base) represented in a description logic, which presents users with a rich description of the domain from which they can flexibly and intuitively construct and modify queries. Multi-source knowledge management remains a research aim in many domains, including agriculture, where Qiu and Yue (2010) present a semantic web approach to managing agricultural knowledge that is distributed, heterogeneous and polymorphic. These properties make it hard to retrieve useful information effectively, so the authors manage agricultural knowledge by means of an ontology.

The management of multi-source knowledge has thus been the subject of several studies over a long period, each in a different field such as biology. Our work also addresses heterogeneous and distributed knowledge management, but in the field of industry, where we need a system that enables the exchange of information between the different business applications used to design and develop a mechanical product. We present in this paper a complete model of a multi-source knowledge management system, together with an experiment.

2.3. SWS for multi-source knowledge management

SWS represent a new technology derived from the combination of two technologies: the Semantic Web (Berners-Lee et al., 2001) and Web Services (Booth et al., 2004). Combining them makes it possible to develop new applications that can handle heterogeneous and distributed information, such as parameters, through Web Services (Paolucci & Sycara, 2003) (Shafiq et al., 2006) and that take knowledge models into account thanks to the Semantic Web. McIlraith and Narayanan (2002) were the first to combine Web Services and Semantic Web languages to develop applications that take into account the definition of knowledge areas for users. Indeed, an SWS is designed and described on the basis of a semantic model (Agarwal et al., 2004). In the field of knowledge management, the SWS approach is used at Brunel University (Yang et al., 2008) to design enterprise web applications that provide a convenient and scalable way of sharing information and managing knowledge. The authors propose an SWS-oriented model in which resources and services are described by an ontology, which is an explicit specification of a conceptualization (Gruber, 1993), and processed through the SWS, enabling integrated administration, interoperability and automated reasoning.
In the same context, Nemirovskij et al. (2008) applied this approach to manage knowledge in the field of education. Their work aimed to model a platform for a common educational space based on the SWS approach and a framework for SOA deployment and processing. We can also cite the work of Che Cob and Abdullah (2008), who proposed an ontology-based SWS framework for a knowledge management system. Their framework consists of four main layers: User Interaction, Interface, Mediator, and Ontology.

Today an ontology developed in the OWL format can be integrated into a Semantic Web Service. This approach allows the service to detect all the information that must be extracted. The SWS thus becomes a tool to extract information from the software applications used by project teams, communicate with these applications, and ensure interoperability, that is, the exchange of information between business applications. The next section presents the overall MMSK architecture, based on the use of an SWS, which aims to manage multi-source knowledge in product development projects.

3. SYSTEMS PRINCIPLES

3.1. Systems hypothesis

Our KMS is based on the knowledge life-cycle phases identified by (Maier, 2002). These phases are illustrated in figure 1. The cycle begins with the definition of the knowledge needed for the development of a product: business experts create ontologies. Knowledge creation completes the first part, acquisition, and leads to the second part, the capitalization of knowledge. The main goal of knowledge identification is to define the knowledge needed to create the ontologies and the business applications from which this knowledge will be captured. Once we have the data resulting from querying the business applications, we place these data in a context by annotating them, which turns data into information. This information is collected in RDF files and stored in organizational memories (OM), which are available to all business actors of the community so that they can validate, evaluate and evolve it. Huber et al. (1998) summarize OM as the set of repositories of information and knowledge that the organization has acquired and retains.

Figure 1. Knowledge Management System approach

We consider that knowledge is information validated by the business experts. Knowledge is converted into RDF files, organized and also stored in the OM. This knowledge must be formalized to become exploitable. Finally, the knowledge stored in the OM can be diffused or shared by reusing it, presented through a semantic wiki, exploited for decision support, and reviewed by evolving and updating it in the semantic wiki.

3.2. MMSK architecture

As explained earlier, the knowledge that comes from different business applications and is used during product development projects should be capitalized so that it can be structured and reused to provide decision support to business actors.

We propose an architecture (figure 2), called MMSK (Management of Multi-Source Knowledge), based on four phases and one SWS, which operationalizes the knowledge life cycle given in figure 1. These four phases cover identification and formalization, validation and updating, manipulation, and exploitation and diffusion.

The first phase, “Knowledge Creation” (KC), provides an interface that helps business experts formalize their knowledge. Business experts create their ontologies according to their business domain (mechanical, ergonomic, acoustic, material, etc.). The KC module contains several ontologies for the different areas of the company. It also contains two other modules. The first manages the consistency between ontologies, also called ontology alignment; for example, it avoids sending synonymous or equivalent concepts to the KE module. The second supports the merging of ontologies: it enables business experts to merge several ontologies into one. The created ontologies are stored in OWL files at the heart of our system.

The second phase, which is based on the SWS and called “Knowledge Extraction - SWS” (KE-SWS), enables communication with the different business applications (PLM platform, PDM, CAD, calculation tool, requirements management tool, etc.). There are as many service connections (SC) as there are connections to business applications. Each SC requires a specific development based on the communication protocol of the business application. The first objective of an SC is to extract the information stored in databases or files. This module takes as input the ontologies created by business experts. Each ontology is transformed, by means of model-driven engineering transformations, into a specific query according to the connection protocol of each business application.
For instance, if the information sought by the ontology O1 is in the database of the PLM business application, an SQL query, defined by the transformation of O1, is generated to create the individuals of O1. It is an SQL query because the database of this application is managed by an SQL server, as shown in figure 2. Likewise, if the information produced by a business application is stored in Excel files, the transformation produces another type of query compatible with these Excel files. The data resulting from the interrogation of each business application is placed in a context by annotation and so becomes information. This information is stored in RDF files to build an RDF base, with one RDF file per query result.

The third phase, “Knowledge Validation” (KV), facilitates the transformation of information into knowledge. The information comes from the KE (Knowledge Extraction) service, which ensures the extraction from business applications. The mission of KV is to structure the information in organizational memories. This organizational memory is represented by a Semantic Wiki which was developed in a previous work

(Monticolo et al., 2011). Each actor can confirm, reject, or modify knowledge using the Semantic Wiki. Each validated or modified piece of information becomes knowledge. If a piece of information is rejected by the majority of business actors, it is destroyed and no longer appears in our knowledge base.

Figure 2. The overall architecture of MMSK (diagram labels: SWS-KC, SWS-KE, SWS-KM, SWS-KV, OWL, Query)

“Knowledge Sharing” (KS) is the fourth phase of our cycle. KS manages the search for and selection of knowledge. It allows a business actor to select a set of concepts from the ontologies defined by the experts in order to build his own knowledge base (ontology). Our experience within companies has shown us that a designer needs knowledge from different areas (materials, ergonomics, automation, etc.) in which he is not necessarily an expert. KS allows him to select, from the existing ontologies, the configuration of knowledge that will enable him to consult the information he needs. The business actor consults the ontologies created by the business experts, searches for the knowledge he needs, and then selects from one or more ontologies the concepts necessary for the development of his product. These concepts are merged to form a new custom ontology. This ontology is in turn transformed into a SPARQL query, which is executed on the RDF files stored in our database. At the end, the business actor receives the knowledge he has requested, presented in a web page.

The flows of information in MMSK use RDF files that contain the values of knowledge, their types and the annotations specifying their contexts. We detail the contents of the RDF files in the section illustrating our prototype. The next section describes how this SWS operates.
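The KS step described above, in which the concepts selected by a business actor are merged and turned into a SPARQL query, could be sketched as follows. This is only an illustrative sketch: the `Concept` class, the `build_sparql` function, the `mmsk:` prefix and the `http://example.org/mmsk#` namespace are assumptions made for the example, not names taken from MMSK, and the sketch does not link the selected concepts to each other through their relations.

```python
from dataclasses import dataclass

# Hypothetical namespace; the paper does not give the vocabulary actually used.
NS = "http://example.org/mmsk#"


@dataclass(frozen=True)
class Concept:
    """A concept picked by a business actor from an expert ontology (assumed model)."""
    class_name: str      # e.g. "Bicycle"
    properties: tuple    # datatype properties the actor wants to consult


def build_sparql(concepts):
    """Merge the selected concepts into one SELECT query over the RDF base."""
    variables, patterns = [], []
    for concept in concepts:
        subject = f"?{concept.class_name.lower()}"
        patterns.append(f"{subject} a mmsk:{concept.class_name} .")
        for prop in concept.properties:
            var = f"?{concept.class_name.lower()}_{prop}"
            variables.append(var)
            patterns.append(f"{subject} mmsk:{prop} {var} .")
    body = "\n  ".join(patterns)
    return (
        f"PREFIX mmsk: <{NS}>\n"
        f"SELECT {' '.join(variables)}\n"
        f"WHERE {{\n  {body}\n}}"
    )


# A custom ontology assembled from two expert ontologies (illustrative values).
custom_ontology = [
    Concept("Bicycle", ("name", "color")),
    Concept("Handlebar", ("weight",)),
]
query = build_sparql(custom_ontology)
print(query)
```

The generated string could then be handed to any SPARQL engine that has loaded the RDF files; which engine MMSK uses is not stated in the paper.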

4. INDUSTRIAL DEMONSTRATOR

4.1. Knowledge creation

As mentioned in the introduction, the development of a product requires the involvement of multidisciplinary business experts (mechanical engineers, control engineers, designers, methods engineers and technicians). Because project teams are heterogeneous and distributed, knowledge is also distributed and heterogeneous: it is derived from information stored in databases or in result files produced by business applications. This heterogeneous and distributed nature of knowledge calls for an approach to formalizing, defining and modeling it so that it can be extracted and capitalized. In the MMSK architecture we propose a first service that facilitates the construction of domain ontologies by business experts. This first service communicates with the knowledge extraction (KE) services by providing structured ontologies. Under the ADD project, this service is called OCEAN (Ontology Creator Extractor & Annotator kNowledge) (Lahoud et al., 2010).

In OCEAN, experts "build" their ontologies by defining classes, attributes and relations between classes. OCEAN lets the experts view a description of each class created and of the relations between classes, and attach a photo so that business actors can get a concrete idea of the modeled element. Associations are defined by the experts; otherwise they are subsumptions by default. Section 5 presents a simple example of a bike ontology created by a business expert. The expert thus describes his domain of knowledge in terms of classes, subclasses, attributes and relationships, which are saved in OWL format.

Each business expert can build as many ontologies as he wants. The OWL file generated by OCEAN is then sent to the "SWS-KE" service. The next section presents the rules applied to the OWL file to transform it into an SQL query.

4.2. Transformation rules

To extract knowledge from the data stored in the databases of business applications, we apply a method for transforming ontologies into SQL queries that extends the one proposed in (Astrova et al., 2007). This method allows us to communicate with all business applications that store their data in SQL Server databases. To transform an ontology into SQL, it is necessary to go through a meta-level and build a metamodel. This step consists in defining the ontology models and their correspondences in order to obtain SQL models.

Figure 3. Model of transformation from ontology to SQL query

We thus consider the ontology model as the source model and SQL as the target model. The transformation between these two models is obtained by transformation rules. We created our own rules for transforming an ontology into an SQL query and tested them on ontologies defined in OCEAN. We use eleven transformation rules to go from the ontological model to the SQL model. These rules are described below:

R1 A class is a table.
R2 If an association has the cardinality * on both sides, we search for the primary keys of the 2 corresponding tables. We then make a second search to find the table that has these 2 keys as a compound primary key. Once found, we store the name of this table in the list of tables and the primary-foreign key pairs in the relationship table.
R3 If an association has a * on one side and a 0 or 1 on the other side, we look for the primary key of the table next to the *. This key is added as a foreign key in the table next to the 0 or 1.
R4 Store the names of the tables in a table.
R5 Search for the primary and foreign keys of the tables in the list (the relations between them) and store these relations in a table.
R6 Forming conditions: we look at the range of each DatatypeProperty.
  a. If the attribute is a positiveInteger, the condition is: check > 0
  b. If it is a DataRange that includes a list, the condition is: check attribute in [value1, value2 ...]
  c. If there is a restriction on the DatatypeProperty that is not a cardinality restriction, such as HasValue, the condition is: check attribute = value
R7 An inverse functional property is the "Distinct" constraint.
R8 A required property means checking that fields are not null (not 0 or not = "").
R9 A symmetric property is a recursive table.
R10 Store the conditions in a table.
R11 Build the query:
  a. Select all DatatypeProperty, each preceded by the first letter of its domain table (e.g. p.nom) and followed by a comma
  b. From all tables stored in the table (R4), each followed by its first letter, which names the table by a letter
  c. Where all the relationships stored in a table, to make the connections between the tables, plus all the conditions stored in the table, separated by "and"
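As a rough illustration of how rules R1, R4 to R6, R10 and R11 fit together, the sketch below builds a query string from a simplified in-memory description of an ontology. It is not the transformation implemented in the SWS: the `OntClass` structure, the `build_query` and `quote` helpers and the way relations are passed in are all assumptions made for the example, and rules R2, R3 and R7 to R9 are left out.

```python
from dataclasses import dataclass, field


@dataclass
class OntClass:
    """Simplified stand-in for an OWL class (R1: one class maps to one table)."""
    name: str
    properties: list                                 # DatatypeProperty names
    has_value: dict = field(default_factory=dict)    # R6c: property -> required value
    one_of: dict = field(default_factory=dict)       # R6b: property -> allowed values
    positive: list = field(default_factory=list)     # R6a: positiveInteger properties


def quote(value):
    """Render a Python value as an SQL literal (strings are single-quoted)."""
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def build_query(classes, relations):
    """relations holds (table, foreign key, referenced table, primary key) tuples (R5)."""
    # R11: every table is named by its first letter.
    alias = {c.name: c.name[0].lower() for c in classes}
    if len(set(alias.values())) != len(alias):
        raise ValueError("first-letter aliases collide; the rules do not cover this case")

    select = [f"{alias[c.name]}.{p}" for c in classes for p in c.properties]  # R11a
    tables = [f"{c.name} {alias[c.name]}" for c in classes]                   # R4, R11b
    where = [f"{alias[t]}.{fk} = {alias[r]}.{pk}" for t, fk, r, pk in relations]

    for c in classes:                                                         # R6, R10
        a = alias[c.name]
        where += [f"{a}.{p} > 0" for p in c.positive]
        where += [
            f"{a}.{p} IN ({', '.join(quote(v) for v in values)})"
            for p, values in c.one_of.items()
        ]
        where += [f"{a}.{p} = {quote(v)}" for p, v in c.has_value.items()]

    sql = f"SELECT {', '.join(select)} FROM {', '.join(tables)}"
    if where:
        sql += " WHERE " + " AND ".join(where)                                # R11c
    return sql


# Two classes loosely modeled on the bike example (illustrative values).
bicycle = OntClass("Bicycle", ["Name", "Color"], one_of={"Color": ["White", "Green"]})
frameset = OntClass("Frameset", ["Size"], has_value={"Size": "M"})
sql = build_query([bicycle, frameset], [("Bicycle", "IdFrameset", "Frameset", "IdFrameset")])
print(sql)
```

Note that naming tables by their first letter, as R11 prescribes, breaks down when two tables start with the same letter; the sketch simply raises an error in that case.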

We illustrate the transformation mechanism with the bike ontology example (section 5), a small example that shows how an ontology created by an expert is turned into an SQL query by the SWS, applying the rules listed above. The result is an RDF file containing annotated knowledge derived from the execution of the SQL query.

4.3. Annotation of information

Each piece of data extracted by the SQL queries is formatted and annotated. Data becomes information through the annotation, which places it in a context. We characterize this context by the name of the current project, the role of the actor who created the information and the business tool used. The KE-SWS service creates a candidate knowledge base consisting of several RDF files readable by each part of our KMS. The knowledge candidates (information annotated and extracted by the KE service) must be validated and evaluated by business actors. This is the main function of the KV service, which is the next service we will develop in the industrial project.
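The annotation step can be pictured with the following sketch, which attaches the three context elements named above (project, actor role, business tool) to one extracted row and serializes the result as N-Triples. The namespace, the predicate names `project`, `actorRole` and `sourceTool`, and the sample values are assumptions made for the example; the paper does not specify the RDF vocabulary used by KE-SWS.

```python
from dataclasses import dataclass

# Hypothetical namespace and predicate names, chosen only for this example.
NS = "http://example.org/mmsk#"


@dataclass(frozen=True)
class Context:
    project: str   # name of the current project
    role: str      # role of the actor who created the information
    tool: str      # business tool the data was extracted from


def annotate(record_id, row, context):
    """Turn one extracted row into (subject, predicate, object) triples plus its context."""
    subject = f"{NS}{record_id}"
    triples = [(subject, f"{NS}{column}", str(value)) for column, value in row.items()]
    triples += [
        (subject, f"{NS}project", context.project),
        (subject, f"{NS}actorRole", context.role),
        (subject, f"{NS}sourceTool", context.tool),
    ]
    return triples


def to_ntriples(triples):
    """Serialize triples in N-Triples syntax, treating every object as a plain literal."""
    def literal(text):
        escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        return '"' + escaped + '"'
    return "\n".join(f"<{s}> <{p}> {literal(o)} ." for s, p, o in triples)


# Illustrative row and context; none of these values come from the ADD project.
row = {"Name": "SpeedMax CF 9.0", "Size": "M"}
ctx = Context(project="BikeProject", role="designer", tool="PLM")
candidate = annotate("bicycle1", row, ctx)
print(to_ntriples(candidate))
```

Each call produces one candidate-knowledge record; in the architecture described here such records would then be submitted to the business actors for validation.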

5. EXAMPLE: THE BIKE ONTOLOGY

The cycling ontology describes the cycling world. It defines a vocabulary and semantics to structure, organize and detail all the characteristics of a bike, all the roles of the professional actors in a project to develop a new bike, and all the processes used to develop and industrialize a bicycle.

Figure 4. An example of the cycling ontology (diagram content: Bicycle: Name ?, Name = 'SpeedMax', Color = ['white', 'green']; Frameset: Seat tube length ?, Head tube length ?, Size = 'M'; Handlebar: Weight ?, Type = 'road')

This ontology is transformed into an SQL query that returns information from the business application. For example, if a business expert wants to know how to design a SpeedMax CF bike in size M, he specifies these details in the bike ontology (figure 4). The bike ontology is then transformed into an SQL query such as the example below (figure 5).

SELECT Handlebar.TypeHandlebar, Frameset.Size, Frameset.SeatTubeLength, Frameset.HeadTubeLength, Bicycle.Color, Bicycle.Name, …
FROM Handlebar, Frameset, Bicycle, …
WHERE Bicycle.IdHandlebar = Handlebar.IdHandlebar
  AND Bicycle.IdFrameset = Frameset.IdFrameset
  AND …
  AND Bicycle.Name LIKE 'SpeedMax%'
  AND Bicycle.Color IN ('White', 'Green')
  AND Frameset.Size = 'M'
  AND Handlebar.TypeHandlebar = 'road'

Figure 5. Extract of the generated SQL query

The information returned by this query becomes knowledge and is stored in RDF files, as shown in figure 6.

…. SpeedMax CF 9.0 The Speedmax CF 9.0 is a top of the line aero bike. Manufactured by Canyon, the Speedmax 9.0 was conceived to give triathletes the supreme cycling experience when running against the clock. M 550 125 Alu X 927g …

Figure 6. Example of knowledge annotated in RDF file
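To see how a query of the form shown in figure 5 selects rows, it can be tried against a toy database. The sketch below uses Python's built-in `sqlite3` module rather than SQL Server, and the schema and rows are invented for the example; only the table and column names that appear in figure 5 are reused. The handlebar condition is written as an equality, matching Type = 'road' in figure 4.

```python
import sqlite3

# In-memory toy database; schema and data are assumptions made for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Handlebar (IdHandlebar INTEGER PRIMARY KEY, TypeHandlebar TEXT);
CREATE TABLE Frameset  (IdFrameset INTEGER PRIMARY KEY, Size TEXT, SeatTubeLength INTEGER);
CREATE TABLE Bicycle   (IdBicycle INTEGER PRIMARY KEY, Name TEXT, Color TEXT,
                        IdHandlebar INTEGER REFERENCES Handlebar(IdHandlebar),
                        IdFrameset  INTEGER REFERENCES Frameset(IdFrameset));
INSERT INTO Handlebar VALUES (1, 'road'), (2, 'mtb');
INSERT INTO Frameset  VALUES (1, 'M', 550), (2, 'L', 580);
INSERT INTO Bicycle   VALUES (1, 'SpeedMax CF 9.0', 'White', 1, 1),
                             (2, 'SpeedMax CF 8.0', 'Black', 1, 1),
                             (3, 'TrailKing',       'Green', 2, 2);
""")

# Same shape as the query of figure 5: joins first, then the ontology conditions.
rows = conn.execute("""
SELECT Bicycle.Name, Bicycle.Color, Frameset.Size, Frameset.SeatTubeLength,
       Handlebar.TypeHandlebar
FROM Handlebar, Frameset, Bicycle
WHERE Bicycle.IdHandlebar = Handlebar.IdHandlebar
  AND Bicycle.IdFrameset = Frameset.IdFrameset
  AND Bicycle.Name LIKE 'SpeedMax%'
  AND Bicycle.Color IN ('White', 'Green')
  AND Frameset.Size = 'M'
  AND Handlebar.TypeHandlebar = 'road'
""").fetchall()
print(rows)
```

Only the first bicycle satisfies every condition: the second is excluded by its color and the third by its name, size and handlebar type.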

6. CONCLUSION & PERSPECTIVES

Under the ADD project we developed and tested the first two phases of our MMSK (Knowledge Definition and Knowledge Extraction-SWS). The first phase is aimed at business experts and is used to formalize a domain of knowledge in OWL. In the second phase we use the formal ontologies to extract information from business applications and annotate it to build a knowledge base in RDF format.

These early demonstrations showed that business experts need more ways to describe the relationships between the concepts of their domains of knowledge. Today we are limited to a certain number of relationships, which do not capture all the complexity of the specification of business knowledge. The second observation concerns the service that extracts ontologies and transforms them into an SQL model. We compared our results with, and drew inspiration from, the work of (Astrova et al., 2007). However, we feel that it is necessary to continue our research on the transformation of ontologies into SQL models in order to transcribe the entire semantics described in the ontologies. A further improvement to the KE service will be the extraction of knowledge from XML files. This point seems less problematic, because the types of data described by the XML tags allow us to associate the tags with ontology concepts.

The rest of the ADD project will focus on developing the two other phases of MMSK (Validation and Mining), which will cover the whole knowledge management process: formalization, identification, extraction, storage, update, exploitation, diffusion and reuse. We estimate that developing the whole MMSK architecture will take two years.

REFERENCES

Agarwal, S., Handschuh, S., Staab, S. 2004. Annotation, composition and invocation of semantic web services. In the International Journal of Web Semantics (2004) 31-48.
Alavi, M. and Leidner, D.E. 2001. Review: Knowledge management and knowledge management systems: Conceptual foundations and research issues. MIS Quarterly, 25(1), 107–136.
Astrova, I., Korda, N., and Kalja, A. 2007. Storing OWL Ontologies in SQL Relational Databases. World Academy of Science, Engineering and Technology 29.
Berners-Lee, T., Hendler, J., and Lassila, O. 2001. The Semantic Web. Scientific American, May 2001.
Booth, D., Haas, H., McCabe, F., Newcomer, E., Champion, M., Ferris, C., et al. 2004. Web services architecture. W3C working group note. http://www.w3.org/TR/ws-arch
Che Cob, Z., Abdullah, R. 2008. Ontology-based Semantic Web Services Framework for Knowledge Management System. International Symposium on Information Technology (ITSim 2008), Malaysia.
Chella, A., Cossentino, M., Sabatucci, L. and Seidita, V. 2004. From PASSI to Agile PASSI: Tailoring a Design Process to Meet New Needs. In 2004 IEEE/WIC/ACM International Joint Conference on Intelligent Agent Technology (IAT-04), Sept. 2004, Beijing (China).
Gandon, F. 2002. Distributed Artificial Intelligence and Knowledge management: ontologies and multiagent systems for a corporate semantic web. PhD Thesis, University of Nice - Sophia Antipolis, 2002.
Gomes, S., Serrafero, P., Monticolo, D., Eynard, B. 2005. Extracting engineering knowledge from PLM systems: an experimental approach. International Conference on Product Lifecycle Management, Lyon, France, September 2005, 10 p.
Gruber, T. 1993. Towards Principles for the Design of Ontologies Used for Knowledge Sharing. In N. Guarino and R. Poli (Eds.), Formal Ontology in Conceptual Analysis and Knowledge Representation, Deventer, The Netherlands. Kluwer Academic Publishers.
Grundstein, M., Barthès, J.-P. 1996. An industrial view of the process of capitalizing knowledge. Proc. ISMICK’96, Rotterdam (1996), 258-264.
Huber, G.P., Davenport, T.H., and King, D. 1998. Some perspectives on organizational memory (Working Paper for the Task Force on Organizational Memory). In Proceedings of the 31st Annual Hawaii International Conference on System Sciences.
Jennex, M., Smolnik, S., Croasdell, D. 2007. Towards Defining Knowledge Management Success. 40th Annual Hawaii International Conference on System Sciences (HICSS 2007), pp. 193c, Jan. 2007.
Lahoud, I., Monticolo, D., Gomes, S. 2010. OCEAN: A Semantic Web Service to Extract Knowledge in E-Groupwares. Sixth International Conference on Signal-Image Technology and Internet-Based Systems (SITIS), pp. 354-362, 15-18 Dec. 2010.
Maier, R. 2002. Knowledge management systems: Information and communication technologies for knowledge management. Berlin: Springer-Verlag.
McIlraith, S., Narayanan, S. 2002. Simulation, verification and automated composition of web services. Proceedings of the 11th international conference on World Wide Web, New York, USA, 12 p., 2002.
Monticolo, D., Demoly, F., Gomes, S. 2011. Collaborative Knowledge Evaluation with a Semantic Wiki: WikiDesign. In the International Journal of eCollaboration, vol. 6, June 2011.
Monticolo, D., Hilaire, V., Koukam, A. and Gomes, S. 2007. A Multi Agents model to support the Knowledge Management Process inside Professional Activities. Conference talk in the Second IEEE International Conference on Digital Information Management (ICDIM 2007), Workshop on Agent supported Cooperative Work (ACW 2007), Lyon, France.
Nemirovskij, G., Wolters, M. and Heuel, E. 2008. Distributed Study: a Semantic Web Services Approach for Modelling a Common Educational Space. In J. Luca and E. Weippl (Eds.), Proceedings of World Conference on Educational Multimedia, Hypermedia and Telecommunications 2008, pp. 2007-2014. Chesapeake, VA: AACE.
Paolucci, M., Sycara, K. 2003. Autonomous Semantic Web services. IEEE Internet Computing, vol. 7, no. 5, pp. 34-41, Sept.-Oct.
Qiu, X., Yue, J. 2010. Ontology based distributed agricultural knowledge management. Seventh International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), vol. 6, pp. 2858-2861, 10-12 Aug. 2010.
Shafiq, O., Suguri, H., Ali, A., and Fensel, D. 2006. A first step towards enabling Interoperability between software agents and semantic web services: Multi agent systems adapting web services standards. IBIS – Interoperability in Business Information Systems, 2(2), 97–117.
Stevens, R., Baker, P., Bechhofer, S., Ng, G., Jacoby, A., Paton, N., Goble, C., Brass, A. 2000. TAMBIS: Transparent Access to Multiple Bioinformatics Information Sources. Bioinformatics 16 (2), pp. 184-186.
Yang, H., Qingping, Y., Xizhi, S., Peng, W. 2008. Applying semantic web services to enterprise web. The 6th International Conference on Manufacturing Research (ICMR08), Brunel University, UK.