
Helsinki University of Technology Department of Computer Science and Engineering

IMPLEMENTING A ROSETTANET BUSINESS-TO-BUSINESS INTEGRATION PLATFORM USING J2EE AND WEB SERVICES

Juho Tikkala

HELSINKI UNIVERSITY OF TECHNOLOGY Department of Computer Science and Engineering

ABSTRACT OF MASTER’S THESIS

Author: Juho Tikkala
Date: August 11 2004
Pages: 111
Title of Thesis: Implementing a RosettaNet Business-to-Business Integration Platform Using J2EE and Web Services
Professorship: Information Technology
Professorship Code: T-86
Supervisor: Professor Timo Soininen
Instructor: M. Sc. Paavo Kotinurmi

The need for connecting information systems of collaborating organizations has become increasingly common. Significant advantages, such as increased speed, efficiency, and reliability, can be obtained by automating inter-organizational business processes. To achieve this, business-to-business integration, i.e. facilitating interoperation of disparate information systems of different organizations, must be performed. E-business frameworks are generic solutions for performing such integration. RosettaNet is an industry consortium that maintains an e-business framework that specifies inter-organizational business processes for multiple industries. Process specifications include the messages that are exchanged between organizations and the related messaging choreography. RosettaNet Implementation Framework (RNIF) is the messaging portion of the RosettaNet framework, specifying how messages are exchanged.

The purpose of this work is to gather experience on implementing RNIF with currently available tools. The research questions are how RNIF can be implemented in practice, how suitable current tools are and what could be improved in them, how much effort is required, what level of performance can be obtained, and whether there are interoperability problems between different RNIF implementations. To answer these questions, a prototype system is developed using J2EE (Java 2 platform Enterprise Edition) and web service technologies. The prototype is a middleware system that provides RNIF functionality, on top of which RNIF-enabled applications can be constructed with less effort than creating an equivalent RNIF implementation from scratch.

The prototype has two high-level goals. The first is to create a learning platform that students can use to practice RosettaNet-based integration: the platform provides RNIF-compliant messaging functionality, and students are free to concentrate on higher-level issues, such as the semantics of the messages that are exchanged. The second goal is to create a realistic RNIF software implementation, providing services that are assumed useful in a typical business-to-business integration scenario, to gather experience on implementing such systems.

The prototype is tested in a laboratory environment. Testing is done for RNIF-compliance, interoperability with a commercial RNIF implementation, and performance. In addition, a student group applies the prototype to implement an integration system for a simple RosettaNet-based integration case, to test the prototype’s practical usability and ease of use.

The results show one possible way to implement RNIF in practice. The tools were suitable for the task, but some issues exist: some tools did not function perfectly and some improvement suggestions could be made. A processing rate on the order of one RNIF message per second is relatively easy to attain; current tools do, however, limit how large a message can be transferred. Interoperability problems were detected: a commercial RNIF implementation produced invalid RNIF messages, and some details in the related specifications leave room for interpretation.

Keywords: RosettaNet, RNIF, B2B integration, implementation, J2EE, XML


HELSINKI UNIVERSITY OF TECHNOLOGY (TEKNILLINEN KORKEAKOULU) Department of Computer Science and Engineering

DIPLOMITYÖN TIIVISTELMÄ (ABSTRACT OF MASTER’S THESIS, FINNISH VERSION)

Author: Juho Tikkala
Date: 11.8.2004
Pages: 111
Title of Thesis: RosettaNet-pohjaisen yritystenvälisen integroinnin alustan toteuttaminen käyttäen J2EE- ja web service -tekniikoita (Implementing a RosettaNet-based business-to-business integration platform using J2EE and web service technologies)
Professorship: Information Technology
Code: T-86
Supervisor: Professor Timo Soininen
Instructor: M. Sc. Paavo Kotinurmi

The need to connect the information systems of collaborating organizations arises increasingly often. Automating inter-organizational business processes can yield significant benefits, such as greater speed, efficiency, and reliability. Enabling this requires business-to-business integration, that is, making the disparate information systems of organizations work together. E-business frameworks are generic solutions for carrying out such integration. RosettaNet is an industry consortium that maintains an e-business framework specifying inter-organizational business processes for several industries. The process specifications cover the messages to be exchanged and the related choreography. RosettaNet Implementation Framework (RNIF) is the messaging part of the framework and specifies how messages are exchanged.

The purpose of this work is to produce experience on implementing RNIF with current tools. The research questions are how RNIF can be implemented in practice, how suitable current tools are and what could be improved in them, how much effort is needed, what level of performance can be achieved, and whether there are interoperability problems between different RNIF implementations. To answer these questions, a prototype system is built using J2EE (Java 2 platform Enterprise Edition) and web service technologies. The prototype is a middleware software component that provides RNIF functionality. RNIF-compliant applications can be built on top of it with less effort than creating an equivalent RNIF implementation from scratch.

The prototype has two main goals. The first is to create a teaching system that students can use to practice RosettaNet-based integration: the system provides RNIF-compliant messaging functionality, and students can concentrate on higher-level matters, such as the meaning of the messages exchanged. The second goal is to create a realistic RNIF software implementation that provides services believed to be useful in typical business-to-business integration, in order to gain experience on implementing such systems.

The prototype is tested in a laboratory environment. Testing covers conformance to the RNIF specification, interoperability with a commercial RNIF implementation, and performance. In addition, a student group applies the prototype to implement a system enabling a simple RosettaNet-based integration, which tests the prototype’s usability and ease of use.

As the result, one possible way to implement RNIF in practice is presented. The tools used were suitable for the task, but some problems were found: some of the tools did not work perfectly, and some improvement suggestions could be made. A performance level of about one RNIF message processed per second is relatively easy to reach; current tools impose limits on the size of the message to be transferred. Interoperability problems were found: a commercial RNIF implementation produced faulty RNIF messages, and some details in the specifications are open to interpretation.

Keywords: RosettaNet, RNIF, business-to-business integration, implementation, J2EE, XML


Preface

This thesis was written at the Helsinki University of Technology Software Business and Engineering Institute (SoberIT) between March and July 2004; the work described was performed between December 2003 and May 2004. The work was conducted in partnership with International Business Machines (IBM).

I would like to thank my instructor, Paavo Kotinurmi, for his valuable suggestions and extensive comments, which were very helpful in all portions of this work. The supervisor of the thesis, Professor Timo Soininen, provided valuable feedback and assistance on writing the thesis. In addition, I would like to express my gratitude to Hannu Laesvuori for his comments on the thesis and for helping in interoperability testing, and to Patrik Ajalin, Unai Briongos, Ari Kesäniemi, and Outi Tarvainen for testing the prototype system.

Espoo, August 11 2004

Juho Tikkala


Table of Contents

Abstract of Master’s Thesis
Diplomityön tiivistelmä (Finnish abstract)
Preface
Table of Contents

1 Introduction
  1.1 Background
  1.2 Research Problem and Objectives
  1.3 Research Method
  1.4 Scope
  1.5 Structure of This Report

2 Current Knowledge
  2.1 Business-to-Business Integration
  2.2 XML Technologies
    2.2.1 XML
    2.2.2 XML Schema Languages
    2.2.3 XPath
    2.2.4 XSLT
  2.3 Business-to-Business Integration Frameworks
    2.3.1 EDI
    2.3.2 XML-Based Business-to-Business Integration Frameworks
    2.3.3 xCBL
    2.3.4 RosettaNet
    2.3.5 ebXML
    2.3.6 Comparison
  2.4 Messaging in an XML-Based Business-to-Business Framework
    2.4.1 Web Services
    2.4.2 RosettaNet Implementation Framework
    2.4.3 ebMS
    2.4.4 Comparison
  2.5 About J2EE
    2.5.1 The Java Programming Language
    2.5.2 J2EE Overview
    2.5.3 J2EE Core APIs
    2.5.4 Application Components
    2.5.5 Implementations
    2.5.6 Developing with J2EE
    2.5.7 Software Design in J2EE
    2.5.8 Web Services in J2EE
    2.5.9 Alternatives
  2.6 Implementing Business-to-Business Integration
    2.6.1 Architecture of Previously Reported Systems
    2.6.2 Implementation Methods

3 Requirements for the Prototype System
  3.1 Problem Definition
  3.2 Solution Overview
  3.3 Requirements
    3.3.1 Functional Requirements
    3.3.2 Use Cases
    3.3.3 Technical Requirements

4 Prototype Design and Implementation
  4.1 External Interfaces
    4.1.1 RNIF Interface
    4.1.2 Web Service Interface
  4.2 Internal Software Architecture
    4.2.1 Static Structure
    4.2.2 Implementation of Use Cases
  4.3 J2EE Implementation
    4.3.1 J2EE Modules
    4.3.2 Router Implementation
    4.3.3 WS Implementation
    4.3.4 RNIF Implementation
  4.4 Client Applications
    4.4.1 Structure
    4.4.2 Activity Management
    4.4.3 Example Source Code Fragment
  4.5 Tools Used in Implementation
  4.6 Implementation Effort
    4.6.1 Development Time
    4.6.2 Prototype Size

5 Testing the Prototype
  5.1 RNIF-Compliance
    5.1.1 Introduction
    5.1.2 Setup
    5.1.3 Results
    5.1.4 Analysis
  5.2 Practical Usability
    5.2.1 Introduction
    5.2.2 Setup
    5.2.3 Results
    5.2.4 Analysis
  5.3 Interoperability
    5.3.1 Introduction
    5.3.2 Setup
    5.3.3 Results
    5.3.4 Analysis
  5.4 Performance: Maximum Message Size
    5.4.1 Introduction
    5.4.2 Setup
    5.4.3 Results
    5.4.4 Analysis
  5.5 Performance: Message Throughput
    5.5.1 Introduction
    5.5.2 Setup
    5.5.3 Results
    5.5.4 Analysis

6 Discussion
  6.1 Prototype System
    6.1.1 Prototype Quality
    6.1.2 Prototype Design
    6.1.3 Development Approach
    6.1.4 Improvement Possibilities
  6.2 Experience from Prototype Implementation
    6.2.1 RosettaNet Specifications
    6.2.2 Tools
  6.3 Relation to Previous Work
    6.3.1 Architecture and Implementation Methodology
    6.3.2 Comparison to Research Implementations
    6.3.3 Comparison to a Commercial Product
  6.4 Reliability of the Results
  6.5 Further Research

7 Conclusions

References
Appendix A

1 Introduction

1.1 Background

E-business [1], connecting information systems of partnering organizations using Internet-based technologies, has become an essential part of today’s business environment. In fact, having an e-business strategy can almost be seen as a requirement for successful competition (Rodgers et al. 2002). Significant advantages can be obtained by automating inter-organizational business processes, because of increased speed, efficiency, and reliability. Typical applications include automating buying/selling processes, and updating and sharing business and product data (Shim et al. 2000, Rodgers et al. 2002).

Business-to-business interactions mean automated interaction of information systems between organizations, a requirement for automated inter-organizational processes. To achieve this despite the heterogeneous information systems involved, business-to-business integration must be performed. Integration, in this context, means facilitating interoperation of disparate information systems (Medjahed et al. 2003).

Business-to-business integration frameworks are generic solutions that enable performing such integration. Extensible Markup Language (XML) based business-to-business integration frameworks employ XML technologies and the Internet to provide this functionality. These are also often referred to as e-business frameworks (Shim et al. 2000).

RosettaNet is an industry consortium that aims to create e-business process standards [2] for industries such as information technology, electronic components, semiconductor manufacturing, telecommunications, and logistics. These standards form the RosettaNet e-business framework. RosettaNet Implementation Framework (RNIF) is the messaging portion of the RosettaNet e-business framework (RosettaNet 2004).

[1] To be exact, e-commerce refers to buying/selling over the Internet, while e-business refers to a more general activity of connecting an extended organization using the Internet, which encompasses e-commerce (Rodgers et al. 2002). However, it seems that many authors do not make this distinction. In this work, the term e-business is used exclusively.

[2] In this work, the term “standard” is used to refer to any specification issued by an industry consortium or other major organization, not just official standard documents by national or international standardization organizations.

Commercial RNIF implementations exist from vendors such as BEA, Fujitsu, IBM, Microsoft, and webMethods. There is also literature discussing e-business frameworks and their relative merits, and some reports on implementing e-business frameworks or their portions. However, there seem to be few scientifically documented experience reports that describe, from a technical point of view, implementing e-business frameworks with current tools, as pointed out by e.g. Nurmilaakso (2003). Experience on implementation is important for evaluating the significance of any approach. The purpose of this work is to produce such experience on implementing RNIF by constructing a prototype system.

The prototype has two high-level goals. The first, and more concrete, is to create a learning platform for students of Information and Communication Technology (ICT) enabled commerce at Helsinki University of Technology Software Business and Engineering Institute. Using this platform, students are able to practice realistic business-to-business integration scenarios with RosettaNet. The platform makes basic RNIF-related tasks, such as sending and receiving business messages, relatively easy, so students can concentrate on the higher-level issues of their assignments, which are often related to e.g. the semantics of the messages being exchanged or the transformations that are required.

Another goal is to create an RNIF implementation as a generic software component, providing services that are assumed useful in a typical business-to-business integration scenario. This is done in order to gather and report experience on implementing such systems: to assess the maturity and practical suitability of the chosen implementation technologies, and to provide documented experience on issues associated with designing RNIF-capable systems, deciding which tools to use, and estimating the required effort.

The work described in this report was conducted at Helsinki University of Technology Software Business and Engineering Institute (SoberIT) between December 2003 and May 2004. The problem definition is based on previous work conducted at SoberIT in the context of the NetData project (2002–2004). In the NetData project, a RosettaNet-based integration of Product Data Management (PDM) systems was performed using a commercial RNIF implementation (Kotinurmi et al. 2004). The work described in this report applies ideas and experience from the NetData project, but with a slightly different focus: instead of researching issues associated with a single integration domain, a general-purpose RNIF messaging platform is constructed, with emphasis on issues such as ease of use, portability, and scalability. This work was conducted in partnership with International Business Machines (IBM).
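To make the idea of a generic RNIF messaging component more concrete, the following Java sketch shows the kind of narrow interface a client application might program against. It is an illustration only: the names RnifMessagingService, MessageListener, and LoopbackMessagingService, as well as the partnerId and processId parameters, are hypothetical and are not taken from the prototype. The loopback class performs no RNIF packaging or network transport.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical facade: what a client application might see of an RNIF middleware layer. */
interface RnifMessagingService {
    /** Hands a business document (XML text) to the middleware; returns a message identifier. */
    String send(String partnerId, String processId, String xmlPayload);

    /** Registers a callback that is invoked for each business document received. */
    void onReceive(MessageListener listener);
}

/** Hypothetical callback type for incoming business documents. */
interface MessageListener {
    void received(String partnerId, String processId, String xmlPayload);
}

/** In-memory stand-in that only demonstrates the calling pattern. */
class LoopbackMessagingService implements RnifMessagingService {
    private final List<MessageListener> listeners = new ArrayList<>();
    private int counter = 0;

    @Override
    public String send(String partnerId, String processId, String xmlPayload) {
        String messageId = "msg-" + (++counter);
        // A real implementation would add RNIF headers and transmit the message;
        // this sketch simply loops the payload back to the local listeners.
        for (MessageListener listener : listeners) {
            listener.received(partnerId, processId, xmlPayload);
        }
        return messageId;
    }

    @Override
    public void onReceive(MessageListener listener) {
        listeners.add(listener);
    }
}
```

With such an interface, a student assignment could register a listener and call send without dealing with RNIF headers or transport details, which is the division of labour that the learning-platform goal describes.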


1.2 Research Problem and Objectives

The research problem can be broadly specified as implementing a software system that provides messaging functionality as defined in the RosettaNet Implementation Framework (RNIF) specification, in an operational environment resembling a typical business-to-business integration scenario. The main problem is how to create such a system with available tools. More specifically, the research problem can be formulated with the following questions:

1. How can RNIF be implemented in enterprise software systems?
2. What are the main issues in implementing RNIF with software development tools commonly available in early 2004? How suitable are the tools, and what could be improved in them? How much effort is required?
3. What level of performance can be obtained, in terms of message throughput and maximum message size?
4. Are there any interoperability problems with RNIF implementations?

Software interoperability can be defined as two or more software systems being able to communicate and work together to attain a common goal, despite being implemented on varying software and hardware platforms (Wileden et al. 1992). In this case, following the RNIF specification provides a certain degree of interoperability. Interoperability problems are issues that prevent the desired interoperability from happening. Even disregarding the possibility of plainly erroneous implementation, e-business frameworks are complex, and not every operational situation that can occur is necessarily specified unambiguously in the related specifications. Therefore, interoperability problems between two implementations might be possible even though both implementations follow the same specifications.

The objective of the research is to provide answers to these questions by creating a prototype system, a RosettaNet business-to-business integration platform. The prototype is a middleware [3] system that provides RNIF functionality, on top of which RNIF-enabled applications can be constructed.

In addition to the two previously stated high-level functional goals, serving as a learning platform and as a usable business-to-business integration software component, the following goals are set for the prototype system, in order of decreasing priority:

1. Compliance with the RNIF 2.0 specification. Without this, the prototype is useless.
2. Ease of use for developers of applications employing the prototype. Developing applications on top of the prototype should be significantly easier than creating an RNIF implementation from scratch: this is the value the middleware provides to justify its existence.
3. Portability: the prototype should be usable in as wide a variety of hardware/software environments as the selected implementation tools allow, with minimal configuration required.
4. Scalability: the attainable level of performance should be maximized to the degree possible within the limits of the available implementation resources.

[3] The term “middleware” is not in general very well defined. In this work, it is used to mean an entity that serves as a layer between a user application and a data resource, resembling the mediator concept as described by Wiederhold (1992).

1.3 Research Method

The research is largely constructive in nature, centered on building a prototype system. Briefly, the constructive research method can be defined to have the following phases: finding a suitable problem of both practical and research interest, obtaining a general understanding of the problem, innovatively constructing a solution, demonstrating that it works, showing the theoretical connections and research contribution of the solution, and examining the applicability of the solution (Kasanen et al. 1993).

The research problem is defined in subsection 1.2, along with some motivation. A general understanding of the problem is obtained by first performing a brief literature study, described in section 2, and then analyzing the problem further and deriving a set of functional and technical requirements for the prototype implementation in section 3, based on what can be implemented using available tools and resources. The research problem definition is largely based on earlier research conducted in the NetData project, derived from discussions with NetData project members and inspection of project reports (Kotinurmi et al. 2004, Laesvuori 2003).

The intuitive phase of this work is constructing the prototype, based on the defined requirements, by applying typical software design practices. The main implementation technologies are Java 2 Platform Enterprise Edition (J2EE), a generic component-based enterprise application architecture, and web services, a group of technologies for XML-based messaging over the Internet. These are selected because experience with these popular implementation technologies is desired. The NetData project has previously implemented RNIF using Microsoft tools (Laesvuori 2003). The software architecture of the prototype is presented in section 4 and experiences from the implementation phase in section 6.

In addition to the written description, the system and software architecture is documented using UML diagrams where applicable. See 2.5.7 for a brief introduction to UML, Appendix A for a very brief summary of the UML notation needed to understand the diagrams used in this report, and the book by Booch et al. (1999) for a full introduction to UML.

That the solution works is demonstrated by applying the following tests to the prototype:

• RNIF compliance of the prototype is tested by employing the RosettaNet Ready™ SelfTest Kit, available from the RosettaNet consortium. This testing verifies that the prototype is able to interoperate with other RNIF-compliant software to some degree.

• Practical usability of the prototype is tested by a student group in the spring 2004 T-86.301 (Project Course on ICT Enabled Commerce) course at Helsinki University of Technology. The group employs the prototype in an integration scenario in which they transform SAP Intermediate Document (IDOC) XML files (SAP 2004), the proprietary interchange format of the SAP enterprise resource planning system, into RosettaNet messages and send them to another, fictitious, RosettaNet-capable party using the prototype. The purpose of this testing is to show that the prototype is usable in solving an integration problem in practice.

• Interoperability with standard RNIF 2.0 capable software is tested by performing interoperability testing against a commercial product, Microsoft BizTalk Accelerator for RosettaNet version 2.0, selected because the software was available. This testing both further validates RNIF compliance and attempts to point out possible interoperability issues between the two RNIF implementations.

• The performance level gained, that is, the level of scalability, is analyzed by stress testing the prototype: high processing loads are generated and performance is measured. This testing gives some indication of what kind of performance level can be obtained in practice.

Test arrangements and the results obtained are elaborated in section 5. Theoretical connections are established in section 2 of this report, the brief literature study. Applicability of the solution is discussed in section 6, and finally the research contribution in the Conclusions section.


1.4 Scope

The literature study is focused mainly on explaining related technologies and their background to the degree required for understanding this work. In addition, available experiences of developing similar systems are studied, to review implementation approaches that have been tried previously and similar systems that have been built. The viewpoint of the literature study is strictly technical; the business context of integration is not addressed. Commercial products providing similar functionality are mentioned, but only one product is described in some detail.

Only messaging, that is, composing and transferring messages, related to business-to-business integration of information systems is addressed in this work. More specifically, business process issues and the structure or semantics of the business information that is transferred are not addressed. This is because the objective is to produce an RNIF implementation, and RNIF is focused only on the messaging aspect of business-to-business integration.

The prototype is constructed and tested in a laboratory environment, and its operation is verified only with the tests described in the preceding subsection. The available implementation effort is fixed at roughly three person-months.

1.5 Structure of This Report

The structure of this report is as follows.

1. Introduction. This section describes the research problem, method, and scope.

2. Current Knowledge. A brief survey of related research work is presented. Business-to-business integration, XML technologies, XML-based business-to-business integration frameworks, and messaging portions of these frameworks are introduced. A brief introduction is given to the Java programming language and J2EE. Finally, previously reported experiences on implementing XML-based business-to-business integration frameworks are summarized.

3. Requirements for the Prototype System. Requirements for the prototype system are defined here, based on the problem the prototype is aimed to solve.

4. Prototype Design and Implementation. This section describes the prototype implementation and related design decisions. First, external interfaces of the prototype system are presented. Then, an overview of the internal software architecture of the prototype is given, including a description of how it addresses the use cases that have been specified. After that, the J2EE implementation of the prototype and the implementation of client applications that employ the prototype are presented in more detail. Finally, tools that were used in the implementation are listed and a summary of the effort spent on implementation is given.

5. Testing the Prototype. Describes how the prototype was validated. Test arrangements, results obtained, and conclusions that can be drawn based on the results are presented.

6. Discussion. This section contains an evaluation of the prototype system that was developed and of the development methodologies. Then, experiences of related specifications, tools, and technologies are reported. After that, the relation of this work to previous research is evaluated. Finally, the reliability of the obtained results is analyzed and some further research possibilities are identified.

7. Conclusions. A brief evaluation of the results obtained and their significance is given in this section.


2 Current Knowledge

This section introduces the concepts and technologies this research is based on, and gives an overview of what is currently known on the research topic. First, business-to-business integration in general and some related concepts that are used throughout this work are introduced. Then, a quick overview of Extensible Markup Language (XML) and some related technologies is given. After this, business-to-business integration frameworks, especially the XML-based ones, are introduced, with emphasis on the messaging aspects of these frameworks. At this point, a brief introduction is given to the Java programming language and Java 2 Platform Enterprise Edition (J2EE), as they are an important implementation technology for this work and are utilized by some of the implementations that are introduced in this section. Finally, previously reported experiences on implementing XML-based business-to-business integration frameworks are summarized, including RNIF implementations that have been previously constructed and technologies that were used in creating them.

2.1 Business-to-Business Integration

Business-to-business interactions mean automated interaction of information systems between organizations. To achieve this, relevant information systems must be made to communicate with each other. This is not trivial, as very heterogeneous hardware and software components are often used in these systems. Therefore, business-to-business integration must be performed to bridge this gap. Here, integration refers to interoperation of disparate information systems. (Medjahed et al. 2003)

Bussler (2002) introduces some important concepts related to business-to-business integration that are also used in this work. Adapted from his listing, some core concepts in business-to-business integration are:

• Trading partner is a participant in business-to-business interaction, which might not necessarily involve trading in the financial sense.

• Business message is a message communicated across trading partners.

• Business document is the “payload” portion of a business message: the data that the parties actually want to communicate, without any “envelope” information such as headers etc. that might be present in a business message, and without any encryption or other transfer encoding. This data is often in XML format.

• Transformation is the act of transforming data from one business message format to another, or from one business document format to another.

• Business-to-business protocol defines the protocol, that is, the types of exchanged business messages and the messaging choreography, performed between trading partners. Issues such as timeouts, retries, exception handling, etc. are also addressed.

• Application protocol is analogous to business-to-business protocol, except that it is performed within a single trading partner with back-end applications and might not be so well specified.

• Private process is the internal business process of a single trading partner. The private process defines how an event received from the business-to-business protocol is processed, typically by applying the application protocol.

2.2 XML Technologies

In this subsection, core concepts of the Extensible Markup Language (XML) and some related technologies are briefly introduced: schema, expression, and transformation languages that are used with XML.

2.2.1 XML

XML, defined by the World Wide Web Consortium (W3C), is a language for expressing hierarchically structured documents. The original XML specification is a subset of Standard Generalized Markup Language (SGML). XML is commonly used in two ways. The first is to describe the structure and formatting of a document meant to be readable by humans. The second is to serve as an interchange format for exchanging data between applications: data that is human-readable, but not necessarily meant for human consumption. (Zisman 2000)

See Figure 1 for an example of core XML concepts. An XML document consists of elements. Elements can contain textual data or other elements, forming a tree-like hierarchy. Elements contained within an element are called its children, whereas the containing element is a parent. Elements start and end with tags that also express the name of the element. Elements can also have attributes, which are name-value pairs. A document begins with a root element, which contains all other elements and data in the document. (W3C 2004)

Element and attribute names can be separated into namespaces by prefixing them with a namespace prefix separated by a colon, e.g. “soap:Envelope”. These prefixes are mapped to namespace URIs (Uniform Resource Identifiers), such as "http://schemas.xmlsoap.org/soap/envelope/", using a specialized form of attribute syntax. URIs include the more familiar URLs (Uniform Resource Locators), which are used to locate resources on the Internet, but a URI can also be just a name that does not specify an access mechanism (IETF 1998b). The namespace prefixes themselves are not significant, except in some special cases. Namespaces are employed so that elements defined by various organizations can be used in the same XML document without having to worry about clashing element names. (W3C 1999a)

[Figure 1 shows an annotated XML fragment whose labels point out the root element, an element, an attribute, a namespace prefix, an opening tag, and a closing tag. The only surviving content of the fragment is the text “Barney”.]

Figure 1: XML concepts illustrated
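The concepts above can be made concrete with a short sketch. It uses Python's standard xml.etree.ElementTree module rather than the Java tooling of the prototype, and the "zoo" vocabulary, the namespace URI http://example.org/zoo, and the helper make_document are assumptions invented for illustration; they are not taken from the original figure.

```python
import xml.etree.ElementTree as ET


def make_document(prefix: str) -> str:
    """Build the same hypothetical document with a caller-chosen namespace prefix."""
    return (
        f'<{prefix}:zoo xmlns:{prefix}="http://example.org/zoo">'
        f'<{prefix}:animal species="dinosaur">Barney</{prefix}:animal>'
        f'</{prefix}:zoo>'
    )


root = ET.fromstring(make_document("z"))   # the root element
animal = root[0]                           # first (and only) child of the root

# ElementTree expands a prefixed name into "{namespace URI}local-name".
print(root.tag)        # {http://example.org/zoo}zoo
print(animal.attrib)   # {'species': 'dinosaur'}
print(animal.text)     # Barney

# The same document written with a different prefix yields identical names,
# which illustrates that the prefix itself is not significant.
same = ET.fromstring(make_document("park"))
print(root.tag == same.tag)   # True
```

Because the parser resolves prefixes to namespace URIs, application code normally compares expanded names and never the prefixes chosen by the document author.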

2.2.2 XML Schema Languages

XML schema⁴ languages are methods of expressing the structure of XML documents. In the XML schema languages defined by W3C, this is done by specifying the allowed content for each element that can be used in a document, including which elements can be used as the root element. As an XML document is contained within one root element, the possible content of the whole document is thereby defined (W3C 2001a, W3C 2004). An XML schema (instance) document, written in an XML schema language, specifies the subset of all possible XML documents that are valid when validated against that XML schema instance.

⁴ Following the convention set out in many other papers, in this paper “XML Schema” refers to the W3C XML Schema specification, “XML schema” refers to any schema language used with XML, and “schema” refers to the general concept of a schema.

Two general-purpose XML schema languages have been defined by W3C; other XML schema languages also exist. The original W3C-defined XML schema language is Document Type Definition (DTD), which has been a part of the XML specification since the beginning. It does not support e.g. namespaces and custom data types, and has relatively limited power for expressing element structure. W3C XML Schema is the newer specification of the two. It improves on DTD by e.g. adding support for namespaces, custom data types, and inheritance-based data type hierarchies, and has more advanced capabilities for expressing element structure. In addition, W3C XML Schema has an XML-based syntax that is more intuitive, although lengthier, than the DTD syntax. (Lee & Chu 2000)

2.2.3 XPath

XPath (W3C 1999b) is a notation used to select a subset of an XML document, using queries such as “/zoo/animal”, which would select all “animal” child elements of the “zoo” root element. More advanced features include various methods of specifying the selected subset. For example, “/zoo/animal[@species='dinosaur']” would select all animals of the dinosaur variety in the zoo.
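The two queries above can be tried out with a minimal sketch. It uses Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath and evaluates paths relative to a given element, so the queries are written relative to the "zoo" root rather than as absolute paths. The sample document is an assumption made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document matching the queries in the text.
ZOO = """\
<zoo>
  <animal species="dinosaur">Barney</animal>
  <animal species="lion">Leo</animal>
  <animal species="dinosaur">Rex</animal>
</zoo>
"""

root = ET.fromstring(ZOO)   # root is the "zoo" element

# Counterpart of /zoo/animal, written relative to the root element.
all_animals = [a.text for a in root.findall("animal")]

# Counterpart of /zoo/animal[@species='dinosaur'].
dinosaurs = [a.text for a in root.findall("animal[@species='dinosaur']")]

print(all_animals)   # ['Barney', 'Leo', 'Rex']
print(dinosaurs)     # ['Barney', 'Rex']
```

A full XPath 1.0 processor would accept the absolute forms as written in the text; the relative form is only a consequence of the library chosen for this sketch.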

2.2.4 XSLT

XSL Transformations (XSLT) is a language that can be used to express transformations from one XML schema to another. A transformation is presented as a set of rules, and XPath expressions are often employed when specifying transformations. XSLT is a part of Extensible Stylesheet Language (XSL), which is a method of expressing the human-readable presentation, or layout, of an XML document, but XSLT can also be used outside this context as a general-purpose XML-to-XML transformation mechanism. (W3C 1999c)

2.3 Business-to-Business Integration Frameworks

Business-to-business (integration) frameworks are generic solutions that provide mechanisms allowing information systems of organizations to communicate with each other: they standardize the business-to-business protocol. Previously, pre-XML Electronic Data Interchange (EDI) has been the most widely used of these frameworks. Following EDI, XML-based business-to-business integration frameworks have appeared, employing XML technologies and the Internet. These frameworks are also often referred to as e-business frameworks. (Shim et al. 2000)

In this subsection, EDI is first introduced to establish the need for XML-based solutions. After that, some properties and a categorization of the XML-based frameworks are presented, and then three such frameworks are presented as examples: xCBL, RosettaNet, and ebXML are selected as typical representatives. For each framework, an introductory section presents background information on the framework, and the specifications provided by that framework are then reviewed. RosettaNet is addressed in somewhat more detail than the others, as it is the primary focus of this work.

2.3.1 EDI

There are two major EDI standards: ANSI (American National Standards Institute) X12 is used especially in the United States, and ISO (International Organization for Standardization) EDIFACT (Electronic Data Interchange for Administration, Commerce and Transport) is the international standard. In addition to these core standards, industry-specific EDI variations have also been developed. A common example is SWIFT (Society for Worldwide Interbank Financial Telecommunication) for international banking; more localized examples include VDA, SEDAS, and DAKOSY, used in the German automotive, consumer goods, and transportation industries, respectively. (Westarp et al. 1999)

EDI supports predefined business documents and processes; any change to them requires validation by a related standard or EDI guideline committee, which can be considered somewhat inflexible. EDI-based communication was originally performed using specialized private networks; specifications have since been devised to allow EDI-based communication over the Internet. (Medjahed et al. 2003)

EDI-based solutions tend to suffer from varying implementations and from prohibitive implementation cost because of the specialized infrastructure and software required. Therefore, modern XML-based solutions look promising to become the next industry standard for business-to-business integration to complement EDI. (Shim et al. 2000)

2.3.2 XML-Based Business-to-Business Integration Frameworks

While all XML-based business-to-business integration frameworks do employ XML, many other technologies and mechanisms are also involved in addition to the markup language, and the term “XML-based” should probably be understood primarily as a category name distinguishing these frameworks from other types of business-to-business integration frameworks.

Nurmilaakso & Kotinurmi (2004) identify the following three basic issues that an XML-based business-to-business integration framework, an e-business framework, might address:

• Document issues, including a vocabulary of terms that can be used in business documents and the structure of such documents.

• Process issues: what business documents are exchanged, in which order, and what the roles of the parties in the exchange are.

• Messaging issues describe how messages are packaged and transferred, including issues such as reliability and security.

Not all e-business frameworks address all these aspects, however, and some concentrate on a subset of them. There are many e-business framework initiatives. A brief listing of those that are currently somewhat active, adapted from one by Nurmilaakso & Kotinurmi (2004):

• Cross-industry frameworks, such as cXML (commerce XML), OAGIS (Open Applications Group Integration Specification), and xCBL (XML Common Business Library), provide business vocabularies that should be usable across industries. They do not typically define messaging and process very well.

• Industry-specific frameworks, such as RosettaNet, and papiNet for the paper and forest industry, provide industry-specific vocabularies and process definitions, also specifying messaging.

• Process-centric frameworks, such as BPML (Business Process Modeling Language), ebXML (electronic business XML), and XPDL (XML Process Definition Language), focus primarily on business processes. Of these, only ebXML is clearly targeted towards addressing full business-to-business integration, specifying messaging and related issues.

In this section, one framework of each of these groups is selected for closer study.

2.3.3 xCBL

2.3.3.1 Introduction

The history of XML Common Business Library (xCBL) goes back to 1997, when a research project called CBL was started at Veo Systems to identify requirements for e-commerce. In 1999, after Veo Systems was acquired by Commerce One, a new version of CBL was released, called xCBL 2.0, which added EDI interoperability to the specification. xCBL 3.0, released in 2000, broadened the scope of the specification. More recently, OASIS (Organization for the Advancement of Structured Information Standards) has continued this work by using xCBL as a base of their Universal Business Language (UBL) specification. The latest version, xCBL 4.0, replaces DTDs with W3C XML Schema as the schema language used to define business documents, and aligns xCBL more closely with the UBL work. (xCBL 2004)

2.3.3.2 Specifications

xCBL is a library of grouped XML schema documents; the latest version employs the W3C XML Schema syntax. The grouping is based on the intended area of usage: eight groups exist, such as materials management, order management, and preorder management. Each of these groups defines a separate XML namespace. In addition, a “core” namespace is employed by the other namespaces. From the XML elements in each namespace, a set of business messages is defined. For example, the order management group defines messages such as “Order” for a purchase order, or “ChangeOrder” for changing a previously placed order. Mappings from xCBL versions up to 3.5 to EDI have been defined. For some messages, process documents exist that specify possible messaging choreographies that might be useful with these messages. These are just guidelines, not mandated by the xCBL specification. (xCBL 2004)

In terms of the above categorization, xCBL addresses only business document issues.

2.3.4 RosettaNet

2.3.4.1 Introduction

RosettaNet is an industry consortium that attempts to provide standards for electronic business-to-business integration that meet the needs of multiple industries. Development and deployment of RosettaNet specifications is driven by a set of global industry councils. (RosettaNet 2004) After the original 1998 launch with only the information technology council, the electronic components council was formed in 1999, followed by semiconductor manufacturing in 2000, the solution provider council in 2001, and telecommunications in 2003, the latest addition being the logistics council in 2004.


Perhaps because of this decision not to address all possible industries at once, RosettaNet⁵ has actually been implemented in practice, and some cost savings, mainly compared to EDI, have been reported. See, for example, a RosettaNet consortium case study (RosettaNet 2002a), a field study of 12 companies describing such results.

⁵ In this work, the term “RosettaNet” collectively refers to all specifications issued by the RosettaNet consortium.

2.3.4.2 Specifications

Specifications defined by RosettaNet fall into three categories (RosettaNet 2004):

• Dictionaries define common sets of XML elements used in business documents. The RosettaNet business dictionary defines elements related to transactions between trading partners, while the technical dictionary defines elements used for describing products and services. PIP specifications employ elements defined in the dictionaries.

• Partner Interface Processes (PIPs) define business processes as XML-based business documents and messaging choreography. A single PIP specification defines these for a single business process, such as PIP 3A4 for Request Purchase Order. Business documents are specified using an XML DTD schema document and an additional textual description of the XML elements in a message guidelines document. A PIP specifies a set of activities (business document exchanges), each of which consists of sequentially executed actions; currently, activities are limited to 1–2 actions. Each action corresponds to a business document that is sent in the context of that action. A PIP also specifies a set of roles that trading partners play when executing an activity. These concepts are used to define the messaging choreography using standardized tables and UML (Unified Modeling Language) activity and interaction diagrams. The specification does not facilitate fully automatic generation of implementations, as pointed out by Sayal et al. (2001). The hierarchical identification of a PIP (such as 3A4) consists of a cluster number that specifies the process category, a segment letter that specifies the cross-enterprise process, and a number that specifies the individual PIP. (RosettaNet 2002b)

• RosettaNet Implementation Framework (RNIF) defines a connectionless XML-based messaging scheme used to perform the messaging specified by RosettaNet PIPs between trading partners, with features such as reliable messaging, digital signatures, and encryption. Amongst other things, RNIF defines the message structure and header XML documents for business messages. For these, the specification is similar to the PIP business document specification, with DTDs and message guidelines. The latest version of RNIF is version 2.0. (RosettaNet 2002b)

Viewed through the above definition of e-business frameworks, RosettaNet addresses business document issues in dictionaries and PIPs, business process issues in PIPs, and messaging in the implementation framework. A simple example of how PIP execution is performed is presented in Figure 2: two PIPs belonging to the order management cluster (cluster 3) and its quote and order entry segment (segment 3A⁶), namely 3A4 (Request Purchase Order) and 3A7 (Notify of Purchase Order Update), are used as examples.
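The hierarchical PIP identification described in the PIP item above (cluster number, segment letter, individual PIP number) can be decomposed with a few lines of code. The following is a minimal sketch, not part of any RosettaNet specification: the names PipId and parse_pip_id are hypothetical, and the regular expression is an assumption about the identifier syntax inferred from examples such as 3A4 and 3A7.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PipId:
    cluster: int   # process category, e.g. 3 for order management
    segment: str   # cross-enterprise process within the cluster, e.g. "3A"
    number: int    # individual PIP within the segment


# Assumed syntax: digits (cluster), one capital letter (segment), digits (PIP).
_PIP_PATTERN = re.compile(r"^(\d+)([A-Z])(\d+)$")


def parse_pip_id(text: str) -> PipId:
    """Split an identifier such as "3A4" into cluster, segment, and PIP number."""
    match = _PIP_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a PIP identifier: {text!r}")
    cluster, letter, number = match.groups()
    return PipId(int(cluster), f"{cluster}{letter}", int(number))


print(parse_pip_id("3A4"))   # PipId(cluster=3, segment='3A', number=4)
print(parse_pip_id("3A7"))   # PipId(cluster=3, segment='3A', number=7)
```

Both example PIPs resolve to the same cluster and segment, which matches the description of 3A4 and 3A7 as members of the quote and order entry segment of the order management cluster.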

Figure 2: Left, an example RosettaNet business document exchange with the 3A4 PIP as a UML activity diagram. Right, an example exchange with the 3A7 PIP.

In the left diagram of Figure 2, a trading partner that has the “Buyer” role in this instance of PIP 3A4 initiates the PIP instance by sending a purchase order request business message to a partner in the “Seller” role. This business message includes information related to a purchase order, such as what products to purchase, shipping details, etc. The seller receives this message and checks whether the purchase order can be fulfilled, returning a purchase order confirmation message that summarizes the purchase order and has a status indicator specifying whether the purchase order can be fulfilled. The messages are exchanged in asynchronous fashion: in this particular case, the recipient has 24 hours to confirm the purchase order. This time is specified in the PIP specification. This high-level diagram does not show the acknowledgments that occur for every business message that is exchanged, or possible retries.

⁶ Segment 3A is favored among writers of introductory RosettaNet examples, as these PIPs are simple enough to grasp and still relatively typical examples. This work keeps with the tradition.

This two-step activity is actually one of the longest business processes defined by RosettaNet: currently, activities consist either of two business messages exchanged in request-reply fashion, as in PIP 3A4 (a two-action activity), or of just a single business message (a single-action activity). The right diagram of Figure 2, PIP 3A7, gives an example of a single-action activity: only one business message is sent.

PIP specifications may list PIPs that typically precede or follow the current PIP. For example, the request purchase order PIP 3A4 might be followed by a change (PIP 3A8) or a cancellation (PIP 3A9) of a purchase order. These are only suggestions and are not dictated by the specification. Therefore, longer business processes can be defined by creating larger-scale choreographies consisting of sequentially executed PIP activities, but how exactly this is done is not standardized.
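The difference between two-action and single-action activities, and the time limit for the response, can be sketched as a small data model. This is an illustrative assumption only: the Activity class, the document labels, and the response_is_late helper are hypothetical, and acknowledgments and retries are left out just as they are in Figure 2. The 24-hour limit for PIP 3A4 is the value stated in the text above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple


@dataclass(frozen=True)
class Activity:
    pip: str
    actions: Tuple[str, ...]               # business documents, in sending order
    time_to_perform: Optional[timedelta]   # response limit; None if no response

    @property
    def is_two_action(self) -> bool:
        return len(self.actions) == 2


# Document labels are illustrative, not quoted from the PIP specifications.
PIP_3A4 = Activity(
    "3A4",
    ("Purchase Order Request", "Purchase Order Confirmation"),
    timedelta(hours=24),
)
PIP_3A7 = Activity("3A7", ("Purchase Order Update Notification",), None)


def response_is_late(activity: Activity, request_sent: datetime, now: datetime) -> bool:
    """Return True if a two-action activity has exceeded its response limit."""
    if not activity.is_two_action:
        return False   # a single-action activity expects no business response
    return now - request_sent > activity.time_to_perform


sent = datetime(2004, 5, 1, 12, 0)
print(response_is_late(PIP_3A4, sent, sent + timedelta(hours=25)))   # True
print(response_is_late(PIP_3A7, sent, sent + timedelta(hours=25)))   # False
```

Because the exchange is asynchronous, the initiating party has to keep this kind of state per PIP instance between sending the request and receiving the confirmation.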

2.3.5 ebXML

2.3.5.1 Introduction

The ebXML initiative was launched by OASIS and UN/CEFACT (United Nations Centre for Trade Facilitation and Electronic Business) in 1999 with a 1.5-year mission to create global e-business standards usable by all organizations. The ebXML specifications are still being developed. In March 2004, some of the core ebXML specifications were approved as ISO standards. ebXML specifications have been implemented in practice to some degree, and case studies of successful deployments are available. (ebXML 2004)

2.3.5.2 Specifications

The following specifications have been defined (ebXML 2001):

• Business Process Specification Schema (BPSS) is a schema used to specify a business process as messaging choreography between trading partners. This schema is available as DTD and W3C XML Schema. BPSS is a subset of UN/CEFACT Modeling Methodology (UMM).

• Collaboration Protocol Profile (CPP) describes the technical capabilities and supported business collaboration types of an organization. A schema for CPP is available as DTD and W3C XML Schema. A Collaboration Protocol Agreement (CPA) is an agreement on the business collaborations and technical capabilities that two organizations have agreed to use in mutual communication; these must be supported by the CPPs of both organizations. Similar to CPP, a schema for CPA is also available as DTD and W3C XML Schema. CPPs can refer to business processes specified as BPSS.

• Registry Service (ebRS) is an entity that can be used to store XML documents that are required to initiate communication with another organization, such as CPPs. The registry can be searched for this information.

• Registry Information Model (ebRIM) specifies the information model employed in the registry service.

• Messaging Service (ebMS) specifies messaging-related functionality. The specification defines features such as reliable messaging, digital signatures, and encryption.

As can be seen, ebXML is a bit more complex than RosettaNet. Neither the ebXML registry concept nor CPPs/CPAs have an equivalent in RosettaNet.
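The requirement that a CPA be supported by the CPPs of both organizations can be read as a set intersection: whatever two parties agree on has to lie within what both profiles offer. The sketch below illustrates only that reading. Real CPPs and CPAs are XML documents with far more detail, and the Cpp class, the agreeable function, and the sample capabilities are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Cpp:
    party: str
    processes: FrozenSet[str]    # supported business collaboration types
    transports: FrozenSet[str]   # technical capabilities, reduced to transport names


def agreeable(a: Cpp, b: Cpp) -> Dict[str, FrozenSet[str]]:
    """Return the upper bound of what a CPA between a and b could contain."""
    return {
        "processes": a.processes & b.processes,
        "transports": a.transports & b.transports,
    }


buyer = Cpp("Buyer Oy", frozenset({"order", "invoice"}), frozenset({"HTTP", "SMTP"}))
seller = Cpp("Seller Ab", frozenset({"order", "shipping"}), frozenset({"HTTP"}))

candidate = agreeable(buyer, seller)
print(sorted(candidate["processes"]))    # ['order']
print(sorted(candidate["transports"]))   # ['HTTP']
```

An actual agreement may well be narrower than this intersection, since the two organizations still choose which of the mutually supported options to use.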

2.3.6 Comparison

In Table 1 below, the e-business frameworks that have been reviewed are compared in relation to the issues listed previously.


Table 1: xCBL, RosettaNet, and ebXML compared

| | xCBL | RosettaNet | ebXML |
|---|---|---|---|
| Business document issues | W3C XML Schemas define business documents | PIP specifications define business documents: DTDs accompanied with message guideline documents | Guidelines for modeling business documents exist, but no globally standardized documents |
| Business process issues | Guidelines for possible processes | 1–2 step processes defined informally in PIPs | Processes can be defined formally with BPSS |
| Messaging issues | None | RNIF: reliable and secure messaging framework over HTTP | ebMS: messaging over SOAP, reliable and secure messaging as extensions |

Summarizing, xCBL only addresses business documents. RosettaNet addresses all areas by specifying enough to fulfill its goals. ebXML attempts to maximize general applicability.

2.4 Messaging in an XML-based Business-to-Business Framework

Although there are many e-business frameworks, fewer messaging schemes exist. This might be because messaging is considered an easier and better-understood problem, as e.g. computer network communications protocols have been around for a while. Web services, RNIF, and ebMS are listed as messaging schemes utilized by e-business frameworks by Nurmilaakso & Kotinurmi (2004). Next, each of these schemes is briefly reviewed.

2.4.1 Web Services

2.4.1.1 Introduction

Web Services (WS) are a bit different from RNIF or ebMS, as they are not directly related to any e-business framework, but instead are a mechanism for performing general-purpose XML-based messaging and Remote Procedure Call (RPC) operations over the Internet. As a general-purpose messaging technology, web services can also be used in an e-business framework context.

2.4.1.2 Core Technologies

Web Services consist of the following three core technologies, in addition to XML (Newcomer 2002, p.16):

•

WSDL (Web Services Description Language) is an XML-based language that is used to describe web services in terms of the operations they expose and the input/output parameters those operations take. Parameters are specified as messages, which in turn can be specified using schema languages such as W3C XML Schema. (W3C 2001b) The current version of WSDL is 1.1, with version 2.0 in development.

•

SOAP (see footnote 7) is a messaging protocol used to communicate with a web service. SOAP specifies an XML-based message syntax and bindings to Internet-based transport protocols (see footnote 8), most notably HTTP. Also specified are guidelines on how SOAP can be made to perform generic RPC operations. A SOAP message is an XML document. The SOAP version 1.1 specification (W3C 2000a) did not define how to transfer attachments, that is, arbitrary binary content transferred with SOAP messages in the same way as attachments in email. Currently there are two alternative ways to do this: the WS-Attachments (IBM 2002b) specification by Microsoft and IBM, and the SOAP with Attachments (W3C 2000b) specification by Hewlett-Packard and Microsoft. Both define a wrapper around SOAP, suitable for transferring binary attachments in addition to the XML-based message. The latest SOAP version is 1.2.

•

UDDI (Universal Description, Discovery, and Integration) specifies a registry that can be searched for web service descriptors.

Footnote 7: SOAP used to stand for “Simple Object Access Protocol”, but this interpretation was dropped in SOAP 1.2; nowadays it is just “SOAP”. An XML-based messaging/RPC mechanism, SOAP has relatively little to do with objects.

Footnote 8: Transport protocols, in this work, do not refer to network transport-layer protocols such as TCP (Transmission Control Protocol), but instead refer to any protocol that can be used to transport business messages, e.g. HTTP(S) and SMTP.

2.4.1.3 SOAP and WSDL

The SOAP WSDL binding, the part of the WSDL specification that maps WSDL concepts to SOAP structures, has two independent variables. Combining their values gives four different communication patterns, which transmit the same information but with different operational semantics and data representation. Only three of these patterns are usable, as discussed below. The two variables are:

•

“Style”, which can be either “RPC” or “document”, specifies whether Remote Procedure Call semantics are in place, or whether XML documents are transferred as-is without RPC semantics.

•

“Use”, which can be either “encoded” or “literal”. Encoded implies that SOAP encoding is used to encode the parameters of RPC calls. This does not make sense with the “document” style. Literal implies that the WSDL-defined XML data types are transferred as-is. SOAP encoding, a standardized conversion of structures and primitive data types to XML, is most useful when encapsulating an API originally designed for programming-language-style method calls.
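The combination of the two variables can be illustrated with a minimal Java sketch. The class and method names below are hypothetical and not part of WSDL or of any web service toolkit; the sketch only enumerates the four style/use pairs and drops the one that the text describes as not making sense (document style with encoded use).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper that enumerates the WSDL SOAP binding "style" and "use" pairs. */
final class SoapBindingPatterns {
    enum Style { RPC, DOCUMENT }
    enum Use { ENCODED, LITERAL }

    /** SOAP encoding targets RPC parameters, so document/encoded is treated as unusable. */
    static boolean isUsable(Style style, Use use) {
        return !(style == Style.DOCUMENT && use == Use.ENCODED);
    }

    /** Returns the usable pairs in "style/use" notation. */
    static List<String> usablePatterns() {
        List<String> result = new ArrayList<>();
        for (Style style : Style.values()) {
            for (Use use : Use.values()) {
                if (isUsable(style, use)) {
                    result.add(style.name().toLowerCase(Locale.ROOT)
                            + "/" + use.name().toLowerCase(Locale.ROOT));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(usablePatterns());
    }
}
```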

To sum up, the three possible communication patterns are RPC/encoded, RPC/literal, and document/literal.

2.4.1.4 Auxiliary Technologies

In addition to the core technologies, auxiliary technologies provide additional functionality. Some of these are introduced here. This list is not exhaustive: various other technologies exist, and some of them overlap with the ones introduced.

•

Web Services Security (WS-Security) enables secure message exchange using SOAP. The specification defines a SOAP header that specifies security parameters that are used in a message. The header may contain a digital signature for some XML elements in the SOAP message body. The message body can also contain encrypted blocks. In addition, attachments can be encrypted. Traditional public key cryptography (Rivest et al. 1978) security model is used. (IBM 2002a)

•

Security Assertions Markup Language (SAML) enables exchanging authentication and authorization information. It can be used with several messaging schemes, including SOAP. SAML provides means of exchanging security assertions, which are facts that are known about a subject, or authorization a subject holds. These security assertions are cryptographically verified by a party all members of the exchange trust. (OASIS 2004)

•

Web Services Reliability (WS-Reliability) provides reliable messaging on top of SOAP. This is accomplished by defining extension headers for transferring message identity and acknowledgments that, together with appropriate policies, facilitate reliable delivery and guaranteed message ordering. (Oracle 2003)

•

Business Process Execution Language for Web Services (WS-BPEL) allows formally defining business processes that consist of multiple sequentially executed web service interactions. These sequences may contain conditional logic based on business document content, and handlers for exceptional conditions. Participants, two or more, in a process have distinct roles, each of which has its own process. (IBM 2003a)

These technologies are still evolving, and they are not standardized or accepted by any single central body; various standardization organizations and industry consortiums are free to develop specifications, which might then become de-facto standards if they gain enough popularity in implementations. For example, a global security infrastructure for XML has not yet been achieved: a number of different and related proposals exist, many of which have not been fully adopted (Damiani et al. 2002). Some current WS implementations support just WSDL and SOAP with both flavors of attachments, employing HTTPS, the secure variant of HTTP, for transport layer security.

To provide a more exact definition for the relatively loose term “web services”, the Web Services Interoperability (WS-I) organization has defined a set of “profiles” that specify categories of web service support. These profiles define what technologies should be supported and iron out differences in implementations by specifying additional guidelines and prohibiting some structures that would be allowed by relevant standards. (WS-I 2003)

2.4.1.5 Applications

The WS standards only specify relatively low-level operations. No message content or process is specified; instead, the creator of a web service needs to fully specify how messaging is used. WS can be thought of as a generic scheme for achieving XML-based RPC and message exchange in a platform-independent manner, suitable for inter-organizational use with firewalls and high-latency connections. Because of this generality, web services are not very useful by themselves. Additional agreements and guidelines are necessary for any implementation scenario. In addition to business-to-business frameworks, web services have a wide range of possible uses in integration scenarios in which no global agreement on the business documents exchanged is needed, but only a mechanism for getting a known subset of systems to communicate with some case-specific semantics.

Perhaps due to this generality, the WS specifications have gained some popularity. Web services have been an integral part of Microsoft .NET architecture since its inception (Meyer 2001), and the recent Java 2 platform Enterprise Edition (J2EE) version 1.4 (Sun Microsystems 2003a) includes standardized interfaces for Java-based web services. The Java API for XML-based RPC (JAX-RPC) specification defines a subset of WSDL structures that can be used to create a meaningful Java RPC interface; within these limits, one can automatically generate a WSDL description from a Java interface, and vice versa (Sun Microsystems 2002a). Common general-purpose enterprise application development environments, such as the IBM WebSphere product, have already supported web services for some time (IBM 2004).

2.4.2 RosettaNet Implementation Framework

2.4.2.1 Introduction

The RosettaNet Implementation Framework (RNIF) is a messaging scheme designed for exchanging RosettaNet business documents between trading partners. In addition to RosettaNet, RNIF can also be applied to transferring OAGIS business documents (OAG 2001). The latest version of RNIF is 2.0 (RosettaNet 2002b). RNIF 2.0 implementations need not be compatible with previous versions of RNIF. RNIF specifies:

•

XML-based header documents that must accompany all business messages in addition to the PIP-specific business document, which define things such as sender and receiver of the message, business process type, process instance identity, message identity, etc.

•

How a business message is packaged. Packaging is done by including headers, a business document, and optional binary attachments into a multipart MIME (IETF 1996) container. MIME (Multipurpose Internet Mail Extensions) is the encoding that is used in most Internet emails today. It was originally developed as a method for transferring arbitrary binary entities in email message bodies, allowing things such as email attachments, but has since proven to be useful in other contexts that require textual data with binary attachments to be transferred.

•

How a packaged message is encrypted and/or digitally signed. The Secure variant of MIME, SMIME (IETF 1998a), is used — in practice, this means employing regular public-key cryptography with binary encoding of content, as specified by SMIME. See the paper by Rivest et al. (1978) for an introduction to public-key cryptosystems.


•

How a packaged message is transported over transport protocols such as HTTP and SMTP. Two transportation variations exist: asynchronous, and synchronous. An asynchronous message does not require immediate reply, while synchronous messages do.

•

Signal messages: RNIF signals are positive or negative acknowledgments that are used to achieve reliable messaging on top of an unreliable transport medium. There is one positive acknowledgment, the receipt acknowledgment message, and one negative acknowledgment, the exception message.

•

How to handle the messaging choreography, including retry and acknowledgment logic: usually each business message must be acknowledged, and if not acknowledged within a certain time, retries must be attempted. If the remote fails to respond within specified timeout intervals or responds with invalid content, either an instance of the 0A1, Notification of Failure, PIP must be initiated, or an exception message must be sent, depending on the phase of the process in which the protocol violation happened.
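The packaging step in the list above can be sketched in plain Java. This is a simplified illustration only: the class name, the boundary value, and the part contents are hypothetical, the exact headers and part order required by RNIF are not reproduced, and signing and encryption are left out.

```java
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

/** Hypothetical builder for a multipart MIME container holding XML parts and attachments. */
final class MimePackagingSketch {
    private static final String CRLF = "\r\n";
    private final String boundary;
    private final List<String> parts = new ArrayList<>();

    MimePackagingSketch(String boundary) {
        this.boundary = boundary;
    }

    /** Adds a textual part, e.g. an XML header document or the business document. */
    MimePackagingSketch addTextPart(String contentType, String body) {
        parts.add("Content-Type: " + contentType + CRLF + CRLF + body + CRLF);
        return this;
    }

    /** Adds a binary attachment, base64-encoded so that it survives text transports. */
    MimePackagingSketch addBinaryPart(String contentType, byte[] content) {
        parts.add("Content-Type: " + contentType + CRLF
                + "Content-Transfer-Encoding: base64" + CRLF + CRLF
                + Base64.getEncoder().encodeToString(content) + CRLF);
        return this;
    }

    /** Joins the parts with boundary lines and appends the closing boundary. */
    String build() {
        StringBuilder out = new StringBuilder();
        out.append("Content-Type: multipart/related; boundary=\"")
           .append(boundary).append("\"").append(CRLF).append(CRLF);
        for (String part : parts) {
            out.append("--").append(boundary).append(CRLF).append(part);
        }
        out.append("--").append(boundary).append("--").append(CRLF);
        return out.toString();
    }
}
```

A caller would add the header documents first, then the business document, then any attachments, and hand the resulting text to the transport.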

2.4.2.2 Message Flow

The RNIF specification (RosettaNet 2002b, section 2.6) divides PIP activities into three classes:

•

Asynchronous single-action activity consists of sending a business message and optionally waiting for an acknowledgment

•

Asynchronous two-action activity consists of sending a business message and receiving a response message, both of which are optionally acknowledged

•

Synchronous single/two-action activity consists of sending a business message and optionally receiving a response, an acknowledgment or a business message, in synchronous manner

In the synchronous cases, all communication is done within one transport-level connection to the remote party. For asynchronous cases, the operation is more complex. Summarizing from the RNIF specification (RosettaNet 2002b, section 2.6), sending an asynchronous single-action message consists of the following steps:

1. Send the message.

2. Wait for acknowledgment of receipt. If the acknowledgment arrives within the time specified for this message, the send operation is successfully completed.

3. Otherwise, the process is retried the number of times specified for this message. When all retries are done and there has not been an acknowledgment, sending the message is deemed to have failed.

Acknowledgments are optional for an RNIF message, depending on the PIP, but are usually used. If no acknowledgment is specified, sending of a message consists of just step 1 in the preceding list. The recipient of an asynchronous single-action message only needs to acknowledge it. Instead of an acknowledgment, the recipient might respond by sending an exception, in which case the result of the operation is immediately deemed a failure. Exceptions are used in cases where the sender is, or should be, still executing the PIP; in other cases an instance of the Notification of Failure PIP 0A1 is initiated. The recipient might choose not to respond at all, which would cause the sender to eventually time out, and the result of the operation would be a failure.

Somewhat similar to the single-action case, sending an asynchronous two-action message consists of the following steps:

1. Send the message.

2. Wait for acknowledgment of receipt. If the acknowledgment arrives within the time specified for this message, the send operation is successfully completed.

3. Otherwise, the process is retried the number of times specified for this message. When all retries are done and there has not been an acknowledgment, sending the message is deemed to have failed.

4. Wait for the return message. Acknowledge it.

Step 4 in the above list might actually happen before step 2 or 3, as the transport protocol is allowed to reorder messages. The transaction is not successfully completed until both the acknowledgment and the response have been received. The receiver’s side in a two-action PIP is similar to the sender’s side in a single-action PIP.
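The sender side of the asynchronous single-action case can be sketched as a small retry loop. This is a minimal sketch under stated assumptions, not the RNIF-specified algorithm: the `Transport` interface, the `Reply` values, and the parameter names are hypothetical, and the timeout is assumed to be handled inside the transport call.

```java
/** Hypothetical sender-side logic for an asynchronous single-action message. */
final class AsyncSingleActionSender {
    /** What arrived within the time limit; NONE means the wait timed out. */
    enum Reply { ACKNOWLEDGMENT, EXCEPTION, NONE }

    enum Outcome { SUCCEEDED, FAILED }

    /** Hypothetical transport: sends once and waits up to the given time for a signal. */
    interface Transport {
        Reply sendAndAwait(String message, long timeoutMillis);
    }

    static Outcome send(Transport transport, String message, long timeoutMillis, int retries) {
        // One initial attempt plus the number of retries specified for this message.
        for (int attempt = 0; attempt <= retries; attempt++) {
            Reply reply = transport.sendAndAwait(message, timeoutMillis);
            if (reply == Reply.ACKNOWLEDGMENT) {
                return Outcome.SUCCEEDED;
            }
            if (reply == Reply.EXCEPTION) {
                // An exception signal fails the operation immediately, without retries.
                return Outcome.FAILED;
            }
            // Reply.NONE: no signal arrived in time, so fall through and retry.
        }
        return Outcome.FAILED;
    }
}
```

The two-action case would add a further wait for the response message, which may arrive before or after the acknowledgment.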

2.4.3 ebMS

OASIS ebXML Messaging Service (ebMS) version 2.0 (ebXML 2002) is designed to support largely the same types of use scenarios as RNIF. The two specifications are so alike that there has been discussion of convergence: whether RosettaNet should start supporting ebMS in addition to RNIF. According to a mid-2002 position paper, the RosettaNet consortium “plans to continue participate in the ebMS initiative to evaluate potential standards convergence opportunities over long term”. ebMS specifies:

•

Format of business messages that are sent, including XML-based control information that specifies message receiver and sender identification, message identification, conversation identification, etc.

•

Packaging a business message to a MIME container. Control information is transported in SOAP header and body parts, while the business document is transported as a SOAP attachment.

•

Digital signatures and encryption of business message parts can be done using SMIME, like in RNIF, but the specification does not mandate this

•

Unlike RNIF, which is built on top of transport protocols such as HTTP and SMTP, ebMS utilizes a separate messaging protocol, SOAP version 1.1 with attachments

•

A reliable messaging scheme, with acknowledgments, retries, and duplicate elimination, which closely resembles that specified by RNIF
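Duplicate elimination on the receiving side pairs naturally with sender retries: a retried message must be acknowledged again but processed only once. The following Java sketch shows the idea with hypothetical names; it is not taken from the ebMS or RNIF specifications, and a real implementation would presumably persist the identifiers and expire them after some time.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical receiver that acknowledges every copy but delivers each message once. */
final class DuplicateEliminatingReceiver {
    private final Set<String> seenMessageIds = new HashSet<>();
    private final List<String> delivered = new ArrayList<>();

    /** Returns an acknowledgment token; passes the payload on only for unseen identifiers. */
    synchronized String receive(String messageId, String payload) {
        if (seenMessageIds.add(messageId)) {
            delivered.add(payload);
        }
        return "ack:" + messageId;
    }

    /** Returns a copy of the payloads handed to the application so far. */
    synchronized List<String> delivered() {
        return new ArrayList<>(delivered);
    }
}
```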

Perhaps the largest difference from the RNIF specification is that ebMS does not specify how to map any particular type of business document into the specified messaging scheme, whereas RNIF is specific to the purpose of transporting PIP business documents. Similarly, ebMS does not address issues related to messaging choreography.

2.4.4 Comparison

The three messaging technologies that were introduced, web services, RNIF, and ebMS, are briefly compared in Table 2 below.


Table 2: RNIF, ebMS, and web services compared

| | Web Services | RNIF | ebMS |
|---|---|---|---|
| Transport protocol | SOAP (HTTP, SMTP) | HTTP, SMTP | SOAP (HTTP, SMTP) |
| Packaging | SOAP (MIME) | MIME | SOAP (MIME) |
| Reliable messaging | WS-Reliability | custom | custom |
| Authentication | WS-Security | SMIME | XML Signature |
| Confidentiality | WS-Security | SMIME | not defined |
| Message choreography | not defined | defined for PIPs | not defined |

2.5 About J2EE

This subsection introduces Java 2 platform Enterprise Edition (J2EE), the primary implementation technology chosen for this work. First, a brief overview of the Java programming language is presented. Then, J2EE is introduced, and its core functionality, application components, J2EE implementations, and software development and modeling in J2EE are described in more detail. Last, implementing web services with J2EE is briefly described, as combining J2EE with web services is an important implementation aspect of this work.

2.5.1 The Java Programming Language

The Java programming language is an object-oriented language with basic syntax similar to the C language; the main differences from C include added support for object-oriented constructs and the removal of pointer arithmetic. Java is an interpreted language with garbage-collected memory management. The former means that Java source code is compiled into hardware-independent byte code, which is then executed in a Java Virtual Machine (JVM), so the same Java binary will run on any platform that has a JVM. The JVM does stricter correctness verification than typical traditional compilers; for example, it is not possible to write into an illegal memory address. An attempt to perform an illegal operation results in a runtime exception being thrown, which shows a trace of the call stack at the moment of failure, even including source code line numbers if the program is compiled with debug information. The latter means that Java programs do not explicitly de-allocate memory. Rather, they just stop referencing objects that are no longer needed. The objects are then automatically recycled by an entity known as the garbage collector. (Sun Microsystems 2000)

These features have two implications. First, under garbage-collected memory management programs are not very vulnerable to two classes of common programming errors: random memory corruption caused by writing into illegal memory addresses, and memory leaks caused by never deallocating memory (Wilson 1992). Lack of pointer arithmetic and strict validation of memory accesses help increase this reliability. Second, garbage-collected memory management leads to a somewhat non-deterministic level of performance, due to the required garbage-collection pauses (Wilson 1992). In addition, because of the interpreted nature of the byte code, Java programs remain slower to execute than traditional compiled programs until dynamic compilation techniques reach the same level of efficiency as traditional compilers.

These properties, among other reasons such as its platform-independent nature, make Java especially suitable for those enterprise applications for which reliability is absolutely critical and gaining maximum performance less so. These applications typically run on high-powered server hardware in a few locations; upgrading to faster hardware is often relatively inexpensive compared to software development work in these scenarios.
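The runtime checking described above can be demonstrated with a few lines of Java. The class name is hypothetical; the sketch attempts to write one element past the end of an array and shows that the JVM raises an exception instead of silently overwriting adjacent memory.

```java
/** Hypothetical demonstration of JVM bounds checking on array writes. */
final class MemorySafetyDemo {
    /** Attempts an out-of-bounds write and reports how the runtime reacted. */
    static String writePastEnd() {
        int[] buffer = new int[4];
        int index = buffer.length; // one element past the last valid index
        try {
            buffer[index] = 42;
            return "no error";
        } catch (ArrayIndexOutOfBoundsException e) {
            // The stack trace carried by the exception identifies the failing call.
            return "rejected";
        }
    }

    public static void main(String[] args) {
        System.out.println(writePastEnd());
    }
}
```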

2.5.2 J2EE Overview

J2EE is a top-level definition for a set of independent but closely interrelated APIs provided on top of Java 2 platform Standard Edition (J2SE). J2SE is another set of APIs created on top of the Java programming language. APIs needed by most applications are provided by J2SE, whereas APIs needed only by “enterprise-level” distributed applications belong to J2EE. Therefore, some J2EE features are actually produced by the underlying J2SE subsystem or its extension modules, but when discussing only J2EE this distinction need not be made. Originally created by Sun Microsystems, these platforms are currently being coordinated by an industry consortium known as the Java Community Process (JCP) program, with representatives from major software industry organizations. J2EE is versioned independently of J2SE. The newest published version of both J2SE and J2EE specifications is 1.4 (see footnote 9), but most implementations are for J2SE 1.4 and J2EE 1.3. (Sun Microsystems 2004)

Footnote 9: The name “Java 2 platform” refers to versions 1.2 and above of these specifications, which may be a bit confusing.

2.5.3 J2EE Core APIs

J2EE is a very broad standard, offering many independent and interrelated APIs for the application developer. Briefly, some important features offered by J2EE are (Sun Microsystems 2001a):

•

Enterprise JavaBean (EJB) container is perhaps the most widely known aspect of J2EE. JavaBeans are Java classes implementing a particular calling convention. EJBs are, in addition to obeying the JavaBean convention, distributable software components that run in a container and use facilities such as container-managed persistence and transactions. EJBs usually provide business logic for an application. They expose a Java RPC interface for other components. There are three main kinds of EJBs. An entity bean abstracts a persistent “entity”; typically, such an entity corresponds to a row in a relational database, but it might also correspond to e.g. a particular connection to an external information system. Session beans are just generic remotely accessible objects that may or may not maintain arbitrary conversational state with a client application accessing them. They are not persistent, but created for a client session, hence the name. Message-driven beans do not expose a Java interface, but instead are connected to a JMS queue and are activated by messages entering that queue; this facilitates asynchronous processing.

•

Servlet container provides facilities required to run Java servlets, and JavaServer Pages (JSPs), used for creating web-based interfaces. Servlets are Java classes that produce responses for incoming HTTP requests. JSPs are another way of writing servlets, by writing a markup document that includes portions of code, instead of the other way round.

•

Java Message Service (JMS) allows distributed synchronous/asynchronous, publisher/subscriber or point-to-point messaging. It facilitates asynchronous communication between J2EE applications or application components. (Sun Microsystems 2001b)
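The queue-activated style of processing that message-driven beans and JMS provide can be approximated in plain Java. The sketch below is only an analogy: it uses a `java.util.concurrent` queue and a worker thread instead of the EJB or JMS APIs, and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical worker that is activated by messages entering an in-memory queue. */
final class QueueDrivenWorker {
    private static final String STOP = "\u0000STOP"; // marker that ends the worker loop
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> processed = new ArrayList<>();
    private final Thread worker = new Thread(this::drain);

    void start() {
        worker.start();
    }

    /** The producer returns immediately; processing happens later on the worker thread. */
    void submit(String message) {
        queue.add(message);
    }

    /** Asks the worker to finish, waits for it, and returns what it handled. */
    List<String> stopAndGetProcessed() {
        queue.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed;
    }

    private void drain() {
        try {
            while (true) {
                String message = queue.take(); // blocks until a message arrives
                if (message.equals(STOP)) {
                    return;
                }
                processed.add("handled:" + message);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```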

2.5.4 Application Components

J2EE can be viewed as a component-based software architecture, as J2EE applications are typically constructed of self-contained software components. It is also a distributed architecture, as these components may reside on separate computers. (Emmerich 2002)

In J2EE terms, components, such as servlets and EJBs, are packaged into J2EE modules, which are independently deployable binary archive files with a certain layout. A J2EE module therefore consists of one or more components. Two basic types of J2EE modules are a web application, containing web documents, servlets, and JSP pages, and an EJB archive, containing a set of EJBs. (Sun Microsystems 2001a)

2.5.5 Implementations

Full implementations of J2EE are often called application servers by vendors. All J2EE-compliant application servers provide an implementation for the minimal set of interfaces required (Sun Microsystems 2001a). Vendors might add services to provide additional value; these are either standardized Java APIs that are not required by the J2EE specification, or proprietary APIs. In theory, an application written to comply with the J2EE specifications will run on any application server. In practice, minor changes may be required when transferring from one application server to another. The standardization field is huge, and the standards documents, being only thousands of pages long, cannot address every minor detail. Popular commercial application servers include IBM WebSphere Application Server and BEA WebLogic Server. On the open-source side, the JBoss application server is becoming increasingly popular.

2.5.6 Developing with J2EE

Although the J2EE environment does provide many advanced services, it is still built around the Java programming language and is, fundamentally, an object-oriented software development environment. A J2EE application consists of a set of Java classes, possibly belonging to third-party software libraries, and XML-based descriptors providing metadata on how the classes should be used (Sun Microsystems 2001a). Therefore, the development model is quite traditional: edit the source code, compile, deploy on server, run. For development, no specialized tools other than the application server are required; a text editor and a command line Java compiler, possibly combined with some vendor-specific command line tools, are sufficient. However, to increase productivity, tools known as Integrated Development Environments (IDE) are often used. They typically contain custom editor and validation tools for editing various file formats, integrated build systems, compilers, etc., all wrapped in a convenient graphical user interface.

Some IDEs go further than just providing simple convenience tools: to make things easier, supposedly, they provide means to hide the underlying Java code and XML descriptors from the developer. “Wizard” style dialogs are available for creating and editing properties of objects such as EJBs. Still, a developer can usually edit the relevant XML descriptors and Java source code “by hand” to gain the same effect.

J2EE IDEs can be divided into two main categories: some IDEs are general-purpose, supporting multiple application servers, and others are vendor-specific, requiring an application server from the same vendor to be used. Vendor-specific IDEs are able to provide a higher level of integration with the application server in question, and more advanced application-server-specific functionality because of this, but the downside is vendor lock-in for the application server, which might not always be desirable.

2.5.7 Software Design in J2EE

When designing software for object-oriented programming languages such as Java, object-oriented modeling approaches have become popular as a methodology in the analysis and design phases of the software development process. The Unified Modeling Language (UML) has become a de-facto standard as the language to use in object-oriented modeling. (Engels & Groenewegen 2000) See the book by Booch et al. (1999) for an introduction to UML.

Very briefly, UML defines nine diagram types. Class diagrams are used to describe the static structure of software: classes and their relationships. Object diagrams are somewhat like class diagrams, but describe relationships of instances of classes, or objects. A use case diagram describes a set of use cases and the actors involved in them. Sequence and collaboration diagrams present an interaction between objects by specifying the messages that are passed between them in this interaction. These two have the same expressive power, but are alternative representations. A statechart diagram can be used to describe a state machine. An activity diagram describes the flow of activity in a system; this type of diagram is also often known as a flowchart. A component diagram describes relationships between a system’s components. In UML terminology, a component is a “physical”, binary portion of the software, e.g. a file, as opposed to a conceptual entity. Finally, a deployment diagram describes how components are distributed into nodes. UML nodes are entities with some computing capability, typically computers.

Sophisticated Computer Aided Software Engineering (CASE) tools allow generating code from a UML model of software, and vice versa. The process in which code is generated from a UML model and changes to the code are subsequently reverse-engineered back to the UML model is called round-trip engineering. (Medvidovic et al. 1999)

When moving from object-oriented design to modeling distributed component-based software architectures, such as J2EE applications, things get a little more complicated. Component architecture platforms such as J2EE and Microsoft .NET differ from each other in many aspects, and defining a generic component can become difficult. The Model Driven Architecture (MDA) specified by the Object Management Group (OMG) addresses this problem by specifying a platform independent model, from which a platform dependent model can then be derived. The platform dependent model is specified using UML with standardized UML profiles, which are approaches to applying the UML extension mechanisms, defined for different component architectures. (Emmerich 2002) Still, some experts feel that UML version 1 is fundamentally unsuitable for properly modeling component-based systems such as J2EE. The upcoming version 2.0 of the UML language specification will add features that should aid in modeling these systems. (Björkander & Korbyn 2003)

2.5.8 Web Services in J2EE

In the J2EE environment, web services can be developed using the following APIs. These APIs are not part of the J2EE 1.3 specification, but are obtainable as add-ons or readily provided by some vendors, such as by IBM in their WebSphere Application Server product (IBM 2004).

•

SOAP with Attachments API for Java (SAAJ): provides low-level primitives for processing SOAP messages (Sun Microsystems 2003b)

•

Java API for XML Messaging (JAXM): provides light message-oriented primitives for XML-based messaging on top of SAAJ (Sun Microsystems 2002b). SAAJ used to be a part of the JAXM specification before being made its own specification.

•

Java API for XML-based RPC (JAX-RPC): provides web service RPC semantics with native Java data type conversions. This enables converting an EJB, or any JavaBean, into a web service by automatically generating a WSDL descriptor for it, or alternatively automatically generating Java classes to access web service specified by an existing WSDL descriptor. (Sun Microsystems 2002a)
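As a rough illustration of the kind of Java interface that JAX-RPC tooling works from, the sketch below declares a service endpoint interface in the JAX-RPC style, where the interface extends `java.rmi.Remote` and each method declares `java.rmi.RemoteException`. The interface and class names are hypothetical, only simple parameter types are used, and no WSDL generation or SOAP runtime is involved here; that part would be done by vendor tools.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

/** Hypothetical service endpoint interface written in the JAX-RPC style. */
interface QuoteService extends Remote {
    /** Uses only a String, an int, and a double, which map readily to XML types. */
    double quotePrice(String productId, int quantity) throws RemoteException;
}

/** Plain local implementation, shown without any web service runtime. */
final class FixedPriceQuoteService implements QuoteService {
    private final double unitPrice;

    FixedPriceQuoteService(double unitPrice) {
        this.unitPrice = unitPrice;
    }

    @Override
    public double quotePrice(String productId, int quantity) {
        return unitPrice * quantity;
    }
}
```

Keeping parameters to simple types sidesteps the mapping problems that arise with constructs lacking a standardized XML representation.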

Many current web service enabled software development tools, such as IDEs, seem to favor implementing the JAX-RPC API. With current tools it is possible to simply enter a URL for the WSDL descriptor into a web service enabled tool, which will then generate all Java code required to access this interface; the web service can thereafter be employed just like the regular Java-based components most developers are familiar with. Still, XML-based messaging is usually also possible using the JAXM and SAAJ APIs directly.

In theory, when creating a web service one simply uses a web service enabled tool to create a WSDL representation from an existing Java component, such as an Enterprise JavaBean. In practice, this is not always quite so easy, as not all Java constructs, such as map-type structures, have a standardized XML representation, and not all constructs are well suited for interoperating with other, non-Java web service implementations.

2.5.9 Alternatives

Creating a full enterprise-class distributed component-based software architecture takes a significant amount of resources, so there is not much competition in this field. Currently, the only notable alternative to J2EE is the Microsoft “.NET” architecture (Meyer 2001).

2.6 Implementing Business-to-Business Integration

There are a few scientific papers describing implementations of business-to-business integration cases or systems from a technical point of view. Most papers tend to focus on a single business or technical issue: the prototype system that was implemented is merely used to demonstrate the validity of the constructed solution to a particular sub-problem, and the implementation is not claimed to be otherwise optimal. In this subsection, relevant issues from these papers are reviewed. The architecture of previously reported integration systems is presented first. After that, tools and methodologies used to implement business-to-business integration and reported experiences on their use are reviewed.

2.6.1 Architecture of Previously Reported Systems

Here, generic business-to-business system architecture models that were found in the literature are inspected first. Then, the approach taken by three previous research projects is analyzed in more detail; these were selected because the architecture used in these projects is documented in sufficient detail. In addition, the system architecture specified by one commercial product, IBM WebSphere Business Integration Connect, is studied, as the product shares some underlying implementation technologies with this work. This is by no means an extensive description of all features available in that product, but merely a brief overview of its possibilities.

2.6.1.1 Business-to-Business System Architecture Models

An important architectural concept is the mediator, or middleware, described by Chan et al. (2002). They describe a mediator-based architecture, applying the general idea proposed by Wiederhold (1992): the use of domain-specific mediator applications as a method for developing a multi-tier architecture. This avoids a problematic situation with two-tier architecture, where too much responsibility is put either on the client applications, making updates difficult, or on the server application, consuming too many server resources. Chan et al. claim that XML is well suited for interfacing with such a mediator. All component-based architecture approaches reviewed here can be viewed as applications of the mediator approach.

Pendyala et al. (2003) present a framework for XML-based message-oriented Enterprise Application Integration (EAI), that is, integration of disparate applications within an organization. The framework is divided into three portions:

1. XML interfaces of enterprise applications. An XML interface might not be provided by the enterprise application itself, but might be a separate component, and might not be physically co-located with the respective enterprise application instance.

2. An enterprise application integration engine that provides core functionality, such as managing the private process between applications and applying necessary transformations.

3. An EAI portal, which is used to configure the engine.

This framework is illustrated in Figure 3.

Figure 3: EAI framework presented by Pendyala et al. (2003) as a UML component diagram. Elements shown: PDM, ERP, and X applications with XML interfaces; EAI engine; EAI portal.

In business-to-business integration, this architecture has to be extended to support a business-to-business protocol. Compared to the previously described model by Pendyala et al., two variants of placing this functionality can be found in the literature (Shim et al. 2002, Medjahed et al. 2003): either the business-to-business protocol is implemented in the EAI component, or separate EAI and business-to-business components are used. The second approach is probably better in scenarios in which an EAI system is already in place and replacing it is not desirable, or employing two independent components is desirable for other reasons. Using notation similar to that of Pendyala et al., these two approaches are illustrated in Figure 4. An important point that is not directly visible in the figure is who controls the private process: in both cases, this is the role of the EAI engine component.

Figure 4: Two approaches to business-to-business integration architecture as UML component diagrams

The first approach is described by Shim et al. (2002). Their prototype system has basic knowledge of possible private processes encoded in a custom XML format: it knows which message it should send next in response to external invocations and is able to invoke enterprise applications to obtain the information required for forming these messages.

As for the second approach, private processes involving several applications are traditionally managed using ERP systems such as SAP, BAAN, and PeopleSoft, workflow systems such as IBM MQSeries, or case-specific integration. These approaches often fail to address the need to synchronize the application protocol with the business-to-business protocol. One method to work around this is to implement process wrappers. Process wrappers are sets of predefined activities that the private process of an organization can use to send and receive messages using a business-to-business protocol; in other words, they represent the business-to-business protocol in a way suitable for use by an organization’s private process (Medjahed et al. 2003).

If there is a one-to-one match between the business-to-business protocol and the application protocol, or only trivial conversions are required, a degenerate case of connecting a business-to-business engine directly to an enterprise application might be possible, without any entity controlling the private process. This approach allows connecting only one application to a business-to-business engine instance, as even merely routing messages to applications based on their content is a primitive form of private process control. This approach is illustrated in Figure 5.

Figure 5: Third approach to business-to-business integration architecture as a UML component diagram


To sum up, three system architecture approaches for business-to-business integration have been identified:

1. Single EAI/B2B component

2. Separate EAI and B2B components

3. Only B2B component

2.6.1.2 System Architecture of Previous Research Projects

Here, three examples of the system architecture of previous research projects are presented. These were chosen because sufficient data was available on these projects and each of them has a somewhat different goal and approach. The NetData project describes RNIF-enabling a single existing application; the work by Nurmilaakso et al. describes a system that is able to construct business-to-business messages from data held in various corporate information systems; and the work by Sayal et al. describes RNIF-enabling an existing EAI component.

As the first example, the architecture of the NetData project (Kotinurmi et al. 2004), conducted at the Helsinki University of Technology Software Business and Engineering Institute, is described. The purpose of this project was to integrate PDM systems using the RosettaNet e-business framework, featuring automated delivery of design document updates between trading partners so that everyone always has the latest versions. Many issues had to be addressed in the project; here, the approach taken to messaging and the related system architecture is inspected. It is described in more detail in a separate report (Laesvuori 2003).

In the project, RosettaNet messaging was implemented using a commercial tool, Microsoft BizTalk Accelerator for RosettaNet version 2.0. Very briefly, the Accelerator includes a “partner agreement wizard” that can be used to set up an RNIF connection, a serializer/deserializer component that validates messages and converts between RNIF and an internal XML format, a messaging component, and an orchestration component that coordinates a PIP process. Integration with enterprise applications is done using facilities provided by Microsoft BizTalk Server 2002, which is used to run the Accelerator.
As a Microsoft product, BizTalk Server 2002 is programmed using the proprietary Microsoft COM (Component Object Model) architecture. In the NetData architecture (see Figure 6 for an overview), a PDM system is fitted with an adapter component developed using J2SE. It is the responsibility of this adapter to transform messages originating from the PDM system into RosettaNet PIP business documents. The adapter delivers these business documents to a custom NetData system, called the RosettaNet (RN) adapter, using a SOAP web service interface; the actual product data is transferred over HTTP. The RN adapter is developed using the Microsoft .NET architecture (Meyer 2001) and resides in a single assembly, the primitive unit of deployment in .NET. The adapter applies further transformations to the data and saves it to disk and to a relational database, from which it is subsequently picked up by another custom portion, an Application Integration Component (AIC) running inside the BizTalk server, converted into an RNIF business message, and sent. The path in the other direction is similar, but reversed.

Figure 6: Architecture of prototype system implemented in NetData project as a UML deployment diagram. Elements shown: Trading Partner A, Trading Partner B, PDM System, PDM adapter (SOAP, HTTP), .NET RN out adapter and RN in adapter, Microsoft BizTalk Server with Initiator AIC and Receiver AIC (RNIF), Hard disk, Microsoft SQL Server.

Nurmilaakso et al. (2002) examined complementing existing EDI-based business-to-business integration with an XML-based one in the context of one company. They created a system that could perform various types of transactions related to supply chain integration, extracting or updating information in corporate information systems. They used xCBL for business document representation, and as xCBL only defines the structure of the business documents, the required messaging had to be performed in a custom manner, by sending these documents over HTTP or SMTP. The prototype system, called the Communication Application, was developed using J2EE. Only a single web application module was used, without attempting to build a distributable J2EE component model of the application.

See Figure 7 for an overview of the system. The Communication Application itself is a monolithic J2EE web application module, run in the Jakarta Tomcat servlet container. Inside the organization, the application communicates primarily with corporate databases. The Communication Application receives interaction requests from clients over HTTP, analyzes them, and based on this analysis launches configurable processing, which is a series of actions including database lookups or updates, communication with external applications, or transformations. For example, if a client wishes to obtain a list of purchase orders, it sends a request for this over HTTP; the system fetches the list of purchase orders from the database, transforms it from EDI to XML, and returns the list to the requestor. Therefore, the Communication Application manages the private process in this case. It is also very closely integrated with corporate information systems.

Figure 7: Architecture of system by Nurmilaakso et al. (2002) as a UML deployment diagram. Elements shown: Trading Partner A, Trading Partner B, Jakarta Tomcat, Communication Application, Database, HTTP connections.

As the third example, an RNIF integration approach using an existing EAI component is presented by Sayal et al. (2001). They wanted to demonstrate seamlessly RNIF-enabling an existing system that relies on the HP Process Manager workflow management system for coordinating interorganizational processes between applications. A workflow management system is a system that defines, manages, and executes “workflows” by executing software in a configurable order (Hollingsworth 1995). Sayal et al. did this by constructing a custom extension module for the workflow management system to handle RosettaNet messaging. Using this module, inter-organizational PIP processes were made available to the workflow management system, and could then be used in the same way as regular processes within the organization. This corresponds somewhat to the process wrapper approach described by Medjahed et al. (2003).

See Figure 8 for an overview of the architecture. Managing the private process and communicating with enterprise applications is left as the responsibility of the Workflow Engine component of the HP product. Activating a business-to-business service in the Workflow Engine triggers the custom Trade Partners Conversation Manager (TPCM) component, which initiates the corresponding RNIF PIP instance, contacts the remote party, and returns the received response. TPCM performs the necessary transformations between RNIF business messages and the format returned to the process engine; these transformation rules must be developed for each supported PIP type. TPCM also needs to keep track of which service in the Process Manager initiated an outgoing request in order to route the response messages properly.

Figure 8: Architecture of system by Sayal et al. (2001) as a UML deployment diagram. Elements shown: Trading Partner A, Trading Partner B, External Application, TPCM, Workflow Engine, RNIF connections.
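The bookkeeping that TPCM needs for routing responses can be illustrated with a small correlation table. The following is only a sketch under stated assumptions: the class name, the method names, and the use of a message identifier as the correlation key are hypothetical and are not taken from Sayal et al. (2001).

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: all names are illustrative, not from Sayal et al. (2001).
final class ResponseCorrelator {
    // Maps the identifier of an outgoing request to the workflow service that initiated it.
    private final Map<String, String> pending = new ConcurrentHashMap<>();

    // Called when a workflow service triggers an outgoing request.
    void requestSent(String messageId, String initiatingService) {
        pending.put(messageId, initiatingService);
    }

    // Called when a response arrives; returns the waiting service and forgets the entry.
    Optional<String> responseReceived(String inReplyToMessageId) {
        return Optional.ofNullable(pending.remove(inReplyToMessageId));
    }

    public static void main(String[] args) {
        ResponseCorrelator correlator = new ResponseCorrelator();
        correlator.requestSent("msg-1", "purchaseOrderService");
        // The response is routed back to the service that sent "msg-1".
        System.out.println(correlator.responseReceived("msg-1").orElse("unknown"));
    }
}
```

Removing the entry when the response arrives keeps the table from growing without bound; a real system would presumably also need timeouts for requests that are never answered.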

2.6.1.3 System Architecture of a Commercial Product

IBM WebSphere Business Integration Connect (IBM 2003b) is a business-to-business integration product; the latest version is 4.2.1. In addition to RNIF, the product also supports EDI, cXML, SOAP, and plain XML messages with the HTTP(S), FTP, and SMTP transport protocols. Towards back-end applications, HTTP(S), JMS, and the file system can be used to transfer RosettaNet business documents, EDI, cXML, SOAP, and plain XML. The product is available for the Microsoft Windows operating system and some Unix/Linux variants. This product has received the “RosettaNet Ready” badge for RNIF compliance from the RosettaNet consortium, signaling that the product has been tested and found compliant with the RNIF specification.

The product is designed to work in a “hub” model, in which one trading partner acts as a hub with which the rest of the trading partners communicate. This set of communicating trading partners is called a “community”. The product consists of three main components:

• Receiver receives messages from trading partners and stores them into the file system for consumption by the Document Manager. Similarly, it receives documents sent within the organization and routes them to the Document Manager.

• Document Manager obtains documents from the file system, typically placed there by the Receiver, validates them, and delivers them to their destination. In addition, message packaging/unpackaging, transformations, validation, encryption/decryption, and verifying digital signatures are done within this component. The Document Manager contains the following sub-components:

  o Document Processing Engine is responsible for operations related to a business message, such as packaging operations, transformations, validation, and cryptographic operations

  o State Engine maintains state in a business-to-business protocol dependent manner, handling e.g. retries

  o Alert Engine provides e-mail notification functionality in abnormal operational situations

  o Delivery Manager handles transporting messages to their destinations, including a destination-specific queuing mechanism

• Community Manager is a tool for monitoring the flow of documents within a community

Both the Document Manager and the Community Manager require a Relational Database Management System (RDBMS) for persistent storage; IBM DB2 and Oracle products are supported. In addition, the WebSphere MQ product is required for JMS-based messaging between components. As they employ JMS for communication, all the components are independently deployable. They are also independently scalable for additional performance, using mechanisms provided by the underlying application server. All components are built around the IBM WebSphere Application Server product and each includes an instance of that product.

Applications can utilize services produced by WebSphere Business Integration Connect using the back-end interfaces it provides in the Receiver and Document Manager components. See Figure 9 for an example deployment. If there is a need for more advanced control of the private process, other IBM products, such as the WebSphere InterChange Server, an EAI product, can be integrated.

Figure 9: Example deployment of IBM WebSphere Business Integration Connect as a UML deployment diagram. Elements shown: Trading Partner A, Trading Partner B, Back-end application, Receiver, Document Manager, and Community Manager (each on IBM WebSphere), Hard disk, RDBMS, with SOAP, RNIF, and HTTP connections.


2.6.2 Implementation Methods

Here, implementation methods used for developing business-to-business integration components are described. First, general issues related to designing business-to-business software components are presented. Then, relevant development tools are listed. Finally, methods of implementing transformations between business document or message formats are introduced.

2.6.2.1 Design of Business-to-Business Integration Components

The high-level design, or internal architecture, of a software component used in business-to-business integration is typically only visible at source code level, not necessarily manifesting itself as externally visible software entities. In general software engineering, there are some well-established architectural styles (Garlan & Shaw 1994). In contrast, there are not very many papers specifying what the internal architecture of a business-to-business integration component could look like.

Sundaram & Shim (2001) propose a software infrastructure for RosettaNet business-to-business exchanges: their infrastructure is based on modules, each of which consists of a variable number of sub-modules ordered sequentially. These sub-modules can be shared between modules, facilitating reuse. When an operation, such as one PIP activity instance, needs to be invoked, an entity known as the “module manager” takes care of coordinating the execution of the proper sub-modules in the order dictated by the corresponding module. This is illustrated in Figure 10.

Figure 10: Software architecture by Sundaram & Shim (2001) illustrated as UML class and sequence diagrams. Class diagram: the Module Manager knows all submodules (Submodule 1 to Submodule 4). Sequence diagram: in modules, the Module Manager executes submodules sequentially (lifelines shown: Module Manager, Submodule 4, Submodule 1, Submodule 2).
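The module manager idea can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the design published by Sundaram & Shim: the class and method names are hypothetical, and sub-modules are reduced to simple string-to-string functions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch: names and the String-based message type are illustrative only.
final class ModuleManager {
    // The manager knows all sub-modules; each one transforms a message.
    private final Map<String, UnaryOperator<String>> subModules = new LinkedHashMap<>();
    // A module is an ordered list of sub-module names; sub-modules may be shared.
    private final Map<String, List<String>> modules = new LinkedHashMap<>();

    void registerSubModule(String name, UnaryOperator<String> step) {
        subModules.put(name, step);
    }

    void defineModule(String name, List<String> orderedSubModuleNames) {
        modules.put(name, new ArrayList<>(orderedSubModuleNames));
    }

    // Executes the sub-modules of the given module sequentially, in the order the module dictates.
    String execute(String moduleName, String input) {
        List<String> steps = modules.get(moduleName);
        if (steps == null) {
            throw new IllegalArgumentException("Unknown module: " + moduleName);
        }
        String current = input;
        for (String stepName : steps) {
            UnaryOperator<String> step = subModules.get(stepName);
            if (step == null) {
                throw new IllegalStateException("Unknown sub-module: " + stepName);
            }
            current = step.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        ModuleManager manager = new ModuleManager();
        manager.registerSubModule("validate", m -> m + "|validated");
        manager.registerSubModule("sign", m -> m + "|signed");
        manager.registerSubModule("send", m -> m + "|sent");
        // Two modules reuse the same sub-modules in different sequences.
        manager.defineModule("sendSigned", List.of("validate", "sign", "send"));
        manager.defineModule("sendPlain", List.of("validate", "send"));
        System.out.println(manager.execute("sendSigned", "doc"));
        System.out.println(manager.execute("sendPlain", "doc"));
    }
}
```

Because modules only name their sub-modules, adding a new operation means defining a new sequence rather than writing new code, which is the reuse benefit the text attributes to this design.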

A somewhat similar approach is presented by Nurmilaakso et al. (2002). They have generic, reusable “processor” components that perform tasks such as database lookups or updates, communication with external applications, transformations, etc. The system is configured with “interactions” that specify a sequence of “operations”, each specifying which processor to execute and the configuration to use for the processor. If a processor returns some output, this is fed to the next processor as input, in pipeline fashion. Nurmilaakso et al. report that one useful property of this approach is its flexibility: the operation of the system is to some degree configurable by editing the configuration files that specify the interactions, without code-level changes.

These approaches resemble the “pipes and filters” architectural style described by Garlan & Shaw (1994), and more specifically the “pipeline” style. In the pipes and filters style, a set of independent filter components transforms data, brought to them by pipe components. A pipeline is a specialized variant of this style, in which the sequence of filters is linear. However, filters in this style are defined by their inputs and outputs, so these module-based architectures might not match this style exactly.

2.6.2.2 Tools

When implementing business-to-business integration, perhaps the first decision that needs to be made is whether to go with a commercial business-to-business integration product or to employ generic software development tools. A commercial integration product should imply less development effort, if a suitable product can be found, but applying generic software development solutions offers more flexibility. Kotinurmi et al. (2004) performed the integration using the Microsoft BizTalk Accelerator for RosettaNet product, with some custom-developed components, whereas Sundaram & Shim (2001), Buxmann et al. (2002), Nurmilaakso et al. (2002), and Shim et al. (2002) relied on J2EE. It is not clear in which situations each approach is preferable.

Commercial business-to-business integration software packages are often large integration products providing much functionality. They typically support multiple e-business frameworks, and often address EAI issues in addition to business-to-business integration.
For example, commercial products that support the RosettaNet e-business framework include:

• BEA WebLogic Integration (BEA 2003)

• Fujitsu Interstage Integration Manager (Fujitsu 2003)

• IBM WebSphere Business Integration Connect (IBM 2003b)

• Microsoft BizTalk Accelerator for RosettaNet (Microsoft 2004)

• webMethods Integration Platform (webMethods 2003)

As for generic software development tools, there are currently two major architectures available for developing enterprise software components: J2EE and Microsoft .NET. These two are very broad component-based software development environments, and there are many possible ways to approach the problem using these architectures. For example, Kotinurmi et al. (2004) applied .NET, as they were developing extensions to a Microsoft product, whereas other systems, such as those reported by Sundaram & Shim (2001), Buxmann et al. (2002), Nurmilaakso et al. (2002), and Shim et al. (2002), utilize J2EE. None of these sources justifies its choice of implementation tools with technical arguments, nor do they attempt to demonstrate utilizing the tools to the maximum extent; e.g. none of the papers that applied J2EE attempted to create a distributable J2EE component model. J2EE implementations are available from many vendors.

In addition to the main software development tool, there often are smaller software components that may be obtained from different vendors, such as XML-related tools, e.g. the Apache Xerces XML parser and the Xalan XSLT processor that are mentioned by Buxmann et al. (2002). A J2EE application might utilize tens of such smaller components, so there are many choices to be made when selecting tools.

2.6.2.3 Transformation Methods

Transformations from one business document or message format to another are required e.g. when transforming messages from the internal format of one application to that of another application, from an application’s format to that of an e-business framework, or between e-business frameworks. These transformations are often done from one XML schema to another. The internal data format of an application might not be XML-based, but it is often possible to convert from a non-XML data format into an XML representation as a separate step (Rawlins 2003).

Buxmann et al. (2002) present a transformation approach where they have a predefined classification of XML documents, such as invoice, catalog query, delivery confirmation, etc. Within these classifications, conversions are made between the internal business document format of an organization and those of e.g. the xCBL, OAGIS, and eBIS-XML e-business frameworks.
Generally, for conversions between multiple business document formats, Buxmann et al. present the following three approaches; see Figure 11 for an illustration.

• Ring structure: for formats A, B, C, and D, make conversions A-B, B-C, C-D, and D-A, start from the source format, and apply these conversions in sequence. The advantage is that only two additional conversions are needed per format; the disadvantages are slow speed because of the many conversions, and the requirement that conversions be lossless.

• Conversion for all possible cases is the fastest possible approach. The disadvantage is the requirement of creating n(n-1) conversions for n supported messaging formats in the general case.

• “Super standard”, to and from which all formats are converted. The advantage is that only two conversions are needed per message format; the disadvantage is the inherent difficulty of creating a standard that encompasses all possible cases of all other standards.

Figure 11: Three transformation approaches identified by Buxmann et al. (2002). Top left, the ring structure. Top right, conversion for all cases. Bottom, the super standard (marked with “S”) approach.
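The trade-off between the three approaches can be made concrete by counting conversions. The sketch below is an illustration under stated assumptions: the class and method names are hypothetical, the ring is assumed to be one-directional as in the A-B, B-C, C-D, D-A example, and formats are identified by their position in the ring.

```java
// Hypothetical sketch: names are illustrative; the ring is assumed to be one-directional.
final class ConversionCount {
    // Ring: one conversion from each format to its successor (A-B, B-C, ..., last-first).
    static int ring(int n) {
        return n;
    }

    // Direct conversion for every ordered pair of distinct formats.
    static int allPairs(int n) {
        return n * (n - 1);
    }

    // Super standard: one conversion to and one from the shared format, per format.
    static int superStandard(int n) {
        return 2 * n;
    }

    // Conversions applied in sequence in the ring to get from position 'from' to position 'to'.
    static int ringHops(int from, int to, int n) {
        return Math.floorMod(to - from, n);
    }

    public static void main(String[] args) {
        int n = 4; // formats A, B, C, D
        System.out.println("ring: " + ring(n));                   // 4
        System.out.println("all pairs: " + allPairs(n));          // 12
        System.out.println("super standard: " + superStandard(n)); // 8
        // Converting B (position 1) to A (position 0) takes B-C, C-D, D-A.
        System.out.println("hops B to A: " + ringHops(1, 0, n));  // 3
    }
}
```

For four formats the all-pairs approach already needs 12 conversions, while each single document conversion in the ring may pass through up to n-1 steps, which is where the speed and losslessness concerns come from.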

XSLT is proposed by Buxmann et al. as a means to perform these XML-to-XML conversions. XSLT was also used for a similar purpose by Nurmilaakso et al. (2002), Pendyala et al. (2003), and Kotinurmi et al. (2004). Rawlins (2003, p. 12) asserts that XSLT might not be the most efficient possible solution, so it is not well suited for situations where attaining maximum performance is essential. Indeed, it is not likely that any general-purpose solution can attain maximum performance in a specific conversion.

Shim et al. (2002) present another idea, using the Java programming language as a method of mapping arbitrary business documents to their RosettaNet equivalents. This is done through a master object model: a Java object hierarchy, to and from which all messages are converted, employing a third-party tool to translate between a Java object structure and arbitrary XML documents using a custom schema language. The paper does not compare this approach to XSLT, so it is not entirely clear what the relative benefits of these two approaches are. One could assume that a solution based on a Java object hierarchy would be able to express complex arbitrary transformations, while being more time consuming to create in general, due to the greater expressive power of the Java programming language and its less specialized syntax. Then again, many XSLT implementations can be extended with custom extension functions written in e.g. Java.
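As an illustration of the XSLT-based approach, the following minimal sketch applies a stylesheet through the standard javax.xml.transform API. The stylesheet and both document formats (an internal "order" element and a "PurchaseOrder" target) are invented for this example and do not correspond to any real e-business framework schema.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: the source and target formats are invented for illustration.
final class XsltConversion {
    // Maps <order id="..."/> to <PurchaseOrder><Identifier>...</Identifier></PurchaseOrder>.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"/order\">"
      + "<PurchaseOrder><Identifier><xsl:value-of select=\"@id\"/></Identifier></PurchaseOrder>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML text and returns the transformed XML text.
    static String convert(String sourceXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(sourceXml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(convert("<order id=\"42\"/>"));
    }
}
```

In this style the mapping lives entirely in the stylesheet, so a new document format can be supported by supplying another stylesheet without changing the Java code; a Java object hierarchy, in contrast, would encode the mapping in code.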


3 Requirements for the Prototype System

This section identifies the requirements that were placed on the prototype system. The general problem solved by the prototype is first briefly defined. Then, an overview of the proposed solution is given. After that, functional and technical requirements are derived based on the solution overview and the goals that were set for the prototype system in the introductory section.

3.1 Problem Definition

The prototype system has two high-level goals. The first is to serve as a learning platform for students of information and communication technology enabled commerce. Using this platform, students are able to practice realistic business-to-business integration scenarios with RosettaNet. If students are only given generic software development tools, e.g. IBM WebSphere, it takes considerable familiarization with the tools and some software development work before any RNIF messages can be exchanged. With this type of platform, basic RNIF-related tasks, such as sending and receiving business messages, are made relatively easy for students, so that they can concentrate on the actual objectives of their assignments, which are often related to e.g. the semantics of the messages being exchanged or the transformations that are required.

The second goal is to create a system that could theoretically be useful in a realistic enterprise systems integration scenario: a software component that provides RNIF-compliant messaging functionality, which is a required part of any RosettaNet-based integration.

This problem definition still leaves too much room for choice to formulate concrete requirements, as it is possible to envision many ways to implement RNIF if no other requirements are posed. More contextual information must be defined before starting to formulate a concrete solution: what services need to be provided and how they are accessed. Previous experience from research conducted in the NetData project at the Helsinki University of Technology Software Business and Engineering Institute was used as a guideline when defining the research problem in more detail. This information was obtained from discussions with NetData project members and by inspecting project reports.

The system architecture models inspected previously identify the business-to-business integration functionality as a separate middleware-type software component. According to prior research conducted at NetData, RNIF implementation as a middleware-type component has worked in the past (Kotinurmi et al. 2004). This approach is also chosen for this work.

Another question is what interfaces this middleware should have. This is important, as the value produced by an RNIF messaging middleware component can be thought of as the difference in ease of implementation for applications implementing its interface, compared to implementing RNIF messaging directly. The goal is to have as simple an interface as possible within the organization, but to provide security and reliability features for inter-organizational communication, where such features are most necessary. It is obvious that at least an RNIF interface is required. Again, according to previous work in the NetData project, web services technology has been proven applicable for this type of interface in the past (Laesvuori 2003). Currently available web service tools are significantly more mature and easier for the developer than those available for RNIF. Web services are platform independent, which also makes them an especially good candidate. In addition, while web services technologies have received a lot of publicity, there is limited reported experience on applying them in practice. Web services are used here as a general-purpose RPC mechanism, not for business-to-business integration.

Based on these additions, a more detailed problem definition can be formulated: the system should implement RNIF as a middleware component that exposes an RNIF interface outside the organization and a web service interface within the organization. After placing these additional requirements, a solution can be formulated.

3.2 Solution Overview

The prototype system is a middleware component that exposes an RNIF-compliant interface to applications outside the organization and a SOAP Web Service (WS) interface within the organization. Applications within the organization use the WS interface by supplying it with RosettaNet business documents and related metadata for outgoing business messages, and by receiving business documents and metadata from it for incoming messages. The middleware is able to create RNIF business messages from business documents and metadata and vice versa, to perform business message validation and routing, and to provide services such as digital signatures and message encryption. Figure 12 illustrates a system fulfilling these requirements.

Figure 12: Overview of the integration system as a UML deployment diagram. Elements shown: Trading Partner A, Trading Partner B, IBM WebSphere, service application, prototype, SOAP and RNIF connections. Diagram notes: the SOAP connection is a proprietary SOAP 1.1 with attachments interface carrying PIP XML documents and routing information; the RNIF connection is an RNIF 2.0 interface; the prototype provides SOAP/RNIF message conversion, message validation, message routing, encryption/decryption, and digital signatures.

Viewed in the context of the system architecture approaches identified previously, the middleware component can be used as a separate B2B component. It can also be used as an EAI/B2B component, if the required internal process is limited to routing messages between applications.

3.3 Requirements

Based on the solution overview of the previous subsection and the goals for the solution listed in the introductory section, concrete functional and technical requirements are formulated here.

3.3.1 Functional Requirements

Based on the solution overview, the following functional requirements are set for the prototype.

1. It should be a middleware component, having the following interfaces:

• “RNIF” interface: implementing the RosettaNet Implementation Framework (RNIF) version 2.0 specification (RosettaNet 2002b)

• “WS” interface: a proprietary web service interface implemented using SOAP 1.1 (W3C 2000a) with Attachments (W3C 2000b)

2. It should be able to route messages between these interfaces in the RNIF-to-WS and WS-to-RNIF directions based on message receiver and sender information.

3. The proprietary WS interface should allow sending and receiving all valid RNIF business messages in both asynchronous and synchronous manner. The prototype need not maintain state related to the PIP activity in progress other than in the context of a single business message.

4. RNIF encryption and digital signatures should be supported.

5. The system should have a web-based monitoring tool for testing and debugging purposes.
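Requirement 2 can be pictured as a lookup table keyed by direction, sender, and receiver. The sketch below is purely illustrative: the class, the key layout, the partner identifiers, and the endpoint URLs are assumptions made for this example and do not describe the routing mechanism of the prototype.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: names, key layout, and endpoint strings are assumptions.
final class MessageRouter {
    enum Direction { RNIF_TO_WS, WS_TO_RNIF }

    // Maps (direction, sender, receiver) to the endpoint the message is forwarded to.
    private final Map<String, String> routes = new HashMap<>();

    private static String key(Direction direction, String sender, String receiver) {
        return direction.name() + '|' + sender + '|' + receiver;
    }

    void addRoute(Direction direction, String sender, String receiver, String endpoint) {
        routes.put(key(direction, sender, receiver), endpoint);
    }

    // Returns the configured endpoint, or empty if no route matches.
    Optional<String> resolve(Direction direction, String sender, String receiver) {
        return Optional.ofNullable(routes.get(key(direction, sender, receiver)));
    }

    public static void main(String[] args) {
        MessageRouter router = new MessageRouter();
        // Invented partner identifiers and URLs, for illustration only.
        router.addRoute(Direction.WS_TO_RNIF, "partnerA", "partnerB",
                        "https://partner-b.example/rnif");
        router.addRoute(Direction.RNIF_TO_WS, "partnerB", "partnerA",
                        "http://localhost:9080/orders");
        System.out.println(router.resolve(Direction.WS_TO_RNIF, "partnerA", "partnerB"));
        System.out.println(router.resolve(Direction.RNIF_TO_WS, "partnerC", "partnerA"));
    }
}
```

Keeping the direction in the key means the same pair of partners can have different endpoints for outgoing and incoming traffic, which matches the two routing directions named in the requirement.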

3.3.2 Use Cases

Based on the above requirements specification, the following primary use cases are identified:

1. Incoming message from WS interface to RNIF interface

a. Asynchronous: a business message is sent, without receiving any immediate response on the result of the operation from the recipient

b. Synchronous: a business message is sent, and possibly an acknowledgment or a response message is obtained from the recipient within the same operation

2. Incoming message from RNIF interface to WS interface

a. Asynchronous

b. Synchronous

3.3.3 Technical Requirements

In addition to the functional requirements, the following technical requirements are set based on the goals listed in the introductory section. These are not absolute requirements, but design goals that should be met to the degree possible.

1. Implementation using J2EE

2. Easy installation with minimal configuration: the prototype should require only the minimal configuration information needed for performing its function, entered only once. This applies to both installation and operation of the prototype. In other words, the prototype should resemble a product, not a custom system built for a single integration case.

3. J2EE platform independence to the degree possible. The prototype must work with IBM WebSphere Application Server version 5.1 under the Red Hat Enterprise Linux 3 operating system; these were chosen because of the availability of the software. Operation with other application servers and operating systems is desirable.

4. Appropriate level of performance with modern server hardware:

a. Maximum size of business messages that can be transferred: should be at least in the order of 10 megabytes, with one gigabyte of available system memory. This should be enough for many use scenarios, as e.g. PIP 3A4 business messages are typically less than 100 kilobytes in size, but might not be enough for all scenarios, as attachments of over 100 megabytes are realistic in some use scenarios (Kotinurmi et al. 2004).

b. Message throughput should be at least in the order of one message per second, with a single-CPU system having a 2 GHz AMD Athlon XP or equivalent CPU. This is an arbitrary number specifying a level of performance that is enough for at least some operational scenarios; it would correspond to an organization sending or receiving 28800 business messages during business hours within one day.


4 Prototype Design and Implementation

The high-level design and implementation of the prototype is described in this section. Moving from the externally visible interface of the prototype to its internal implementation, the section starts by describing the external interfaces the prototype exposes to applications, especially the web service interface; see Figure 12 for an overview of how these interfaces are connected to external systems. After that, the internal software architecture of the prototype, i.e. its internal components and their connections, is described, followed by a description of the J2EE implementation of this architecture. Then, a brief overview of writing client applications that use the prototype is given, as this is important in assessing the advantage the prototype provides. Finally, the tools that were used in creating the prototype are described and a summary of the effort required to construct the prototype is presented.

4.1 External Interfaces

This subsection describes the external interfaces exposed by the prototype middleware. The RNIF interface is defined by the RNIF specification. For the web service interface, the operations provided by the interface are described first. After that, technical aspects of how WSDL and SOAP technologies were used are elaborated.

4.1.1 RNIF Interface

The RNIF interface is defined by the RNIF specification (RosettaNet 2002b), so it need not be described here in detail. The implementation allows delivering both asynchronous and synchronous messages and supports encryption and digital signatures.

4.1.2 Web Service Interface

4.1.2.1 Overview

The prototype middleware provides operations to send and receive RNIF business messages over a SOAP web service based interface. The interface is symmetric for bidirectional communication: the same operations are used for messages both sent and received by a client. The interface is implemented by the prototype for outgoing messages, and a client application must implement the interface to be able to receive any incoming messages. By using these operations, it is possible to send and receive all valid RNIF business messages.

This interface is described in Figure 13 as a UML class diagram. The interface is XML-based: classes in the diagram correspond to XML schema element definitions. More specifically, classes with attributes correspond to XML elements containing other elements: the names of the contained elements are defined by the attributes’ names and their content by the schema definitions referred to by the attributes’ types, analogous to W3C XML Schema complex type definitions. Classes with the «datatype» or «enumeration» stereotype correspond to elements containing textual data, analogous to W3C XML Schema simple type definitions. The UML “composition” relationship means that the element with the black diamond symbol encloses the other element. Inheritance corresponds to inheritance in the W3C XML Schema type hierarchy. Elements defined by the RNIF specification are denoted by the “rnif” namespace, those from the RosettaNet Business Dictionary by the “bd” namespace, and XML Schema data types by the “xsd” namespace; definitions for these elements are not present in the diagram. Some details present in the actual XML encoding are omitted for simplicity.


[Figure 13: UML class diagram. The elements recoverable from the diagram are listed below.]

«interface» RosettaNetPortType
+submitAsynchronous(in header : BusinessMessageHeader, in body : xsd:string, in attachments : xsd:string) : Status
+submitSynchronous(inout header : BusinessMessageHeader, inout body : xsd:string, inout attachments : xsd:string) : Status
+failureNotification(in header : BusinessMessageHeader, in failureNotification : FailureNotification) : Status

BusinessMessageHeader
+sender[1] : EndPointDescriptor
+receiver[1] : EndPointDescriptor
+initiator[0..1] : Initiator
+fromRole[1] : rnif:fromRole
+toRole[1] : rnif:toRole
+fromService[1] : rnif:fromService
+toService[1] : rnif:toService
+isSecureTransportRequired[0..1] : xsd:boolean
+messageTrackingID[1] : rnif:messageTrackingID
+BusinessActivityIdentifier[0..1] : rnif:BusinessActivityIdentifier
+inReplyTo[0..1] : rnif:inReplyTo
+ActionIdentity[0..1] : rnif:ActionIdentity
+SignalIdentity[0..1] : rnif:SignalIdentity
+GlobalUsageCode[0..1] : rnif:GlobalUsageCode
+ProcessIdentity[1] : bd:ProcessIdentity

FailureNotification
+ActionControl[1] : rnif:ActionControl
+fromRole[1] : bd:fromRole
+ProcessIdentity[1] : bd:ProcessIdentity
+reason[1] : Reason
+toRole[1] : bd:toRole

EndPointDescriptor
+GlobalBusinessIdentifier[0..1] : rnif:GlobalBusinessIdentifier
+UniformResourceLocator[0..1] : UniformResouceLocator

Initiator
+GlobalBusinessIdentifier[1] : rnif:GlobalBusinessIdentifier

«enumeration» Status
+OK = OK
+ACK = ACK
+RE = RE

«exception» MalformedFault
«exception» UndeliverableFault
«exception» ServerFault

«datatype» Reason
«datatype» xsd:string
«datatype» UniformResouceLocator

Figure 13: Data model of the web service interface of the prototype as a UML class diagram. The interface is XML-based: classes in the diagram correspond to XML schema element definitions.

4.1.2.2 Operations

The “RosettaNetPortType” interface corresponds to the web service interface through which all operations are accessed. Calls to this interface are encoded using SOAP RPC conventions. If a JAX-RPC or equivalent SOAP RPC tool is used, a class hierarchy resembling the one in the figure can be automatically generated. In summary, the operations provided by the interface are:

• submitAsynchronous: send a business message asynchronously. The client specifies a header structure as an argument to the operation, used to construct the headers of the RNIF business message, an arbitrary XML business document, which depends on the PIP, and optionally any binary attachments that are to be sent. The operation returns a status code, indicating that the message is structurally acceptable; if it is not, a SOAP fault called “MalformedFault” is triggered. Possible status codes are “OK”, indicating general acceptance, and “ACK”, indicating that receipt of the message should be acknowledged to the sender: if this code is returned by a client application when receiving a message, the middleware will send a proper acknowledgment business message.

• submitSynchronous: send a business message synchronously. This operation is rarely needed in practice, as most current PIPs employ asynchronous communication exclusively. The operation takes the same parameters as submitAsynchronous, except that all parameters are in/out: after the call they contain a synchronous response, which can be a response message, a receipt acknowledgment, or empty.

• failureNotification: send a failure notification business message, initiating an instance of PIP 0A1. Failure notification is a special case of asynchronous submission, so this operation is not strictly necessary: the same functionality could be achieved using the submitAsynchronous operation with a suitable business document. It is provided for convenience, as most RosettaNet business processes may initiate PIP 0A1 in case of failure. With this operation the user is not required to specify the whole PIP 0A1 business document, only a more compact proprietary data structure.
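The status-code convention of submitAsynchronous can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the classes generated from the prototype's WSDL descriptor: the names PortSketch, Header, and AckingReceiver are hypothetical, the header is reduced to two fields, and only the asynchronous operation is shown. It shows a receiving client that rejects a structurally unacceptable message with a fault and otherwise returns ACK, asking the middleware to send the receipt acknowledgment.

```java
import java.util.ArrayList;
import java.util.List;

/** Status codes listed in the interface; the meaning of RE is not described in the text. */
enum Status { OK, ACK, RE }

/** Stand-in for the "MalformedFault" SOAP fault (hypothetical Java rendering). */
class MalformedFault extends Exception {
    MalformedFault(String message) { super(message); }
}

/** Heavily reduced stand-in for BusinessMessageHeader (assumption: two fields only). */
class Header {
    final String messageTrackingId;
    final String pipCode;

    Header(String messageTrackingId, String pipCode) {
        this.messageTrackingId = messageTrackingId;
        this.pipCode = pipCode;
    }
}

/** The symmetric port: implemented by the middleware for outgoing messages
 *  and by a client application for incoming ones. */
interface PortSketch {
    Status submitAsynchronous(Header header, String body, String attachments)
            throws MalformedFault;
}

/** Client-side receiver that asks the middleware to acknowledge every accepted message. */
class AckingReceiver implements PortSketch {
    final List<String> received = new ArrayList<>();

    @Override
    public Status submitAsynchronous(Header header, String body, String attachments)
            throws MalformedFault {
        // Reject messages that are structurally unacceptable.
        if (header == null || body == null || body.trim().isEmpty()) {
            throw new MalformedFault("header and business document are required");
        }
        received.add(header.messageTrackingId);
        // ACK tells the middleware to send the receipt acknowledgment business message.
        return Status.ACK;
    }
}
```

A receiver that does not want an acknowledgment to be sent on its behalf would return Status.OK instead.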

As described above, the prototype middleware applies no transformation to business documents themselves, except for the special case of extended support for PIP 0A1. For outgoing RNIF business messages, the middleware creates business messages from the application-supplied business document and metadata; correspondingly, for incoming business messages, the metadata and business document are extracted and provided to the application.

4.1.2.3 WSDL Descriptor

There is a WSDL descriptor available for the web service interface, which can be used to connect to the prototype middleware using web service enabled software development tools. The descriptor is designed so that it forms an RPC-based API, from which a programming language native API can be automatically generated with suitable tools. To facilitate use especially with the Java programming language, the interface obeys the constraints specified in the JAX-RPC specification.

As previously specified, there are three communication patterns to choose from when using SOAP with WSDL: RPC/encoded, RPC/literal, and document/literal. Since this application is a programming API, the RPC “style” of WSDL SOAP binding is used. The application is very XML-oriented, and many definitions used in the WSDL descriptor come directly from the RNIF specification. Literal is therefore the more intuitive and efficient choice for the “use” variable, as XML-to-XML conversions can be applied when converting from this interface to RNIF headers and vice versa without an unnecessary encoding/decoding phase at the server. Therefore, the communication pattern used in the prototype is RPC/literal. Performance tests reported by Cohen (2003) also support this choice of the RPC/literal pattern.

4.2 Internal Software Architecture

The internal software architecture of the prototype middleware is presented here, i.e. how the prototype is constructed from its internal components. First, the static structure is presented, giving an overview of the software architecture. This is followed by a description of how the use cases that were defined are implemented.

4.2.1 Static Structure

The prototype architecture was defined from scratch using established architectural styles: the prototype software architecture, illustrated in Figure 14, can be viewed as an instance of the pipes and filters architectural style, the pipeline style to be more specific, as described by Garlan & Shaw (1994).

[Figure 14: UML class diagram showing the components configuration, console, router, OutboundQueue, and OutboundQueueProcessor; the input filters WS-in and RNIF-in, marked with the InboundPort interface; the output filters WS-out and RNIF-out, marked with the OutboundPort interface; and the external protocols HTML, SOAP, and RNIF.]

Figure 14: Static structure of internal software architecture of the prototype as a UML class diagram

The most important components of the architecture are the four filters “WS-in”, “WS-out”, “RNIF-in”, and “RNIF-out”. Input filters marked with the “InboundPort” interface accept incoming messages and initiate processing in the prototype. Output filters marked with the “OutboundPort” interface deliver outgoing messages from the prototype system to their recipients. These interfaces only define the behavioral contract carried out by the filters and do not directly correspond to e.g. Java programming language interfaces. The Web Service (WS) interface consists of two filters, “WS-in” and “WS-out”, which transform between SOAP and the internal business message representation of the system. Similarly, the RNIF interface consists of two filters, “RNIF-in” and “RNIF-out”, which transform between the internal representation and the RNIF business message format. As can be seen in Figure 14, one benefit of this architecture is that the filters are not aware of each other.

The internal representation of business messages in the prototype is an unpackaged RNIF business message. The message is unpackaged to keep the message packaging logic in the RNIF filter, to allow for the possibility of developing other interfaces besides the WS one. This choice of intermediate format means that the RNIF filters need to perform RNIF message packaging/depackaging, which may involve cryptographic operations, whereas the WS filters need to perform data type mapping to and from the WS interface format.

During message processing, messages enter the system through one of the input filters. The “router” component decides which output filter to employ, based on an XML-based routing table, which is manually entered as configuration to the prototype. The routing table includes information on possible destinations: which filter handles them and configuration parameters, such as the URL address of the recipient, whether encryption or digital signatures should be used, etc. In addition, a table of routing rules is included: these rules determine to which destination to route a message, based on message receiver and sender information. The router can route any message to either a WS or an RNIF destination, meaning that all four possible routing cases, WS-to-RNIF, RNIF-to-WS, WS-to-WS, and RNIF-to-RNIF, are supported.
The “OutboundQueue” component is a queue that stores asynchronous messages after they have passed the router but before being sent to an output filter. The “OutboundQueueProcessor” reads that queue and sends messages onwards to output filters previously chosen by the router. A queuing mechanism is employed to facilitate asynchronous processing; it also allows balancing system load. In addition, there is a web-based “console” component for a human user to monitor and manage the system and a “configuration” component for maintaining the persistent configuration of the router: routing table, local system identity, etc.


4.2.2 Implementation of Use Cases

The implementation of the previously defined use cases is presented here. First, the implementation of use case 1A, WS-to-RNIF asynchronous, is illustrated in Figure 15.

[Figure 15: UML sequence diagram with the participants WS-in, router, OutboundQueue, OutboundQueueProcessor, and RNIF-out; the messages shown are submitAsynchronous(), send, and accepted.]

Figure 15: Implementation of use case 1A as a UML sequence diagram

In this use case, a message is first received through the WS interface; the SOAP operation identifier indicates a request for asynchronous processing. The message is then transformed into an unpackaged RNIF business message and moved forward to the message router, which fills in the destination filter identity and other routing information from its routing table, moves the message onwards to the outbound queue, and returns control to the caller. Eventually OutboundQueueProcessor is activated for this message. It reads the routing information put there by the router, which specifies that the output filter will be the “RNIF-out” filter, along with other information including at least the destination URL. As the message is destined to an RNIF destination, and this is an asynchronous operation, the asynchronous send method of the filter is called, and the filter packages the message and sends it to its destination over HTTP. Similarly, the implementation of use case 1B, WS-to-RNIF synchronous, is illustrated in Figure 16.


[Figure 16: UML sequence diagram with the participants WS-in, router, and RNIF-out; the messages shown are submitSynchronous(), send, and response.]

Figure 16: Implementation of use case 1B as a UML sequence diagram

In this case, a message is received through the WS interface; the SOAP operation identifier indicates a request for synchronous processing. The message is then transformed and moved forward to the router, which fills in the destination filter ID and other routing information. As the message was addressed to an RNIF destination, the router activates the “RNIF-out” filter by calling its synchronous send method, and the filter sends the message on its way. The synchronous response is then delivered back to the original sender. Note that the synchronous case has a somewhat simpler implementation, as no queuing is required, but the duration of blocking calls is significantly longer, as the sender has to wait until the whole send operation is performed to receive the synchronous response. Use cases 2A and 2B are similar to 1A and 1B, with the roles of RNIF and WS exchanged.
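The difference between the two flows can be contrasted in code. The sketch below is an in-memory stand-in and not the prototype's implementation: a java.util.concurrent.BlockingQueue takes the place of the persistent queue, messages are plain strings, and the names OutboundPortSketch, RecordingPort, and RouterSketch are hypothetical. Asynchronous submission returns as soon as the message is queued, whereas synchronous submission blocks until the output filter returns a response.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Behavioral contract of an output filter (hypothetical Java rendering). */
interface OutboundPortSketch {
    void sendAsynchronous(String message);
    String sendSynchronous(String message);
}

/** Output filter that records what it "sends" instead of using HTTP. */
class RecordingPort implements OutboundPortSketch {
    final List<String> sent = new ArrayList<>();

    @Override
    public void sendAsynchronous(String message) {
        sent.add(message);
    }

    @Override
    public String sendSynchronous(String message) {
        sent.add(message);
        return "response-to:" + message;
    }
}

class RouterSketch {
    // In the prototype the output filter is chosen from the routing table;
    // here a single fixed filter is assumed.
    private final OutboundPortSketch port;
    private final BlockingQueue<String> outboundQueue = new LinkedBlockingQueue<>();

    RouterSketch(OutboundPortSketch port) {
        this.port = port;
    }

    /** Use case 1A: enqueue the message and return control to the caller immediately. */
    String submitAsynchronous(String message) {
        outboundQueue.add(message);
        return "accepted";
    }

    /** Use case 1B: no queue; the caller blocks for the duration of the send. */
    String submitSynchronous(String message) {
        return port.sendSynchronous(message);
    }

    /** Plays the role of OutboundQueueProcessor: drains queued messages to the filter. */
    int processQueue() {
        int delivered = 0;
        String next;
        while ((next = outboundQueue.poll()) != null) {
            port.sendAsynchronous(next);
            delivered++;
        }
        return delivered;
    }
}
```

In this sketch processQueue() has to be called explicitly; in a container-managed setting the equivalent step would be triggered when a message arrives in the queue.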

4.3 J2EE Implementation

Here, the J2EE implementation of the architecture is presented. First, the J2EE modules that are used are introduced, and the most important ones, the message router and the external interfaces, are described in more detail.

4.3.1 J2EE Modules

The prototype is divided into multiple J2EE modules. This is illustrated in Figure 17 below.

[Figure 17: UML component diagram showing the modules «web application» console, «web application» ws, «ejb» ws, «web application» rnif, «ejb» rnif, «ejb» router, and «ejb» configuration; the «ejb» ws and «ejb» rnif modules expose the OutboundPort interface.]

Figure 17: J2EE module structure as a UML component diagram

These implementation modules correspond to the conceptual architectural components illustrated in Figure 14. One by one, these modules are:

• Router (EJB archive) provides the message router functionality. This module also handles the internal queuing required for asynchronous message processing, implementing the “OutboundQueue” and “OutboundQueueProcessor” components in the architecture.

• WS (web application) implements the “WS-in” filter, exposing a web-based interface for remote applications

• WS (EJB archive) provides the “WS-out” filter. It exposes the OutboundPort Java interface, through which it is accessed by the message router when any messages need to be sent to WS destinations.

• RNIF (web application) implements the “RNIF-in” filter, analogous to the WS web application

• RNIF (EJB archive) implements the “RNIF-out” filter, analogous to the WS EJB archive

• Configuration (EJB archive) abstracts a set of configuration files in the local file system, providing persistent configuration for the router module

• Console (web application) renders an HTML user interface, enabling a human operator to perform configuration and monitoring tasks on the system


4.3.2 Router Implementation

The router module implements the router component and the related queuing functionality. Routing information is obtained from an XML-based routing table, which is entered manually by the user as configuration information. XPath expressions are used to query the table. Queuing for asynchronous messages is implemented using a JMS-based queue provided by the application server. The queue is persistent, so queued messages are not lost if the system needs to be shut down. This queue is monitored by a message-driven EJB, which sends the messages submitted to the queue to their destinations.
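Querying a routing table with XPath can be sketched with the standard javax.xml.xpath API, as shown below. The table layout (the routing, destination, and rule elements and their attributes), the identifiers, and the URLs are hypothetical, since the prototype's actual table format is not reproduced here; the prototype itself used the Xalan native XPath interface rather than this API. Sender and receiver are bound as XPath variables instead of being concatenated into the expression.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class RoutingTableSketch {
    // Hypothetical routing table: destinations plus rules that select a destination
    // from sender and receiver identifiers.
    static final String TABLE =
          "<routing>"
        + "<destination id='partnerA' filter='RNIF-out' url='https://partner-a.example/rnif'/>"
        + "<destination id='localApp' filter='WS-out' url='http://localhost:8080/app'/>"
        + "<rule sender='111111111' receiver='222222222' destination='partnerA'/>"
        + "<rule sender='222222222' receiver='111111111' destination='localApp'/>"
        + "</routing>";

    /** Returns {filter, url} of the matching destination; both are "" if no rule matches. */
    static String[] route(String sender, String receiver) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Resolve $sender and $receiver from the method arguments.
        xpath.setXPathVariableResolver((QName name) -> {
            switch (name.getLocalPart()) {
                case "sender": return sender;
                case "receiver": return receiver;
                default: return null;
            }
        });
        // Select the destination whose id is named by the rule matching sender and receiver.
        String destination = "/routing/destination[@id = "
            + "/routing/rule[@sender = $sender and @receiver = $receiver]/@destination]";
        String filter = xpath.evaluate("string(" + destination + "/@filter)",
            new InputSource(new StringReader(TABLE)));
        String url = xpath.evaluate("string(" + destination + "/@url)",
            new InputSource(new StringReader(TABLE)));
        return new String[] { filter, url };
    }
}
```

A real router would parse the table once and reuse compiled expressions; the table is re-parsed on every call here only to keep the sketch short.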

4.3.3 WS Implementation

The WS subsystem, consisting of the EJB archive and the web application, implements the WS-in and WS-out filters. The main task of this subsystem is transformation from SOAP messages, with attachments, to unpackaged RNIF business messages and vice versa. This involves transforming business message header metadata to RNIF header documents and copying the business document and the attachments as-is, except for PIP 0A1, which has its own custom data types in the SOAP interface, so the business document must also be transformed in this case. These XML-to-XML transformations are performed using XSLT.

After the transformation, the resulting XML documents are validated using schemas expressed in the W3C XML Schema language. These schemas were created by converting the DTDs issued by the RosettaNet consortium to XML Schema notation and manually inserting additional content restrictions not supported by the DTD mechanism, obtained from the respective message guideline documents. Validating only against the DTD is not enough to fulfill the minimum validation requirements set in the RNIF specification (RosettaNet 2002b, section 2.1.2.2).

The SOAP interface is developed using the JAXM API. JAX-RPC is not used, as the required conversion from a SOAP web service message to an RNIF business message is an XML-to-XML one: an intermediate mapping to Java data types is neither required nor desirable. Still, the web service interface is designed to be suitable for RPC-based access and for use with JAX-RPC clients.
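The transform-then-validate step can be sketched with the standard JAXP APIs. The stylesheet and schema below are hypothetical miniatures written for this illustration and are not the RosettaNet definitions; the sketch also uses the javax.xml.validation package, whereas the prototype validated through the DOM version 3 API of Xerces. The schema carries a content restriction (a non-empty identifier) of the kind that a DTD cannot express.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class TransformAndValidateSketch {
    // Hypothetical stylesheet: maps a WS-style <header><tracking>..</tracking></header>
    // to an RNIF-like <messageTrackingID> element.
    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output omit-xml-declaration='yes'/>"
        + "<xsl:template match='/header'>"
        + "<messageTrackingID><InstanceIdentifier>"
        + "<xsl:value-of select='tracking'/>"
        + "</InstanceIdentifier></messageTrackingID>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Hypothetical schema: the identifier must be a non-empty string.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='messageTrackingID'><xs:complexType><xs:sequence>"
        + "<xs:element name='InstanceIdentifier'><xs:simpleType>"
        + "<xs:restriction base='xs:string'><xs:minLength value='1'/></xs:restriction>"
        + "</xs:simpleType></xs:element>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    /** Applies the XSLT stylesheet to the input document and returns the result as text. */
    static String transform(String input) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(input)), new StreamResult(out));
        return out.toString();
    }

    /** Returns true if the document is valid against the schema, false otherwise. */
    static boolean isValid(String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(XSD)));
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document violates the schema
        }
    }
}
```

For example, transforming a header with an empty tracking element yields a document that is well-formed but fails validation because of the minLength restriction.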


4.3.4 RNIF Implementation

The RNIF subsystem, consisting of the EJB archive and the web application, implements the RNIF-in and RNIF-out filters. This involves packaging/unpackaging the business message, and applying encryption/decryption and digital signatures, as specified in the RNIF specification. Incoming messages are validated using the same W3C XML Schema documents the WS subsystem uses for outgoing messages.

4.4 Client Applications

This subsection describes creating applications that communicate with the prototype and use its services for RNIF-based communication; the purpose of the prototype is to make creating such applications easier. A client application connects to the prototype middleware using the web service interface. Such an application typically concentrates on performing a specific role in an activity defined by a particular PIP specification. A client application can be an adapter, connecting an existing enterprise application to the prototype middleware for the purpose of RNIF-enabling it, or alternatively an enterprise application can be a client without a specific adapter component.

While testing the prototype, several dummy client applications were written. After a general introduction to the structure of these applications, the issue of managing activity state is discussed in more detail, as it proved to be one of the more involved issues in designing these applications. This is followed by some example source code.

4.4.1 Structure

A client application has callbacks for processing incoming business messages and is able to send business messages on its own, so it must implement both the client and the server for the web service interface. Such an application can either initiate a PIP instance by sending an RNIF business message to a remote trading partner, or alternatively wait for an initiating message from the remote partner, depending on whether the application has the initiating role in the PIP. In both cases, the application needs to control the state of the RNIF activity it is currently executing, so it can react appropriately to response messages, including acknowledgments and exceptions. If implemented using J2EE, a client application might consist of a web application implementing the web service server part and another component implementing the web service client.

4.4.2 Activity Management

The prototype provides means to send and receive messages, which are stateless operations. This still leaves the issue of maintaining the state of an RNIF activity, which must then be done by client applications. This might not be very easy, as it involves processing acknowledgments, replies, timeouts, etc. While low-level implementation of timeouts etc. is a significant task in itself, it is very case specific. A high-level approach for implementing the required state management, based on state machines, is presented here, as this is required for all client application implementations.

The approach is presented in the form of generic state machines, which can be implemented in any programming language. Two examples of such state machines are given. These are derived from the RNIF specification (RosettaNet 2002b, section 2.6). These two types of state machines are enough to handle all state management that is required for controlling the types of activities specified in the RNIF specification. A client application can implement either a generic state machine controller, or a specific state machine for running a particular PIP.

As an example of a single-action PIP, for PIP 3A7, Notify of Purchase Order Update, the retry count is defined to be three. The initiator’s state machine for controlling this activity is shown in Figure 18.

Figure 18: State machine of PIP 3A7 initiator’s activity controller as UML statechart diagram
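A controller of this kind can be sketched as a small Java state machine. The sketch is an illustration under stated assumptions rather than a transcription of the statechart: the state and event names are hypothetical, the retry count is taken to mean the number of resends allowed after the initial attempt, and the actual sending and timer handling are replaced by a counter.

```java
/** Initiator-side controller for a single-action activity (illustrative sketch). */
class SingleActionInitiator {
    enum State { AWAITING_ACK, COMPLETED, FAILED }

    private final int maxRetries;
    private int sends = 0;
    private State state;

    SingleActionInitiator(int maxRetries) {
        this.maxRetries = maxRetries;
        send();
        this.state = State.AWAITING_ACK;
    }

    /** Placeholder for submitting the business message through the web service interface. */
    private void send() {
        sends++;
    }

    /** A receipt acknowledgment completes the activity. */
    void onAcknowledgment() {
        if (state == State.AWAITING_ACK) {
            state = State.COMPLETED;
        }
    }

    /** An exception signal from the partner fails the activity. */
    void onException() {
        if (state == State.AWAITING_ACK) {
            state = State.FAILED;
        }
    }

    /** The acknowledgment timer expired: resend while retries remain, otherwise fail. */
    void onTimeout() {
        if (state != State.AWAITING_ACK) {
            return;
        }
        int retriesUsed = sends - 1;
        if (retriesUsed < maxRetries) {
            send();
        } else {
            state = State.FAILED; // a real controller might initiate PIP 0A1 here
        }
    }

    State state() { return state; }

    int sends() { return sends; }
}
```

Events that arrive after the controller has reached COMPLETED or FAILED are ignored, so a late acknowledgment cannot revive a failed activity.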

As an example of an asynchronous two-action activity, the initiator’s state machine for PIP 3A4, Request Purchase Order, is given in Figure 19. Note the added complication caused by the fact that the response message may arrive before or after the acknowledgment; both cases should be treated equally.

Figure 19: State machine of PIP 3A4 initiator’s activity controller as UML statechart diagram

The receiver of any asynchronous message is required to remember the identity of all messages it has “seen” within a reasonable time, in order not to act upon the same message twice, as the transport protocol is allowed to retransmit at will. Duplicate receptions must be acknowledged without any other action. This functionality is not explicitly modeled in the state machines presented.
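Duplicate detection of this kind can be sketched as a small time-windowed set. The class below is a hypothetical helper and not part of the prototype; what identifies a message (for example, the message tracking ID combined with the sender identity) and how long identities are retained are assumptions left to the caller. The current time is passed in explicitly to keep the behavior deterministic.

```java
import java.util.HashMap;
import java.util.Map;

/** Remembers message identities for a limited time to detect retransmissions. */
class DuplicateDetector {
    private final long retentionMillis;
    private final Map<String, Long> firstSeenAt = new HashMap<>();

    DuplicateDetector(long retentionMillis) {
        this.retentionMillis = retentionMillis;
    }

    /**
     * Returns true if the message has not been seen within the retention window and
     * should be processed; returns false for a duplicate, which should only be
     * acknowledged.
     */
    synchronized boolean firstSighting(String messageId, long nowMillis) {
        // Forget identities that are older than the retention window.
        firstSeenAt.values().removeIf(seenAt -> nowMillis - seenAt > retentionMillis);
        return firstSeenAt.putIfAbsent(messageId, nowMillis) == null;
    }
}
```

A receiving callback would call firstSighting(...) before acting on a message and skip its business logic when the result is false, while still returning the status that causes the acknowledgment to be sent.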

4.4.3 Example Source Code Fragment

A code-level example of a client application is presented here, to give a feel for what accessing the prototype looks like at the Java source code level. The following code fragment uses the web service interface of the prototype middleware to send the first message of PIP 3A4, Request Purchase Order. This is the first message that is sent in Figure 2. The XML business message is given as an argument to this code fragment; the fragment generates the necessary headers and sends the message. Note the use of Java classes automatically generated from the WSDL descriptor: although web services are used, the developer is only required to use a Java-based interface.


private void send3A4(final RosettaNetPortType port,
                     final GlobalBusinessIdentifier to,
                     final Source xml)
        throws RemoteException, MalformedFault, ServerFault {
    Role role;
    GlobalPartnerRoleClassificationCode gprcc;
    Service service;
    GlobalBusinessServiceCode globalBusinessServiceCode;

    BusinessMessageHeader header = new BusinessMessageHeader();

    // Leave sender information to defaults
    header.setSender(new EndpointDescriptor());

    // Set recipient information
    EndpointDescriptor receiver = new EndpointDescriptor();
    receiver.setGlobalBusinessIdentifier(to);
    header.setReceiver(receiver);

    // Sender global partner role is hard coded, obtained from PIP specification
    role = new Role();
    gprcc = new GlobalPartnerRoleClassificationCode("Buyer");
    role.setGlobalPartnerRoleClassificationCode(gprcc);
    header.setFromRole(role);

    // Recipient global partner role is hard coded, obtained from PIP specification
    role = new Role();
    gprcc = new GlobalPartnerRoleClassificationCode("Seller");
    role.setGlobalPartnerRoleClassificationCode(gprcc);
    header.setToRole(role);

    // Sender service code is hard coded, obtained from PIP specification
    service = new Service();
    globalBusinessServiceCode = new GlobalBusinessServiceCode("Buyer Service");
    service.setGlobalBusinessServiceCode(globalBusinessServiceCode);
    header.setFromService(service);

    // Recipient service code is hard coded, obtained from PIP specification
    service = new Service();
    globalBusinessServiceCode = new GlobalBusinessServiceCode("Seller Service");
    service.setGlobalBusinessServiceCode(globalBusinessServiceCode);
    header.setToService(service);

    // Use 123 as message tracking ID. Real implementation should generate a unique ID.
    MessageTrackingID messageTrackingID = new MessageTrackingID();
    org.rosettanet.www.InstanceIdentifier instanceIdentifier =
        new org.rosettanet.www.InstanceIdentifier("123");
    messageTrackingID.setInstanceIdentifier(instanceIdentifier);
    header.setMessageTrackingID(messageTrackingID);

    // Business activity identifier is hard coded, obtained from PIP specification
    BusinessActivityIdentifier businessActivityIdentifier =
        new BusinessActivityIdentifier("Request Purchase Order");
    header.setBusinessActivityIdentifier(businessActivityIdentifier);

    // Business action identity is hard coded, obtained from PIP specification
    ActionIdentity actionIdentity = new ActionIdentity();
    GlobalBusinessActionCode globalBusinessActionCode =
        new GlobalBusinessActionCode("Purchase Order Request Action");
    actionIdentity.setGlobalBusinessActionCode(globalBusinessActionCode);
    header.setActionIdentity(actionIdentity);

    // Process identity consists of PIP code and version, obtained from PIP specification,
    // and instance identifier
    ProcessIdentity processIdentity = new ProcessIdentity();
    GlobalProcessIndicatorCode globalProcessIndicatorCode =
        new GlobalProcessIndicatorCode("3A4");
    processIdentity.setGlobalProcessIndicatorCode(globalProcessIndicatorCode);
    processIdentity.setVersionIdentifier(new VersionIdentifier("02.02.00"));
    // Use 456 as PIP instance ID. Real implementation should generate a unique ID.
    processIdentity.setInstanceIdentifier(new InstanceIdentifier("456"));
    header.setProcessIdentity(processIdentity);

    // Send: header we just generated, given XML payload, no attachments
    port.submitAsynchronous(header, xml, null);
}


The above example just sends a message, and does not perform proper activity control such as error handling or retries, which are required from a valid RNIF implementation. A real application should handle these.

4.5 Tools Used in Implementation Here, J2EE tools that were used during the implementation are listed. Except for WebSphere tools, which are commercial products from IBM, all other tools are open-source. •

IBM WebSphere Application Developer version 5.1 was used as an IDE during development

•

The following two application servers were used for running the system: o IBM WebSphere Application Server (WAS) version 5.1, implementing J2EE version 1.3 and J2SE version 1.4 specifications o JBoss version 3.2.3, implementing J2EE 1.3 and J2SE 1.4, was used as a secondary application server to verify generic J2EE interoperability. It was chosen for practical reasons: availability of the software and some previous experience on its use.

•

Following software libraries were used: o Apache Xerces version 2.6.2 XML processor for XML parsing and validation with XML schema languages XML DTD and W3C XML Schema is accessed through its JAXP (Java API for XML Processing) and DOM (Document Object Model) version 3 APIs. DOM version 3 enables in-memory schema validation — this feature is useful for performance reasons when validating dynamically generated XML documents. o Apache Xalan-Java version 2.6.0 XSLT processor for transforming, generating, and serializing XML content is accessed through its JAXP and DOM version 3 APIs, except for its XPath component, which is accessed through Xalan native interface: there is currently no standard J2EE interface with full XPath support including e.g. variables

65

o Bouncy Castle Crypto APIs version 1.22 provide the encryption components required, in addition to those bundled with J2SE 1.3, for SMIME cryptographic operations used in RNIF encryption and digital signing mechanisms o Apache Axis version 1.2 Alpha provides the SOAP runtime. Although still at Alpha stage, this version fixes some critical bugs present in version 1.1. o Apache XML Commons Resolver version 1.1, for OASIS XML catalog support, used to seamlessly replace XML documents available from Internet with corresponding local copies for added performance o Apache Commons HTTP Client version 2.0. The default J2SE 1.3 HTTP client has two main issues: it does not support setting connection-level timeouts, potentially causing applications to wait indefinitely for a TCP connection to open if the initiating IP packet got lost somehow, and it does not allow sending arbitrarily large HTTP requests, as it wishes to buffer the whole request in-memory before sending. The Apache Commons HTTP Client does not have these issues. It is used in the RNIF interface. The WS interface uses an internal HTTP client of Apache Axis. o Apache Commons Pool version 1.1 is used in some situations for pooling objects, i.e. storing references to reusable objects hoping they will be used in the future to save on object creation overhead o JSP Standard Tag Library (JSTL) version 1.0 implementation by the Apache Project provides some useful constructs for creating markup language based user interfaces. It is used in the web console. •

For creating XML-based J2EE descriptors in a vendor-independent manner, the XDoclet 1.2 tool was used at compile time. XDoclet allows a developer to declare attributes for Java classes and methods in source code, for example that a class is an EJB or that a method is an EJB interface method, and to generate the required J2EE XML descriptors automatically at build time. The advantage is that the descriptors are always in synchronization with the source code. This is a vendor-independent alternative to similar tools provided by many vendors in their IDEs.


4.6 Implementation Effort

In this subsection, the effort spent on the various phases of implementation is summarized, followed by a summary of the relative source code sizes of the implemented software components.

4.6.1 Development Time

The development work was done during the period of December 2003–March 2004. The total implementation effort required was on the order of a bit less than three person-months. Very roughly, the division of work into tasks is described in Table 3.

Table 3: Implementation work divided into tasks

Task                                                              ≈ Required effort
----------------------------------------------------------------  -----------------
Requirements gathering, reading related documentation             1 week
Software design                                                   1 week
Installing development environment software                       1 week
Implementing the system                                           6 weeks
Installing test environment software                              1 week
Implementing test applications, testing the system, fixing bugs   4 weeks

Before the project, the developer was very familiar with J2EE, but knew only the general principles of RNIF and web services and had no prior experience with IBM WebSphere products.

4.6.2 Prototype Size

The prototype consists of about 3900 Non-Commented Source Statements (NCSS) of Java code. The division into subsystems is presented in Table 4.

Table 4: Sizes of software subsystems as Non-Commented Source Statements (NCSS)

Subsystem                     ≈ NCSS
----------------------------  ------
Shared code                   750
Configuration                 240
Business message processing   550
Business message router       560
RNIF client/server            550
WS client/server              990
Web console                   270

This table does not tell the full truth about the relative amounts of effort required to create these types of subsystems. When designing subsystems, a piece of functionality cannot always be clearly placed into one subsystem; multiple alternative locations would often be equally good. For example, when designing interfaces between subsystems, some responsibilities can equally well be placed on either side of the interface. The prototype also contains other types of resources, such as schema documents and stylesheets, which reduce the need for Java code and are not included in the figures in Table 4.


5 Testing the Prototype

In this section, the testing that was performed to evaluate the prototype is described. First, the RNIF compliance tests that determine the level of RNIF compliance of the prototype are described: attaining RNIF compliance is a primary requirement for this work. Then, the experiences of the student group that used the prototype are elaborated: the purpose of this testing is to gain some indication of the practical suitability of the prototype. After that, interoperability testing with another RNIF implementation is described: this was done both to obtain additional evidence of the RNIF compliance of the prototype and to see whether any interoperability issues would appear. Finally, the results of performance testing are given, first in terms of the maximum message size that can be transferred, and then in terms of the message throughput obtained, to get a rough overview of what kind of performance can be gained with this type of solution.

For each type of testing, there is an introductory section, a description of the test setup, a section describing the obtained results, and a section summarizing the results.

5.1 RNIF-Compliance

5.1.1 Introduction

Basic RNIF compliance was tested with the RosettaNet Ready™ Self-Test Kit, available from the RosettaNet consortium. The kit itself is a runner for PIP-specific test scripts, which are also available for many PIPs from the RosettaNet consortium. The purpose of this testing is to verify that the prototype can interoperate with other RNIF-capable software at least at a basic level. These tests give some indication of this, as all possible PIP types are tested. Still, not all possible operational situations, such as all combinations of header values, are tested.

5.1.2 Setup

Test scripts for RosettaNet test PIPs 0C1, 0C2, 0C3, and 0C4 were used, as these four PIPs test all four basic types of PIPs: the asynchronous single-action, asynchronous two-action, synchronous single-action, and synchronous two-action, respectively. At the same time, they have a reasonably simple business document, so no unnecessarily complex test applications need to be created.

For the test PIPs, all test cases that tested non-corrupted messages were run. These contained test cases with both the self-test kit and the tested application being the sender and the recipient. Some test cases tested error handling by not sending proper replies, verifying that the application will properly initiate notification of failure. Thirteen test cases were run in total. In addition, encryption, digital signatures, and attachments were tested against the self-test kit in an ad-hoc manner. Interoperability testing was initiated at a very early development stage and was used as a primary testing mechanism for the prototype during its construction.

Specialized test applications were created for the purpose of these tests to implement the test PIPs. The test applications communicated with the prototype using its web service interface. They generated the business documents required by the self-test kit and maintained the activity state.

5.1.3 Results

As more and more functionality was added to the prototype, more and more tests began to pass. Finally, all tests passed. The tests that were run are summarized in Table 5.

Table 5: Test cases tested with RosettaNet Ready™ Self-Test Kit. Test case numbers refer to test scripts available from the RosettaNet consortium.

PIP   Test Case   Status
----  ----------  ------
0C1   0000        OK
0C1   0001        OK
0C1   0002        OK
0C2   0000        OK
0C2   0001        OK
0C2   0002        OK
0C2   0003        OK
0C3   0000        OK
0C3   0001        OK
0C3   0002        OK
0C4   0000        OK
0C4   0001        OK
0C4   0002        OK

5.1.4 Analysis

Based on these tests, it would seem reasonably safe to assume that the prototype system is able to communicate with other RNIF-compliant applications to some degree with all PIP types.

5.2 Practical Usability

5.2.1 Introduction

Practical usability of the prototype system was tested by a student group in the spring 2004 T86.301 (Project Course on ICT Enabled Commerce) course at Helsinki University of Technology. The group explored transforming documents in XML-based SAP IDOC, the proprietary interchange format of the SAP enterprise resource planning system, to RNIF business messages and vice versa. This problem was approached by developing a system that performed mapping from ORDERS05 IDOC documents into RosettaNet PIP 3A4 purchase order request messages, and in the reverse direction from PIP 3A4 purchase order confirmation messages to IDOC documents. The prototype system was employed in this scenario to provide the RNIF messaging portion of the developed system. (Ajalin et al. 2004)

In the context of this project, the purpose of this testing is to obtain some indication of the suitability of the prototype in practice by applying it to a scenario somewhat resembling those in which the prototype is supposed to be used. Although one test case performed in laboratory conditions will not give much indication of how suitable the prototype is for solving problems in general, except if the result is very negative, it will give an example of one scenario to which the system is applicable. At least some indication is obtained of the suitability of the prototype’s interface and its general ease of use.

5.2.2 Setup

The student group consisted of four undergraduate students at Helsinki University of Technology. The students were nearing their graduation, with backgrounds including studies of computer science, automation and systems technology, and industrial management. One student had prior experience of using the IBM WebSphere products, but not with web services.

The student group had a preconfigured development environment, containing workstations equipped with the IBM WebSphere Studio Application Developer IDE, and a server running IBM WebSphere Application Server. The group used the prototype system in a “black-box” manner; the prototype was preconfigured for their needs and the group only employed it through the interface described by a WSDL descriptor.

First, the group got two hours of hands-on training and corresponding written instructions on how to use the prototype and the WebSphere development environment. During development, email support was available for the group. After the implementation phase, the group filed a report (Ajalin et al. 2004) describing their experiences. Then, the group was briefly interviewed in an unstructured manner: first, the group was asked how they felt about the prototype, and then some additional questions were asked about specific issues they had mentioned in their report.

5.2.3 Results

5.2.3.1 System

The system created by the student group is illustrated in Figure 20. As can be seen, all communication with the SAP product is performed through directories on the hard disk: outgoing IDOC documents are placed in one directory by SAP and incoming documents are correspondingly fetched from another directory. The IDOC documents are produced by the SAP Business Connector component, which outputs them in XML format.

[Diagram not reproduced. Labelled elements: SAP, Hard disk, IBM WebSphere, SAP out adapter, SAP in adapter, prototype, Dummy trading partner client, Dummy trading partner service. The connections are labelled SOAP.]

Figure 20: System created by the student group as a UML deployment diagram

The “SAP out” adapter performs conversion from the SAP IDOC format to RosettaNet PIP 3A4. This is an XML-to-XML conversion and is implemented using XSLT. This message is sent to the prototype, which converts it to an RNIF business message and sends it to a “dummy” service, which represents another trading partner. The service eventually returns a valid response to PIP 3A4, which is then routed back to the “SAP in” adapter, which is an inverse of the SAP out adapter. Although the prototype is not interfaced with RNIF in this installation, the purpose of this testing not being to test the RNIF interface, messages are still internally transformed to RNIF business messages in the prototype.

5.2.3.2 IBM WebSphere

The group was satisfied with the quality and features of the IBM WebSphere products and did not have any major issues with them. Development aids, such as graphical tools for editing the various XML descriptors of an enterprise application, were found useful.

The group first applied the WSAD XSLT editor, a tool that allows mapping two XML documents and generating XSLT transformations. It did not always manage to cope with situations where the XSLT document was modified outside the tool, and ultimately did not manage to produce some of the more complex transformations that were required, so it was replaced with another tool. Although tools helped to cut manual XSLT coding effort, the group felt that on the order of 90 % of the effort went to establishing mappings, i.e. analyzing relevant specifications to determine which fields were equivalent, so the saving in effort was not overly significant.

The group found employing web services in WSAD using the web service and web service client generation wizards easy, but they commented that using a Java interface automatically generated from a WSDL descriptor might not have been the best approach in their situation. Since the data from SAP was in XML format and a SOAP message body is XML, an XML-to-XML conversion using XSLT would be the easiest and most straightforward. They felt WebSphere does not provide the same level of support for XML-based messaging over SOAP that is provided for accessing SOAP-based web services in RPC fashion through a Java interface. XML-based messaging is possible in WSAD, but requires the developer to employ SOAP APIs directly without advanced support from the tool.

5.2.3.3 Prototype

The group did not encounter any major problems with the prototype. The training they were given was sufficient; the group only required e-mail support for practical arrangements such as server downtime. They were satisfied with the interfaces and functionality the prototype provided, and found the system reasonably easy to use.

The group decided to leave error handling, i.e. dealing with retries, acknowledgments, etc., outside the scope of their work, as implementing proper error handling in the application was deemed too big an effort. When the idea of including this functionality within the prototype middleware was introduced to them, the group thought this would be desirable.

5.2.3.4 Effort

The group estimates that with this approach, if sufficient knowledge of both the implementation technologies and the domain is available, it is possible to create mappings between an IDOC and a PIP in about a week. This requires that example message content is available.

Creating adapter software components and other programming tasks required roughly two weeks of effort for the group. More effort is required for setting up and testing the system.

5.2.3.5 Applicability of Approach

The group felt that the data in PIP 3A4 messages and in the SAP IDOC was very similar, and because of this, the transformations were straightforward. They suspected this was both because SAP, a founding member of RosettaNet, had already spent some effort in making this type of transformation possible, and because a purchase order is a common and well-understood domain.

The group thought that the RosettaNet PIP 3A4 documentation could benefit from additional concrete examples of what all the fields should contain, to reduce the risk of misinterpretation. In this case, the internal process consisted only of transformations. If there had not been a one-to-one mapping of messages between the application protocol and the business-to-business protocol, the adapters would have been required to control the internal process and probably to maintain some kind of internal state. As pointed out in the group’s report, a more complex solution would be required for this kind of functionality.
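The kind of stateless XML-to-XML mapping performed by the adapters can be sketched with the standard JAXP transformation API. The sketch below is not the student group's code: the stylesheet and all element names are hypothetical, heavily simplified stand-ins for the real ORDERS05 IDOC and PIP 3A4 structures, and are chosen only to show one field-to-field mapping expressed in XSLT.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class IdocToPipSketch {
    // Hypothetical stylesheet: each output field is copied from one input field.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/IdocOrder'>"
      + "<PurchaseOrderRequest>"
      + "<OrderNumber><xsl:value-of select='Header/DocNum'/></OrderNumber>"
      + "<Quantity><xsl:value-of select='Item/Qty'/></Quantity>"
      + "</PurchaseOrderRequest>"
      + "</xsl:template></xsl:stylesheet>";

    /** Applies the stylesheet to the input document and returns the result as text. */
    static String transform(String inputXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(inputXml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(
            "<IdocOrder><Header><DocNum>4500001</DocNum></Header>"
          + "<Item><Qty>10</Qty></Item></IdocOrder>"));
    }
}
```

A mapping of this form keeps no state between messages, which is why it suffices only when each application message corresponds to exactly one business-to-business message.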

5.2.4 Analysis

From this testing, it seems the prototype is reasonably easy to learn, its interface is usable, and the general approach is applicable to at least this type of integration scenario. IBM WebSphere tools can be successfully applied to at least this type of integration scenario without major problems.

5.3 Interoperability

5.3.1 Introduction

The purpose of this testing is to test RNIF interoperability between the prototype system and a commercial product, Microsoft BizTalk Accelerator for RosettaNet version 2.0. Note that version 3.0 of the Accelerator has recently received the “RosettaNet Ready” badge for RNIF compliance from the RosettaNet consortium, indicating that the product has been tested and found RNIF-compliant, so any issues found in version 2.0 might no longer apply to this newer version.

The goal of testing is twofold: to further test RNIF compliance of the prototype system, and to see if any interoperability issues exist even though the two implementations claim to follow the same specification. If any issues are found, it can be analyzed what needs to be changed in the prototype to facilitate interoperation. This testing is not able to show that the two implementations are able to interoperate in all scenarios, but it will give some indication that interoperability is possible.


5.3.2 Setup

Only basic RNIF messaging-level interoperability was tested. This testing was done by having the prototype implementation send the initial business message of PIP 3A4 to Microsoft BizTalk and verifying that it passed all validation checks and that an acknowledgment was received. Similarly, a message was sent from BizTalk to the prototype and it was verified that all validation was successfully passed and the message was acknowledged. This very simple test setup is illustrated in Figure 21.

[Diagram not reproduced. Labelled elements: Trading Partner A, Trading Partner B, IBM WebSphere, prototype, Microsoft BizTalk Server. The connections are labelled RNIF.]

Figure 21: Test setup for interoperability testing

5.3.3 Results

The first issue, found before actual testing, was that BizTalk implemented version Release 02.00.00 of PIP 3A4, dated 19 April 2001, while version Validated 02.02.00, dated 13 August 2002, was implemented by the first version of the test application. The schemas of the two versions were so different that neither could read messages produced by the other. This was solved by downgrading the test application to employ the schema used by BizTalk.

When sending business messages from BizTalk to the prototype system, the first issue encountered was that BizTalk by default does not populate all header elements that are mandatory in the RNIF specification, thereby violating the specification: the message headers were valid according to the relevant RNIF DTDs, but not according to the message guidelines documents. The first test message caused 17 such validation errors to appear. This issue was anticipated, as it was also reported by Laesvuori (2003). The issue was worked around by changing the prototype’s validation of incoming RNIF messages to produce warnings that were logged, instead of aborting message processing. In addition, some default values, used in case of missing header values in the business message, had to be added to the prototype in portions of code that relied on the business message having valid content. After that, messaging worked satisfactorily.
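The workaround described above, logging a warning and substituting a default instead of aborting, can be sketched as follows. This is an illustrative sketch only, not the prototype's validation code: the class name, the header names "UsageCode" and "PartnerRole", and the default values are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LenientHeaderCheck {
    /** Outcome of a lenient check: usable header values plus collected warnings. */
    static final class Result {
        final Map<String, String> headers = new LinkedHashMap<>();
        final List<String> warnings = new ArrayList<>();
    }

    /**
     * Checks that every mandatory header is present. A missing or blank header
     * is recorded as a warning and replaced by its default value, so that later
     * processing steps can rely on the value being there.
     */
    static Result check(Map<String, String> received, Map<String, String> mandatoryDefaults) {
        Result result = new Result();
        result.headers.putAll(received);
        for (Map.Entry<String, String> entry : mandatoryDefaults.entrySet()) {
            String value = received.get(entry.getKey());
            if (value == null || value.trim().isEmpty()) {
                result.warnings.add("missing mandatory header: " + entry.getKey());
                result.headers.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> received = new LinkedHashMap<>();
        received.put("PartnerRole", "Buyer");            // hypothetical header
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("PartnerRole", "Unknown");          // hypothetical default
        defaults.put("UsageCode", "Test");               // hypothetical default
        Result result = check(received, defaults);
        System.out.println(result.warnings);
        System.out.println(result.headers);
    }
}
```

The trade-off is the one noted in the text: processing continues, but the receiver is no longer enforcing the message guidelines strictly.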


When sending from the prototype to BizTalk, BizTalk refused to process a message that had its “Content-Type” headers folded onto multiple lines, which is an optional feature of the HTTP protocol (IETF 1999) and of the MIME specification (IETF 1996). This was fixed by removing such folding from the prototype. In addition, BizTalk seemed to require the “MIME-Version” header to be present in MIME entities other than the top level. This header is not mentioned at all in the RNIF specification in the context of the HTTP transport binding (RosettaNet 2002b section 2.4.2.1). It is declared optional for the top level in the HTTP protocol and it is required by the MIME specification, but only for the top level. Therefore, the proper use of this header is somewhat unclear.
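To make the folding issue concrete, the sketch below shows a folded header block and a helper that joins continuation lines back into single lines. It is only an illustration of what header folding means; the prototype's actual fix was to stop producing folded headers, and the class name and the header values used here are hypothetical.

```java
final class HeaderUnfolding {
    /**
     * Joins folded header lines: a line break followed by spaces or tabs
     * continues the previous header line, so it is replaced by a single space.
     */
    static String unfold(String headerBlock) {
        return headerBlock.replaceAll("\\r?\\n[ \\t]+", " ");
    }

    public static void main(String[] args) {
        String folded =
            "Content-Type: multipart/related;\r\n"
          + "\ttype=\"text/xml\";\r\n"
          + " boundary=\"example-boundary\"\r\n"
          + "MIME-Version: 1.0";
        System.out.println(unfold(folded));
    }
}
```

A receiver that unfolds headers in this way before parsing accepts both forms, whereas a sender that never folds avoids the problem with stricter receivers.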

5.3.4 Analysis

Based on this testing, it can be seen that established commercial RNIF implementations might not always follow the specifications to the letter, and there are areas where correct behavior is not clear. Therefore, it seems safe to assume that interoperability issues such as those described above are possible when communicating between RNIF-capable products from different vendors. The prototype managed to communicate with BizTalk after minor modifications, giving further evidence of the prototype’s RNIF interoperation capability.

5.4 Performance: Maximum Message Size

5.4.1 Introduction

The purpose of this testing is to measure how large business messages can be processed. Most often, the size of transferred business messages is not a major issue, as e.g. PIP 3A4 uses messages that are on the order of 100 kilobytes in size. However, in other applications, such as transferring design documents, business messages of over 100 megabytes might be required (Kotinurmi et al. 2004).

The XML documents that are manipulated need to be loaded into memory at once, as they are accessed as in-memory DOM (Document Object Model) trees, a model in which every XML entity corresponds to a programming language object. Current RosettaNet XML documents are small enough not to place a significant burden on today’s server computers, in which it would be reasonable to assume that an application has on the order of one gigabyte of system memory at its disposal. Attachments are another story, as they are arbitrarily large binary entities. Therefore, the MIME container, which contains the attachments, can also be arbitrarily large.

5.4.2 Setup

It was decided not to try to analyze the memory consumption of only the prototype application, but of the whole application server running the application. Separately monitoring the memory usage of individual applications and the application server would require a detailed analysis of how the application server uses memory, and the results would still be ambiguous, as the application server can, in many situations, be understood to use memory on behalf of the application. Still, the application server also continuously performs unrelated background processing that consumes memory.

In addition, because of the garbage-collected memory management of the Java programming language, the total memory usage grows between invocations of the garbage collector and shrinks by some amount when the garbage collector runs. Perceived memory consumption therefore depends somewhat on the garbage collection algorithm. (Wilson 1992)

These two factors add random variance to the results. This means the figures obtained are very rough, but as relatively large business messages were used, they should give some indication of the scale of the memory consumption. As the prototype is not optimized for memory consumption, this type of testing should be adequate.

There are at least two ways to obtain memory usage statistics from a Java virtual machine. The first is to collect profiling information from the virtual machine using its internal profiling tools. The second is to observe the total peak memory usage as logged by the garbage collector. Both of these end up taking samples of memory allocation at some intervals. The first approach would be able to provide detailed information on how individual components use memory, but it also slows down program execution. As only the total amount was required, the second approach was chosen.

Memory usage was measured just after application server startup, then one PIP 0C1 business message with a large attachment was processed, and peak memory usage was measured. The initial usage was subtracted from the peak value to obtain a figure that would give an indication of the maximum amount of memory consumed by message processing. The results were rounded to the nearest ten megabytes to emphasize their inexact nature.
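The baseline-and-peak procedure can be sketched in code. Note that this sketch swaps in a different technique from the one used in this work: instead of reading peak values from the garbage collector log, it samples heap usage from inside the virtual machine through java.lang.Runtime. The class name, the sampling interval, and the simulated workload are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

final class PeakMemoryProbe {
    /** Heap currently in use by the virtual machine, in bytes. */
    static long usedBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    /** Runs the work while sampling heap usage; returns peak usage minus the baseline. */
    static long peakIncrease(Runnable work) {
        System.gc();                                   // best effort: start from a collected heap
        long baseline = usedBytes();
        AtomicLong peak = new AtomicLong(baseline);
        Thread sampler = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                peak.accumulateAndGet(usedBytes(), Math::max);
                try {
                    Thread.sleep(1);                   // assumed sampling interval
                } catch (InterruptedException e) {
                    return;
                }
            }
        });
        sampler.setDaemon(true);
        sampler.start();
        try {
            work.run();
            peak.accumulateAndGet(usedBytes(), Math::max);
        } finally {
            sampler.interrupt();
        }
        return Math.max(0, peak.get() - baseline);
    }

    /** Rounds a byte count to the nearest ten megabytes. */
    static long roundToTenMegabytes(long bytes) {
        double megabytes = bytes / (1024.0 * 1024.0);
        return Math.round(megabytes / 10.0) * 10;
    }

    public static void main(String[] args) {
        // Simulated workload standing in for processing one large message.
        long increase = peakIncrease(() -> {
            byte[] attachment = new byte[32 * 1024 * 1024];
            attachment[attachment.length - 1] = 1;
        });
        System.out.println("peak increase ≈ " + roundToTenMegabytes(increase) + " MB");
    }
}
```

Like the garbage collector log, this approach only samples memory usage, so short-lived peaks between samples can be missed.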


Both digital signing and payload encryption of the message were enabled. A self-signed X.509 certificate with a 1024-bit RSA key was used for both signing and encryption. The message digest and encryption algorithms were MD5 and 3DES, respectively. The tests were run under the Microsoft Windows 2000 operating system and Sun JDK 1.4.2 with non-incremental garbage collection, under the JBoss application server.

5.4.3 Results

Sending a business message in the WS-to-RNIF direction, with a single attachment of 27 megabytes in size, causes a memory usage peak of about 180 megabytes when going through the system. The other measurements listed in Table 6 indicate that the growth is approximately linear. This is expected, as nothing in the implementation should cause more or less rapid growth. The almost exact linearity of the figures in Table 6 is a coincidence caused by rounding.

Table 6: Peak memory usage rounded to tens of megabytes as a function of message size

Message size   ≈ Peak memory usage
-------------  -------------------
27 MB          180 MB
9 MB           60 MB
3 MB           20 MB
1 MB           10 MB

In the other direction, memory allocation peaks of around 120 megabytes were detected for the 9-megabyte message, so memory consumption is about twice that in the WS-to-RNIF direction. This is also an expected result considering the implementation.

5.4.4 Analysis

Predicting maximum memory usage exactly with empirical measurements is difficult, but some rough approximations can be made. The prototype application consumes on the order of 6–7 times as much system memory as the size of the business message being processed in the WS-to-RNIF direction, and twice that in the other direction. With optimization, these numbers could no doubt be reduced, but the important thing to note is the linear growth of the memory requirement.
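The factors quoted above can be checked against the measured figures: the three larger messages in Table 6 and the 120-megabyte peak observed for the 9-megabyte message in the other direction. The 1-megabyte row gives a ratio of 10, which is consistent with rounding to tens of megabytes. The last line is only an illustrative extrapolation: it assumes that the linear trend continues and takes the one-gigabyte figure from the introduction of this test as the available memory M.

```latex
\begin{align*}
\text{WS-to-RNIF:}\quad & \frac{180}{27} \approx \frac{60}{9} \approx \frac{20}{3} \approx 6.7 \\
\text{RNIF-to-WS:}\quad & \frac{120}{9} \approx 13.3 \approx 2 \times 6.7 \\
\text{largest message for } M = 1000\ \text{MB:}\quad & \frac{1000}{6.7} \approx 150\ \text{MB},
  \qquad \frac{1000}{13.3} \approx 75\ \text{MB}
\end{align*}
```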


5.5 Performance: Message Throughput

5.5.1 Introduction

The purpose of this testing is to measure the throughput, in number of messages per second, of the prototype application in a typical use scenario. In addition, the relative execution speeds of processing steps are measured.

5.5.2 Setup

5.5.2.1 Total Throughput

Overall performance was measured by sending the prototype a series of 1000 PIP 0C1 business messages that each contained one attachment, and measuring the mean execution time per message. The total message size was about 32 kilobytes, including all RNIF headers and packaging. Many business messages that occur in practice are larger than this; for example, PIP 3A4 messages are typically on the order of 100 kilobytes in size. On the other hand, e.g. acknowledgment messages are significantly smaller.

Both digital signing and payload encryption of the outgoing message were enabled. A self-signed X.509 certificate with a 1024-bit RSA key was used for both signing and encryption. The message digest and encryption algorithms were MD5 and 3DES, respectively.

The tests were run on a server with a single AMD Athlon XP 2700+ CPU at 2.17 GHz, running the RedHat Linux 9 operating system and Sun JDK 1.3.1, under the JBoss application server. The system had no other user processes during testing, but the operating system was not specially configured to prevent it from running normal background tasks.

The messages were sent sequentially. Increasing concurrency was attempted in trials conducted when planning the test setup, but this only reduced performance. On a single-processor computer this was expected, as most of the processing done by the prototype, such as XML transformations and encryption, is by its nature mostly CPU-bound [10], and increasing concurrency only increases overhead. During testing, CPU occupancy stayed at the 90–100 % level. A small batch of tests was run before the ones that were measured, allowing the necessary resources to be loaded into memory in advance.

[10] As opposed to I/O-bound: the observed level of performance is limited by the speed of the Central Processing Unit, not by I/O operations. Processes doing mostly internal data manipulation with relatively small data sets fall into this category.

5.5.2.2 Relative Performance

Measuring the relative performance levels of individual portions of the application proved to be surprisingly difficult. First, it was attempted to measure these from a “live” system in the context of the overall throughput testing described previously. This failed, for the following reasons:

• Times are measured using the operating system's internal clock, which typically cannot measure short intervals accurately. The test system was only able to measure time at 10-millisecond precision, while at least millisecond precision is required for these measurements.

• Application servers employ multithreaded execution, and the threads of execution nondeterministically pre-empt each other, producing random variation in observed performance.

• With the garbage-collected memory management model of the Java programming language, if one execution phase of a program is inefficiently implemented and allocates a large number of objects, the performance penalty might appear during the following execution phases, when the garbage collector makes its next pass. A garbage-collection pause can take hundreds of milliseconds, so the effect is significant.

To circumvent these problems, the testing was done in unit-test fashion, running the appropriate modules in a custom test application, with testing parameters and environment identical to the total throughput test. Each test run contained 10 sequential invocations of each test step and only the average time was measured, which brought the desired millisecond precision to the timing. The test application featured only sequential, single-threaded execution, so variance caused by multitasking was reduced. Garbage collection was explicitly invoked before each test run, and enough system memory was allocated that another garbage collection was not necessary before the next test run. The test runs were repeated 100 times, so there were 1000 invocations of each test step.

The minimum, median, mean, and standard deviation of the execution time are reported for each test step. For each step, there is a well-defined minimum value determined by raw CPU speed; larger values are caused by random variations from e.g. other tasks launched by the operating system. Therefore, the minimum might be the best indicator of the true relative execution speeds. As only millisecond accuracy was obtained for timing, values of a few milliseconds or less are not very reliable.
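The measurement procedure described above can be sketched as a small harness: each run times a batch of sequential invocations with a coarse clock, divides by the batch size, and the per-run values are then summarized. This is a sketch under assumptions, not the test application used in this work: the class and method names are hypothetical, and the population form of the standard deviation is assumed because the text does not say which form was used.

```java
import java.util.Arrays;

final class StepTimer {
    /** Summary statistics of per-invocation execution times, in milliseconds. */
    static final class Stats {
        final double min, median, mean, stdDev;
        Stats(double min, double median, double mean, double stdDev) {
            this.min = min; this.median = median; this.mean = mean; this.stdDev = stdDev;
        }
    }

    /** Computes minimum, median, mean, and (population) standard deviation. */
    static Stats summarize(double[] samples) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        double median = (n % 2 == 1)
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
        double sum = 0;
        for (double s : sorted) sum += s;
        double mean = sum / n;
        double squares = 0;
        for (double s : sorted) squares += (s - mean) * (s - mean);
        return new Stats(sorted[0], median, mean, Math.sqrt(squares / n));
    }

    /** Times `runs` batches of `invocationsPerRun` sequential invocations of the step. */
    static Stats measure(Runnable step, int runs, int invocationsPerRun) {
        double[] perInvocationMs = new double[runs];
        for (int r = 0; r < runs; r++) {
            System.gc();                              // collect before the run, as in the text
            long start = System.currentTimeMillis();
            for (int i = 0; i < invocationsPerRun; i++) {
                step.run();
            }
            long elapsed = System.currentTimeMillis() - start;
            // Averaging over the batch refines a coarse clock by the batch size.
            perInvocationMs[r] = (double) elapsed / invocationsPerRun;
        }
        return summarize(perInvocationMs);
    }

    public static void main(String[] args) {
        Stats stats = measure(() -> Math.sqrt(System.nanoTime()), 100, 10);
        System.out.println("min=" + stats.min + " median=" + stats.median
                + " mean=" + stats.mean + " stdDev=" + stats.stdDev);
    }
}
```

With a clock that advances in 10-millisecond steps, a batch of 10 invocations yields per-invocation values in 1-millisecond steps, which matches the precision argument made above.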

5.5.3 Results

Total request processing times and the execution times of phases are reported in Table 7 and Table 8 below. Note that the division of processing into phases is ambiguous, as e.g. extracting a piece of information from a message can often be done in many alternative phases. In addition, per-phase times are obtained by running the phases independently of each other and do not include the overhead, such as communication, associated with moving data between execution phases, unlike the time for total performance, which includes this overhead. Total performance was not measured on a per-request basis, so minimum, median, and standard deviation values are not applicable.

Table 7: Execution times in the WS-to-RNIF direction

Step                                                 Minimum    Median     Mean       Std. deviation
---------------------------------------------------  ---------  ---------  ---------  --------------
Parsing SOAP message                                 8.8 ms     9.2 ms     10.6 ms    4.8 ms
SOAP-to-RNIF conversion                              39.2 ms    41.1 ms    45.2 ms    9.9 ms
Validating RNIF message                              24.0 ms    25.0 ms    30.2 ms    15.4 ms
Routing                                              1.7 ms     1.8 ms     1.9 ms     0.5 ms
RNIF packaging, encrypting, and digitally signing    144.8 ms   149.3 ms   151.7 ms   6.4 ms
Total request processing                             N/A        N/A        307 ms     N/A

Table 8: Execution times in the RNIF-to-WS direction

Step                                                              Minimum    Median     Mean       Std. deviation
----------------------------------------------------------------  ---------  ---------  ---------  --------------
Parsing RNIF message                                              0.4 ms     0.5 ms     0.5 ms     0.1 ms
Verifying digital signature, decrypting, and RNIF de-packaging    25.1 ms    25.8 ms    30.0 ms    7.7 ms
Validating RNIF message                                           36.5 ms    39.9 ms    45.1 ms    16.8 ms
Routing                                                           7.0 ms     7.6 ms     8.8 ms     3.8 ms
RNIF-to-SOAP conversion                                           6.8 ms     8.1 ms     9.4 ms     3.7 ms
Total request processing                                          N/A        N/A        269 ms     N/A

5.5.4 Analysis

Because of the numerous uncertainty factors related to these empirical measurements, this data does not warrant a very thorough analysis. The following very general statements about performance can be made:

• With modern server hardware, the prototype system can attain a consistent message throughput of about 2–4 business messages per second with RNIF messages of about 32 kilobytes in size and cryptography and validation enabled. Given the inexact nature of the testing, it can be said that processing a constant transaction load of one message per second or less should be relatively easily attainable.

• XSLT conversions from SOAP to RNIF format and vice versa are relatively slow, the SOAP-to-RNIF direction even more so, as the conversion is more complex. Custom Java code might be more efficient than XSLT.

• Full W3C XML Schema validation is used for RNIF headers, and it takes some time. This could be disabled for added performance, although this is not allowed by the RosettaNet specification (RosettaNet 2002b, section 2.1.2.2).

• Encrypting and digitally signing a message takes a comparatively large amount of time.
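The throughput range stated in the first point follows from the mean total request processing times of 307 ms and 269 ms measured for the two directions, under the assumption that messages are processed strictly one after another, as in the test setup:

```latex
\begin{align*}
\text{WS-to-RNIF:}\quad & \frac{1000\ \text{ms/s}}{307\ \text{ms/message}} \approx 3.3\ \text{messages/s} \\
\text{RNIF-to-WS:}\quad & \frac{1000\ \text{ms/s}}{269\ \text{ms/message}} \approx 3.7\ \text{messages/s}
\end{align*}
```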


6 Discussion

In this section, the prototype system is first analyzed in terms of the quality of the result and the methodologies that were used. Then, experiences gained during the implementation project on both RosettaNet and relevant tools are reported. Next, the relation of this project to previous work is discussed. This is followed by an analysis of the reliability of the research results, and finally some possible directions for future research are given.

6.1 Prototype System

This subsection discusses the prototype system that was created. First, the quality of the system is assessed, first in terms of the high-level goals set in the introductory section and then in terms of the more detailed requirements that were defined. Then, the benefits of the design approach are evaluated, followed by an evaluation of the whole development approach. Finally, possible future development directions for the prototype system are given.

6.1.1 Prototype Quality

6.1.1.1 High-Level Goals

The prototype was successfully used in one student project that resembles the intended use scenarios as a learning platform. There should be no problem in attempting similar scenarios in subsequent courses. The goal of serving as a learning platform is therefore reached. The second goal, creating a realistic business-to-business integration system, cannot be properly verified, as no testing in real-world scenarios was performed.

As for the additional goals that were set for the prototype in the introductory section, ease of use cannot be verified because of its vague definition. However, the student group did not require any assistance beyond the initial instructions they were given, and did manage to create a working RNIF application, so the interface is usable at least to some degree. The other goals, RNIF compliance, portability, and scalability, are more concrete and are discussed in the context of the requirements below.

6.1.1.2 Requirements

The functional requirements were reached as far as they were validated by the RNIF compliance, practical usability, and interoperability testing that was performed. As for the technical requirements:

1. J2EE was successfully used as the implementation technology.

2. The goal of easy installation cannot be verified because of its vague definition, but it was reached to some degree. Installing the prototype consists of deploying a single binary installation file to the application server and creating the necessary resources, such as messaging queues, using application server specific mechanisms. In addition, some runtime library files related to XML parsers need to be updated in the application server installation. Runtime configuration involves entering the parameters required for RNIF communication through the web-based configuration interface. With the JBoss application server, it would be possible to create an installation package that requires no mandatory configuration during a new installation, except that required by RNIF.

3. Operation with the IBM WebSphere and JBoss application servers was verified on both the Linux and Microsoft Windows 2000 operating systems. Therefore, the goal of generic J2EE interoperability was reached at least for these two application servers. There are no known issues preventing operation with other application servers.

4. The performance goals that were set were reached, as verified by performance-related testing. The issue of performance is discussed further below.

6.1.1.3 Performance

During performance testing, it was observed that the memory requirement of the prototype grows linearly with the size of the business message that is transferred. The J2EE environment can be made to handle arbitrarily large binary entities with a constant memory requirement by using stream-oriented access exclusively. This means that all operations on such data are done incrementally, with only a fixed-size portion of the data in memory at once, while the bulk of the data is kept in a back-end store of larger capacity, e.g. a hard disk.

J2EE provides functionality such as generic stream APIs and stream-oriented file access, so stream-oriented access is not very hard to develop, although it might take some effort to modify an algorithm to be suitable for incremental processing of data. For attachments, the process is trivial, as these are not actually processed by the prototype, except for being copied from one data structure to another. Stream-oriented access support for MIME containers is provided by J2EE.

However, for stream-oriented access to work properly, all software components that access the data must employ it exclusively. This can be an issue especially with software libraries that come with tools, which might take their input from a stream but still buffer the whole data in memory. This is the case with the current version of Apache Axis: some components of Axis do not employ stream-oriented access exclusively, which means that all attachments must be loaded into memory during SOAP message serialization. In addition, the Bouncy Castle S/MIME implementation seems to load the complete data for encrypted message portions into memory. Therefore, although the code developed during this project uses stream-oriented access, the tools employed effectively limit the maximum size of attachments, and thereby of business messages.

With the current implementation approach, it is therefore necessary to impose an upper limit on the size of business messages that can be safely transferred, as running out of memory has a tendency to put the whole application server into an unstable state, often requiring a system restart. If the prototype implementation approach were used as-is, a production-quality system would need to guarantee that the sum of the sizes of the messages in transit at any given moment never exceeds this limit. This might prove difficult in practice. In addition, the maximum size of a business message that can be transferred would be limited to some portion of the available system memory. Therefore, enhancing the tools to require only a constant amount of memory would be highly desirable.

Is the level of transaction processing speed obtained satisfactory? The transaction processing speeds required in practice depend on the use case. To put the numbers into perspective, the testing that was performed suggests that messaging performance begins to be a problem when the load approaches more than one transaction per second. A constant load of one transaction per second during business hours, assuming an eight-hour day, would mean processing 28800 transactions a day, which would be about 14400 business messages excluding acknowledgments. A large business could receive this many purchase orders in a day; most businesses would not. However, it is easy to imagine situations that create much larger transaction volumes, e.g. if RosettaNet is applied to synchronizing distributed databases with “notification” or “query” type messages.

Of course, by applying the distribution mechanisms provided by the application server, additional transaction processing speed could probably be gained. This approach was not attempted in this work. In addition, the prototype was not particularly optimized for speed, although unnecessarily inefficient solutions were avoided. If the messages transferred contain large attachments, this has some effect on transaction processing speed as well; this was not tested.
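The stream-oriented access discussed in this subsection can be illustrated with a minimal sketch of an attachment copy that keeps only a fixed-size buffer in memory, regardless of the attachment size. The class and method names and the buffer size are assumptions made for the example and are not taken from the prototype. The in-memory input stream in main only keeps the example self-contained; in practice the source would be a file or a network stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class StreamCopySketch {
    // Copies all bytes from in to out using a buffer of bufferSize bytes.
    // Memory use is bounded by bufferSize, not by the amount of data copied.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read); // write only the bytes actually read
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("attachment", ".bin");
        // The byte array stands in for a large attachment arriving as a stream.
        try (InputStream in = new ByteArrayInputStream(new byte[1_000_000]);
             OutputStream out = Files.newOutputStream(target)) {
            System.out.println(copy(in, out, 8192) + " bytes copied");
        } finally {
            Files.deleteIfExists(target);
        }
    }
}
```

The benefit is lost as soon as any component along the path collects the stream into a single buffer, which is the behavior reported above for some of the tools.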

6.1.2 Prototype Design

The software architecture of the prototype is straightforward. A notable property is that adding another external interface type, besides web service and RNIF, should require only minor changes to existing components, if any at all. This could be useful in implementing other external interfaces similar to the current web service based one. Changing the prototype to use e-business frameworks other than RosettaNet would require deeper changes, as the unpacked RNIF business message is used as the internal data representation format. In this respect, the current architecture would seem to match the “super standard” approach for converting business documents presented by Buxmann et al. (2002), with the unpacked RNIF business message as the super standard. However, as long as all conversions are to or from RNIF, the “conversion for all cases” approach is effectively matched.

Regarding J2EE design, one feature gained by using multiple J2EE modules is that the prototype can be scaled by distributing these components to multiple processing nodes, or server computers, if such scalability is supported by the application server product in use. A single component can also be distributed to multiple nodes. Due to some low-level implementation issues in the prototype, not all distribution scenarios are equally possible, and some guidelines must be followed. Even when maximum scalability is not required, using J2EE components helps to keep the borders between the functional portions of the application clear.

6.1.3 Development Approach

It has been shown during this work that performing business-to-business integration using these tools is indeed possible. The next question is in which situations the approach taken in this work would be desirable: although it is possible, is developing business-to-business integration systems with generic software development tools cost-effective? Based on this work, no definitive answer can be given. After nearly three months of development effort, the result was a prototype that functioned and was tested to some degree, but that would no doubt still require more extensive testing and additional features before attempting anything resembling production use. Some portion of the development time was spent on learning the tools, so another similar system would require less time, but the investment in effort is still substantial.

On the other hand, the relatively low-level approach that was taken with the prototype, working with entities such as XML parsers and J2EE components, has one major advantage: a general-purpose programming language allows for maximum flexibility. No compromises are required when creating interfaces to external applications, such as the web service interface in the prototype. If the integration case at hand requires some very specialized data mapping to be applied, this is a simple matter of additional programming effort; there are no inherent limitations on what kinds of transformations are possible.

In the context of a single integration case, if an off-the-shelf commercial integration tool is available that satisfies all requirements, that tool should probably be preferred to spending the development effort. However, if the available commercial integration tools would require extensive customization, the situation is no longer so simple: these customizations might well prove to be more difficult than creating a custom system.

If the prototype were developed further and used in several integration cases, for example as an in-house tool for an integration solutions developer, the approach would become more cost-effective and would perhaps provide some competitive advantage over an off-the-shelf commercial tool because of the added flexibility. By spending enough effort, it would also be possible to develop the prototype into a “shrink-wrapped” integration tool.

6.1.4 Improvement Possibilities

In its current state, although the prototype system works as expected, it must still be classified as a research prototype rather than an integration product. Some possible approaches to enhancing the prototype are listed here.

Configuration interface. The user should probably not be required to employ e.g. XML documents to configure trading partners and routing; a web-based configuration interface should be developed to perform these tasks.

Logging. Incoming PIP messages should be stored persistently, e.g. on hard disk, for monitoring and non-repudiation purposes. This could easily be implemented in the message router.

Validation. The prototype validates all incoming and outgoing RNIF header documents using a W3C XML Schema, which should be able to catch most errors that can be detected without any additional information about message content. Still, the error messages produced by this validation do not always make much sense to the user of the web service interface, as they refer to the transformed RNIF XML documents instead of the original SOAP message. Either another XML Schema should be created that would independently validate the SOAP messages, or custom Java code should be added to perform such validation.

Exception handling. Currently, the prototype does not provide any specific support for sending RNIF exceptions, although they may be sent using the generic business message sending mechanism. As exceptions are mandatory in some situations, support similar to what currently exists for executing PIP 0A1 should also be added for exceptions.

Activity management. It would be possible to manage RNIF activity state on behalf of client applications. As this state can be modeled using state machines, it would be possible to devise a software layer that sits between a client application and the stateless operations and maintains these state machines on behalf of the application. The easiest, naïve approach would be to implement the state machines in separate threads of execution, but this would result in a very large number of concurrent threads, which would consume too many system resources on many operating systems. A more practical solution might be to store the state of the state machines in persistent storage, such as a relational database. The system would need to consult this database for every incoming message and implement some form of timer functionality for the time-based transitions. The operations exposed to applications would involve starting the message sending process and reporting the results, instead of requiring the applications to deal with individual messages and the associated timeouts, retries, etc.

PIP-specific services. The method that is used to generate the RNIF business message headers using custom data types in the web service interface could also be used to define content for PIP business documents; this is already done for PIP 0A1, Notification of Failure. This would enable the creation of business documents using an RPC-based API without the need to employ XML. Whether or not this is desirable depends on the context: if the data is readily available in XML format, an XML-to-XML conversion would probably be preferable, whereas if the business message is being generated from scratch, an RPC-based web service API could be preferable, as it would allow more compact client code.
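The activity management idea above, with state kept in a store and a single timer instead of one thread per activity, could be sketched roughly as follows. This is only an illustration under stated assumptions: the class, the three states, and the retry policy are hypothetical and much simpler than the activity model of RNIF, and a HashMap stands in for the relational database. The current time is passed in explicitly so that the time-based transitions are easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

class ActivityManagerSketch {
    // Hypothetical, simplified states; real RNIF activities have more.
    enum State { AWAITING_ACK, COMPLETED, FAILED }

    // One record per sent message; in a real system this would be a database row.
    static final class Activity {
        State state = State.AWAITING_ACK;
        int attempts = 1;
        long deadline;
    }

    private final Map<String, Activity> store = new HashMap<>();
    private final long timeoutMillis;
    private final int maxAttempts;

    ActivityManagerSketch(long timeoutMillis, int maxAttempts) {
        this.timeoutMillis = timeoutMillis;
        this.maxAttempts = maxAttempts;
    }

    // Called when the application starts sending a business message.
    void messageSent(String messageId, long now) {
        Activity activity = new Activity();
        activity.deadline = now + timeoutMillis;
        store.put(messageId, activity);
    }

    // Called for every incoming acknowledgment; looks up the stored state.
    void acknowledgmentReceived(String messageId) {
        Activity activity = store.get(messageId);
        if (activity != null && activity.state == State.AWAITING_ACK) {
            activity.state = State.COMPLETED;
        }
    }

    // Called periodically by one timer; handles all time-based transitions.
    void tick(long now) {
        for (Activity activity : store.values()) {
            if (activity.state != State.AWAITING_ACK || now < activity.deadline) {
                continue;
            }
            if (activity.attempts < maxAttempts) {
                activity.attempts++;               // the message would be re-sent here
                activity.deadline = now + timeoutMillis;
            } else {
                activity.state = State.FAILED;     // retries exhausted
            }
        }
    }

    State stateOf(String messageId) {
        Activity activity = store.get(messageId);
        return activity == null ? null : activity.state;
    }

    public static void main(String[] args) {
        ActivityManagerSketch manager = new ActivityManagerSketch(1000, 2);
        manager.messageSent("msg-1", 0);
        manager.tick(1000); // first timeout: retry
        manager.tick(2000); // second timeout: give up
        System.out.println(manager.stateOf("msg-1"));
    }
}
```

With this structure the number of threads stays constant no matter how many activities are open, at the cost of a store lookup for each incoming message and a periodic scan for expired deadlines.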

6.2 Experience from Prototype Implementation

In this subsection, experience gained from the implementation phase is discussed: first, experience of the RosettaNet specifications, and then of the available tools.

6.2.1 RosettaNet Specifications

6.2.1.1 Validation

During this work, it was noted that the approach taken in the RosettaNet specifications to specifying the structure of XML documents, DTD files accompanied by message guideline documents, might not be optimal. A DTD accompanied by message guidelines is not suitable for automatic generation of validators that would satisfy the minimum validation requirements placed in the RNIF specification (RosettaNet 2002b, section 2.1.2.2), so such validators need to be created manually. Creating validators manually is error-prone, as it leaves room for personal interpretation of the specifications and for plain erroneous implementation.

Because a formal partial validation mechanism, DTD, exists, it seems some developers feel stricter validation to be optional, although it is required by the RNIF specification. This became apparent during the project, as an established commercial product allows generating messages that pass DTD validation but are otherwise malformed. If communicating parties need to be prepared to process such malformed messages, this leads to a situation where everyone involved must turn validation features off, and much of the benefit of having message guidelines is lost.

Therefore, it would seem to be advantageous if the RosettaNet specifications started to employ W3C XML Schema, or some other widely used XML validation mechanism with more expressive power than DTDs. Although such a formalism might not be able to capture all the conditions that need to be placed, it is clear that better alternatives than DTD exist for this purpose. It would also be desirable for developers of commercial RosettaNet-enabled software packages to employ stricter validation for outgoing messages, so that malformed messages cannot be sent.

6.2.1.2 PIP Specifications

It was observed that existing software packages might employ various older versions of PIP specifications that are incompatible with the most up-to-date versions. Therefore, one has to be careful in selecting which versions to implement. If communication between multiple trading partners implementing different versions of the PIP specifications is desired, transformations between these PIP versions need to be derived and maintained, which requires additional effort.

The lists of enumerated values in the PIP specifications raised some questions. For example, during the project, a new version of PIP 3A4 was released, adding, amongst other things, new global partner classification codes and “Telecommunication Industry” as a new global supply chain code. However, no such change has so far been applied to e.g. the PIPs of the testing cluster. Until all relevant PIPs are revised, only a subset of the PIPs is usable by an organization that requires this new metadata. After that, some trading partners might still validate against an older version of the PIP specification and refuse to process these new metadata values, although the rest of the information might be valid. Relaxing validation would violate the RNIF specification (RosettaNet 2002b, section 2.1.2.2). This kind of arrangement guarantees maximum compliance by all trading partners, but transitional periods to new PIP versions might be a bit problematic.

The student group testing the prototype felt that at least the PIP 3A4 specification could have benefited from additional examples of how the various fields should be used in practice.

6.2.1.3 RNIF Specification

The RNIF specification is extensive and supplies most of the information that is required to implement it. One weaker part of the specification that was detected was error handling: how exactly errors should be reported in all situations was left a bit vague. In many places, the specification says that a notification of failure should be initiated, but it is not always very clear in the context of which message the 0A1 PIP should be initiated and which fields should be copied from that message. The general guidelines are there, but it seems that the exact syntax to use in all cases is not specified in either the RNIF specification or the specification for the 0A1 PIP. In addition, there was one clear incompatibility: the set of allowable global partner classification codes in the PIP 0A1 documentation was smaller than, for example, that of PIP 3A4, and it was not clear what value should have been used in this case. However, perhaps the failure notification message content is not overly significant.

6.2.1.4 Namespaces

One slightly complicating issue had to do with the RosettaNet namespace definitions. Some XML elements with the same local name that were defined in both the RNIF specification and the business dictionary were seemingly supposed to represent the same data, but in other cases the data was different although the local element name was not. This made creating documents that employ elements from both specifications a bit bothersome.

6.2.2 Tools

6.2.2.1 Web Services

Apache Axis is by no means a finished product, as its own documentation states. For example, the latest “official” version at the time of development, 1.1, left some required classes out when generating JAX-RPC compliant source code but still referred to these missing classes in the generated code, resulting in code that did not compile. Leaving debug logging on in the very latest version, 1.2 Alpha, caused SOAP attachments to be added multiple times, which corrupted the message. After some switching back and forth, 1.2 Alpha was deemed the more usable of the two.

Although the goal of this research was not to evaluate web service interoperation, some experience of it was gained: as the prototype did not use any IBM-specific solutions, Apache Axis was used as the SOAP runtime instead of the native runtime of the IBM WebSphere Application Server. However, the applications used to test the system employed the IBM runtime.

The main issue that was found is that the WSDL descriptor needs to be created with care. The two runtimes sometimes employ different portions of WSDL; e.g. Axis uses the “style” attribute of “soap:operation” elements, while the IBM runtime requires that the attribute be defined on the corresponding “soap:binding” element. Another issue was that validation is done to a different degree: Axis is quite happy to render a Java interface for WSDL descriptors that are seriously malformed, as long as it can somehow obtain the information it requires. Axis can produce a usable interface even when e.g. the “schema” elements have a wrong namespace definition, in which case the whole descriptor does not make much sense. The IBM runtime does a more thorough validation and does not attempt to process seriously malformed descriptors. Therefore, the fact that a WSDL descriptor works with one runtime is no guarantee that it will work with others. An independent validation of the descriptor should be performed, or alternatively interoperability testing should be done with all SOAP runtimes that are to be supported.

When creating WSDL descriptors, the limits of the RPC convention in use, such as JAX-RPC, should be observed from the beginning. The tools differ slightly in the details of the mapping for the parts that are left undefined, or optional, in the specification, such as the processing of the W3C XML Schema “choice” content model, which is optional in JAX-RPC version 1.1.

The WSDL descriptor of the web service interface was crafted by hand, not generated by tools. This is for two reasons. First, the descriptor shares XML data elements with the RNIF specification and the RosettaNet business dictionary whenever appropriate: this both facilitates easier adoption of the syntax and has a minor performance benefit, as trivial XML element conversions are avoided. WSDL generation tools cannot know when they should share elements. Second, there are some choices between ways of expressing equal XML structures in W3C XML Schema syntax, mainly having to do with the XML Schema type hierarchy. These choices do not directly affect the validity of the XML, but by optimizing them, one can achieve a more intuitive data model with tools that create RPC wrappers for XML syntax, such as JAX-RPC tools. For example, for a human modeler it is obvious that elements named “fromRole” and “toRole”, having identical content, should both be defined using the same type, which could be named “role”. Tools cannot make these kinds of decisions.

Once everything was working and all problems were solved, the web service technology started to live up to its promises. When generating a new client application, all the developer had to do was to check the URL of the WSDL descriptor on the prototype installation that was to be used, enter it into the WebSphere Studio Application Developer web service client creation wizard, and click “next” a couple of times. This produced a JAX-RPC compliant Java interface that the developer could start applying.

It became apparent in the context of the student project that some current Java-based web services implementations concentrate more on providing RPC-style web service operations in this manner, through JAX-RPC, than on XML-based messaging operations, although the same web services interface can often be used in both ways. The RPC style is perhaps easier for programmers who do not know XML, and it might be less error-prone and easier to maintain because more functionality is automated by the tool. On the other hand, applying XML-based messaging might be preferable when sending content that is readily available in XML format, and developers very familiar with XML might find this approach easier than the RPC-based one.

The automatically generated JAX-RPC compliant code often does not exactly match what a human modeler might have designed. The most obvious difference is that XML modelers have to rely on the type system to provide all validation, while Java programmers sometimes prefer more generic data types, such as “String”, with custom validation code, for added efficiency and ease of use. Therefore, a Java programmer might find the large number of classes in an automatically generated Java interface, corresponding to XML Schema data types, a bit annoying. Still, this is somewhat a matter of personal preference and of minor importance.

6.2.2.2 XML Tools

Although the XML tools, such as Apache Xalan and Xerces, have been around far longer than the web service runtimes, some subtle bugs were found in them as well.
These tools were updated to new versions a number of times during the project to fix minor issues that had appeared, such as an earlier Xalan version not being able to save the result of an XSLT transformation to an in-memory tree representation. It might not always be possible to update the XML tools in a J2EE application server, as the server itself employs the same tools: if the two versions differ in some functionality that the application server relies on, an update might not be possible. Some workarounds were required when developing the prototype to solve these issues.

6.2.2.3 XSLT

During the project, the web service interface was refactored from a Java-based implementation to XSLT. At first, it was thought that XSLT would not be suitable, as the required conversion features multiple source documents and some configurable default values, and therefore the first version of the conversion was implemented using Java code and XPath queries. After the initial implementation, it was observed that the resulting transformation could just as well be realized using XSLT, if the XSLT processor was passed additional information: the default values using transformer parameters and the multiple source documents using a custom URI resolver component. Therefore, the transformation was redone using XSLT, and as assumed, the XSLT representation of the transformation proved to be shorter and easier to understand and maintain than the original Java-based one.

6.2.2.4 Application Server Interoperability

Interoperability between application servers is an interesting sub-problem in itself. Although this has relatively little to do with the main research problem, an attempt was made to make the prototype work in multiple J2EE environments, IBM WebSphere Application Server and the open-source JBoss, to see how easy this would be to accomplish. Initially, it was assumed that there would be some incompatibilities between the two application servers, but that only minor workarounds would be required to circumvent them. This proved to be the case, more or less.

The requirement imposed by application server interoperability was that no vendor-specific extensions to the J2EE specification would be used. In practice, this also meant refraining from using development aids, such as some of the “wizard” style functionality available in the IBM WebSphere Studio Application Developer product, and relying on plain Java code and XML-based descriptors instead. The XDoclet tool proved to be able to generate the necessary J2EE descriptors automatically in a vendor-independent manner, so this did not cause a significant increase in total effort.

Some minor interoperability problems were found during development: cases where the application servers implemented some functionality differently. These were typically situations where the exact behavior was not defined in the relevant specifications. It was always possible to circumvent these incompatibilities by changing the code not to rely on the undefined behavior. This did slightly increase the required development effort. Overall, the only application server specific customization that was required in the final implementation was done in the build files, where the set of vendor-specific descriptors that is generated is adjusted per build. No duplicate vendor-specific code was required because of multiple application server support.


Still, it can be assumed that some experience is required from the developer to develop in this manner. This is both because vendor-specific development aids cannot be used, requiring the developer to be familiar with more low-level mechanisms such as XML-based descriptors and plain Java code, and because one has to recognize non-portable vendor-specific extensions when they appear, which requires some general knowledge of the J2EE specification.

6.2.2.5 IBM WebSphere

Two IBM WebSphere products were employed in the project: WebSphere Application Server (WAS) version 5.1, which was used to run the prototype system, and WebSphere Studio Application Developer (WSAD) version 5.1, which was used as the IDE in development work. When development started, only WAS version 5.0 was available, and because of that, only J2SE version 1.3 functionality could be used. Having J2SE version 1.4 available would have saved some effort, as 1.4 provides some convenient new APIs, such as generic URI operations.

WSAD is, to some degree, a vendor-specific IDE, meaning that it is primarily targeted at creating applications that run inside WAS. Because application server interoperability issues had to be considered, as described above, the more advanced features of WSAD could not be applied, as they tended to generate code that would run only on IBM products. WSAD use was therefore limited to that of a generic J2EE programming environment. In that role, the tool served well: although the more advanced features were not used, the features that helped in performing basic programming tasks resulted in direct savings in effort. An example of this is the dynamic code correctness checking, which made the need for recompiling the software rare. This was a good thing, as a full build took about one minute on the development machine.

WAS also performed well, although there were some issues with installation, mainly because the Linux operating system version used was not officially supported. After installation, no major issues were detected. The web-based administration interface proved to be convenient to use. A minor issue was that deploying applications that were not created using WSAD to WAS required a separate deployment step with the “ejbdeploy” tool, which took around 7 minutes for the prototype on an IBM ThinkPad R30. For some reason, deploying from WSAD was a lot faster. If there is no way to work around this, WAS might not be very suitable for developing applications without also using WSAD.

The student group that tested the prototype was also mainly satisfied with both WAS and WSAD. Perhaps the most important issue was that the tool's support for XML-based SOAP messaging was less advanced than its support for RPC-based messaging.

6.2.2.6 Open-Source Software

Perhaps an interesting thing to note is that if the JBoss application server had been used instead of the WebSphere products, the prototype could have been created using open-source software exclusively, with zero investment in third-party software.

6.2.2.7 RosettaNet Ready™ Self Test Kit

The first thing to note about the self-test kit is that it is a bit painful to install. It has to be installed to multiple fixed file system locations and TCP ports, which might already be reserved on many systems. Once the kit is running, it performs quite nicely. It does have some minor documented limitations, such as refusing to accept a namespace declaration that is optional but valid in RNIF header XML documents. If the software to be tested can be modified in minor ways to accommodate these limitations, the self-test kit is very usable both in the early stages of development and as a regression testing tool in later stages.

With the test scripts provided by the RosettaNet consortium, the self-test kit can only do generic PIP testing, so at some point of development it must be complemented with application-specific testing. It is also possible to write custom test scripts for the self-test kit. These consist of the request business document and metadata, and of the fields that are checked in the response. This approach was not attempted during this project.

6.3 Relation to Previous Work

The relation of this work to previous work is analyzed here, first in terms of implementation methodology, and then by comparing the prototype system that was created to previously reported research and commercial implementations.

6.3.1 Architecture and Implementation Methodology

As the research problem of RNIF software implementation was not specific enough to start constructing a solution, previous experience from the NetData project (Kotinurmi et al. 2004) and system architectures described in the literature (Chan et al. 2002, Pendyala et al. 2003, Shim et al. 2002) were used to further define the problem. No single approach that could have been followed when designing and implementing the prototype was found in the literature. Instead, suggestions for implementing individual portions of the prototype system were found in the literature and applied.

The prototype uses J2EE as its main implementation technology, as do many previously reported systems, e.g. Sundaram & Shim (2001), Buxmann et al. (2002), Nurmilaakso et al. (2002), and Shim et al. (2002). XML-to-XML transformations in the prototype apply the XSLT-based solution presented by Buxmann et al. (2002), which is also applied by Nurmilaakso et al. (2002), Pendyala et al. (2003), and Kotinurmi et al. (2004). The software design of the prototype system applies architectural styles listed by Garlan & Shaw (1994). To some degree, it follows the general modular principle applied by Sundaram & Shim (2001) and Nurmilaakso et al. (2002): reusable modules that are dynamically combined to form pipelines. However, in the prototype implementation, the types of modules are limited to input and output filters, and the module architecture is not fully generic, as this was not deemed necessary. Unlike in the two papers mentioned, the purpose of this work was not to provide enterprise systems integration functionality, e.g. extracting information from or storing it to databases, which is the primary motivation these sources present for having this type of architecture.
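The combination of filter modules and XSLT-based XML-to-XML transformation described above can be illustrated with a minimal sketch using the standard JAXP transformation API. It is not the prototype's actual code: the class FilterPipelineSketch, the MessageFilter interface, the stylesheet, and the order and PurchaseOrder element names are all assumptions made for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.List;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Sketch of a filter pipeline; all names are hypothetical, not the prototype's. */
final class FilterPipelineSketch {

    /** A filter takes an XML document as text and returns the processed text. */
    interface MessageFilter {
        String apply(String xml);
    }

    /** Builds a filter that runs one XSLT 1.0 stylesheet over the message. */
    static MessageFilter xsltFilter(String stylesheet) {
        return xml -> {
            try {
                Transformer t = TransformerFactory.newInstance()
                        .newTransformer(new StreamSource(new StringReader(stylesheet)));
                StringWriter out = new StringWriter();
                t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
                return out.toString();
            } catch (TransformerException e) {
                throw new IllegalStateException("XSLT transformation failed", e);
            }
        };
    }

    /** Runs the filters in order, feeding each one the output of the previous one. */
    static String run(List<MessageFilter> filters, String xml) {
        String current = xml;
        for (MessageFilter f : filters) {
            current = f.apply(current);
        }
        return current;
    }

    // Illustrative stylesheet: maps an invented <order> document to an invented <PurchaseOrder>.
    static final String ORDER_TO_PO =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/order'>"
      + "<PurchaseOrder><Id><xsl:value-of select='id'/></Id></PurchaseOrder>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    public static void main(String[] args) {
        MessageFilter trim = xml -> xml.trim();       // trivial second filter
        List<MessageFilter> pipeline = List.of(xsltFilter(ORDER_TO_PO), trim);
        System.out.println(run(pipeline, "<order><id>42</id></order>"));
    }
}
```

Because every filter has the same text-in, text-out shape, an input or output pipeline is just an ordered list, and a mapping can be changed by swapping the stylesheet rather than the Java code.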

6.3.2 Comparison to Research Implementations

Here, this work is compared to some previously reported business-to-business integration research implementations. The systems discussed are the same ones that were introduced previously. For each system, the areas that share similarities with the prototype system are pointed out. This gives some indication of how a software component such as the prototype could be used in the scenarios described in those papers, and establishes how the functionality provided by the prototype system relates to previously reported implementations.

At the NetData project, the work by Laesvuori (2003) solves the same basic problem as this work, implementing RosettaNet, but the approach is different. While that work consisted of implementing a single integration case using a commercial integration tool, in this work a generic integration tool is implemented using general-purpose software development tools. It would be more accurate to compare the prototype created during this project to the Microsoft BizTalk Accelerator for RosettaNet product applied in NetData.

If a system functionally similar to the NetData system were implemented with the prototype system developed during this project, the RosettaNet adapters would probably be realized as a custom J2EE application, which would communicate with the prototype middleware using its web service interface. Both the prototype and the custom application could be deployed within the same application server instance, although this would not be required. The custom application would expose another web service interface towards the PDM system and perform conversion between these two interfaces. This setup is illustrated in Figure 22.

[Figure 22 diagram not reproduced. Labels: Trading Partner A, Trading Partner B, IBM WebSphere, RN out adapter, RN in adapter, PDM adapter, prototype, PDM System; connections: SOAP, HTTP, RNIF.]

Figure 22: Architecture of a fictional system, functionally resembling the one created in the NetData project, implemented using the prototype, described as a UML deployment diagram

Comparing the NetData architecture with Figure 22 shows that the prototype system and the Microsoft BizTalk server take similar roles. The main difference is that in Figure 22, the RN adapter communicates with the prototype messaging system and leaves final RNIF message composition to it. SOAP is used for this communication, so no relational database or hard disk is required as an intermediate data store in this phase.

A significant portion of the system developed by Nurmilaakso et al. (2002) focuses on enterprise systems integration issues that are not addressed by this work. The messaging as implemented in the prototype corresponds roughly to a subset of the “Messenger” component in that system, although the prototype implementation is probably more complex, as it implements the RNIF specification instead of just sending the business documents. If the same system were built using the prototype developed in this work, and RNIF messages were to be exchanged instead of xCBL, most of the existing system would therefore be used as is, replacing only the messaging-related functionality with the prototype. This approach is illustrated in Figure 23.

[Figure 23 diagram not reproduced. Labels: Trading Partner A, Trading Partner B, IBM WebSphere, prototype, Database, Communication, Application; connections: SOAP, RNIF, HTTP.]

Figure 23: Architecture of a fictional system, functionally resembling the one created by Nurmilaakso et al. (2002), except for utilizing RNIF messaging implemented using the prototype instead of xCBL, described as a UML deployment diagram

Somewhat similarly, in the approach by Sayal et al. (2001), a system such as the prototype could be used to provide RNIF messaging for the Trade Partners Conversation Manager (TPCM) component. Most of the transformations, state maintenance, and similar functionality would still have to be implemented in the TPCM component; only the RNIF messaging portion could be implemented using the prototype. This is illustrated in Figure 24.

[Figure 24 diagram not reproduced. Labels: Trading Partner A, Trading Partner B, External Application (appears twice), TPCM, Workflow Engine, Microsoft BizTalk Server, prototype; connections: RNIF, SOAP.]

Figure 24: Architecture of a fictional system, functionally resembling the one created by Sayal et al. (2001), implemented using the prototype, described as a UML deployment diagram

6.3.3 Comparison to a Commercial Product

There are multiple commercial RNIF implementations. Here, the prototype is briefly compared with the IBM WebSphere Business Integration Connect product, version 4.2.1. This product was chosen because it shares some implementation technologies, such as the WebSphere Application Server, with this project. The IBM product also supports e-business frameworks other than RosettaNet, but regarding RNIF implementation, the task performed by both systems is somewhat similar. Both systems allow the application to send messages using a custom interface. They provide message routing, and do not attempt to provide sophisticated EAI functionality but leave this to dedicated EAI products.

The RNIF support provided by the IBM product is more advanced: it implements, e.g., the activity management approach listed as a future improvement possibility for the prototype. In addition, the IBM product features a sophisticated configuration and monitoring interface, unlike the very basic interface provided by the prototype, and offers a wider variety of interfaces for back-end applications. The IBM product is deployed as three separate system processes; in addition, an RDBMS and the WebSphere MQ messaging product are required. It is therefore probably more complex to install and configure than the prototype system, which consists of a single enterprise application. Both systems are scalable using similar principles: software components that can be distributed to multiple servers and that use Java-based RPC or messaging to communicate.

To sum up, the RNIF-related functionality provided by both systems seems very similar, but the IBM product has a significantly higher quality of implementation and many more auxiliary features. On the other hand, the prototype system, being just one application, is more lightweight and therefore possibly easier to deploy in some situations.

6.4 Reliability of the Results

According to the testing that was performed, the prototype system works and satisfies the requirements that were placed on it, to the extent that these goals are verified by the testing. This is probably enough to verify the prototype’s suitability to serve as a learning platform for students. However, care has to be taken when generalizing the results, as testing was not very extensive: for example, only a few PIP types and operational scenarios were tested.

The practical suitability of the system architecture approach taken is a question that cannot be answered based on this work alone. The only indications of this were the testing performed by the student group, in which they applied the prototype system in one integration scenario in a laboratory environment, and some comparison to previous research projects and one commercial product. The student group’s integration problem was probably a relatively easy scenario, and some issues, such as error handling and supporting multiple concurrent process instances, were outside the group’s scope. Therefore, no overly general conclusions about the suitability of this approach should be drawn: the prototype system has not been applied to any real-life problems, and although the prototype works, it cannot be determined whether it is useful in solving them.

The interoperability testing that was performed does not cover all possible scenarios, so interoperability in all operational situations is not verified by this kind of testing. In particular, error handling was tested only in very few cases.

As for performance testing, the prototype was not designed with performance optimization in mind, although an attempt was made not to employ unnecessarily inefficient solutions. Because of this, the testing arrangements were also kept reasonably light, so that the results would give a general overview of observed performance with reasonable effort. In particular, not very much effort was spent on making the test cases as realistic as possible, and each test was performed with only one business message. Therefore, these results should be generalized with caution, but they do give an indication of what level of performance is attainable.

6.5 Further Research

Some areas that would benefit from further research have been identified in the course of this work.

Research on system architectures used in practice in business-to-business integration would be beneficial, to establish generic models of what the roles of various software components could be when performing integration using e-business frameworks. Except for a few case studies, not much information on the subject could be found to form the basis of the system architecture used in this work.

Further research is needed on RNIF software implementation using currently available tools, with emphasis on the practical usability of the entire software solution instead of just demonstrating a partial solution covering a single aspect of the problem. Experience of applying these implementations in real-life e-business scenarios would be especially valuable. Few papers describe the structure of systems that have been designed to meet the requirements of production scenarios, instead of just demonstrating that building such a system is possible.

A study of commercial business-to-business integration software packages, covering the approaches they take on various issues and their relative benefits, could be useful to establish the current state of the art in business-to-business integration and how it corresponds to ideas found in the literature. This kind of material was not found during this work, so it is not entirely clear how this work compares to existing commercial solutions: although a comparison to one commercial product was made, it was very brief and based solely on information supplied by the vendor. It is not clear in what types of situations the current commercial solutions are applicable, and what situations still require custom software development work. In addition, it is not clear how much customization the current commercial products require for each integration case.

7 Conclusions

The need for connecting information systems of collaborating organizations is becoming increasingly common: advantages such as increased speed, efficiency, and reliability can be gained by automating inter-organizational business processes (Shim et al. 2000, Rodgers et al. 2002). To connect heterogeneous information systems of different organizations, business-to-business integration, i.e. facilitating interoperation of disparate information systems, is performed (Medjahed et al. 2003). Business-to-business integration frameworks are generic solutions for performing such integration. XML (Extensible Markup Language) based business-to-business integration frameworks employ XML and the Internet as tools for integration. These are also called e-business frameworks (Shim et al. 2000).

RosettaNet is an industry consortium that maintains the RosettaNet e-business framework, which specifies inter-organizational business processes for multiple industries. These process specifications include the messages that are exchanged between organizations, and the associated messaging choreography. RNIF (RosettaNet Implementation Framework) specifies how these messages are exchanged (RosettaNet 2004).

The purpose of this work was to gather and report experience in implementing RNIF with currently available tools, as this kind of experience is not widely reported in the literature (Nurmilaakso 2003). This was done by developing a prototype, a middleware system that provides RNIF functionality, on top of which other applications can be built with less effort than creating an equivalent RNIF implementation from scratch. J2EE (Java 2 platform Enterprise Edition), a generic component-oriented enterprise application architecture, and web services, a group of technologies for XML-based messaging on the Internet, were used as the primary implementation technologies. The prototype had two high-level goals.
The first was to create a learning platform that students can use to practice RosettaNet-based integration: the platform provides RNIF-compliant messaging functionality, so students can concentrate on higher-level issues, such as the semantics of exchanged messages. The second goal was to create an RNIF implementation that provides services assumed to be useful in a typical business-to-business integration case, to gather experience on building such systems. The research problem was defined with research questions: how RNIF can be implemented in practice, how suitable current tools are and what could be improved in them, how much effort is required, what level of performance can be obtained, and whether there are any interoperability problems with RNIF implementations.

First, a brief literature study was performed. Its most important result was a survey of reported business-to-business integration implementation strategies and case studies. Some system architecture approaches were found (Chan et al. 2002, Shim et al. 2000, Medjahed et al. 2003). The architecture of three previously reported systems (Kotinurmi et al. 2004, Nurmilaakso et al. 2002, Sayal et al. 2001) was studied in more detail. These were later compared against the prototype system, to establish how this work relates to previously reported systems. One commercial product (IBM 2003b) was similarly studied.

During prototype implementation, the tools that were selected were found suitable enough for the task. The only major problem was that some tools, especially those related to web services, did not function perfectly, and bugs in them had to be circumvented. It was learned that web service descriptors need to be created with care, that Java-based web service implementations are not always targeted towards XML-based integration, and that XML-to-XML transformations should be performed using dedicated tools. In addition, minor J2EE tool interoperability problems were found, i.e. cases where tools implement some functionality differently. Based on this work, it is not clear in which situations it would be beneficial to use these relatively low-level tools, with the associated significant implementation effort, compared to obtaining a commercial integration product.

After implementation, the prototype was evaluated by testing it. In interoperability testing, the prototype was found to be able to interoperate with other RNIF-capable systems: the test suite from the RosettaNet consortium and a commercial product (Microsoft 2004). However, there were some interoperability problems with the commercial RNIF implementation.
These were caused by small differences in interpreting the related specifications. The worst interoperability problem encountered was that the commercial implementation could produce business messages that pass DTD (Document Type Definition) validation but do not obey further constraints placed in the related message guideline documents, thereby violating the RNIF specification.

The practical usability of the prototype was tested by a student group. The group successfully used the prototype in one integration project, in which it was deemed to add value by making the implementation of at least some RNIF-capable software systems easier.

According to performance testing, this type of implementation is easily able to handle a constant load of about one business message transaction per second. The tools that were used place some limitations on the maximum size of business messages that can be transferred: memory use grows linearly with message size.
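The gap between DTD validity and message guideline conformance mentioned above can be made concrete with a small sketch. The document, the miniature DTD, and the length rule below are invented for illustration and are far simpler than real PIP DTDs and message guidelines. The point is only that a DTD can fix the element structure but cannot express a constraint such as a maximum field length, so a second check is needed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

/** Hypothetical two-step check: DTD validation first, then one guideline-style rule. */
final class GuidelineCheckSketch {

    // Invented miniature message with an internal DTD subset.
    static final String MESSAGE =
        "<?xml version='1.0'?>"
      + "<!DOCTYPE Order [ <!ELEMENT Order (ProductId)> <!ELEMENT ProductId (#PCDATA)> ]>"
      + "<Order><ProductId>ABCDEFGHIJKLMNOPQRSTUVWXYZ</ProductId></Order>";

    /** Parses with DTD validation switched on; throws if the document is not DTD-valid. */
    static Document parseValidating(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setValidating(true);
            DocumentBuilder b = f.newDocumentBuilder();
            // By default validation errors are only reported; rethrow them instead.
            b.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXParseException { throw e; }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            return b.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not DTD-valid: " + e.getMessage(), e);
        }
    }

    /** Assumed guideline rule: the ProductId text must be at most maxLen characters. */
    static boolean satisfiesLengthRule(Document doc, int maxLen) {
        String id = doc.getElementsByTagName("ProductId").item(0).getTextContent();
        return id.length() <= maxLen;
    }

    public static void main(String[] args) {
        Document doc = parseValidating(MESSAGE);          // passes: structure matches the DTD
        System.out.println(satisfiesLengthRule(doc, 10)); // prints false: 26 characters exceed 10
    }
}
```

A receiver that relies on DTD validation alone would accept this message, which is one way two implementations that both claim conformance can still disagree about whether a message is acceptable.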

It can be said that the prototype reached the goals that were set, as far as they are verified by the testing that was performed. The test setup was not very extensive: for example, only a few PIP types and operational scenarios were tested. The practical suitability of the system architecture approach taken is a question that cannot be answered based on this work alone, although it did work for the student group. In addition, the performance-testing arrangements were very light. In summary, the results have been verified to the degree that the approach taken can be said to be viable, but nothing certain can be said about its general applicability.

The research questions that were posed can now be answered based on these results. With the prototype, one possible method of implementing RNIF was presented. The tools that were used were suitable in general, but some improvement suggestions could be made. The required implementation effort was about three person-months. The performance level easily achievable using these tools was demonstrated, the most significant point being the linear growth of the memory requirement relative to transferred message size. Some interoperability problems with existing RNIF implementations were pointed out; it is clear that some interoperability problems are to be expected, even when interacting with established commercial RNIF implementations.

The main contribution of this work has perhaps been providing some insight into what problems need to be addressed when constructing a software RNIF implementation and showing one possible way to do it, as it seems this kind of information is not commonly available in the literature. In addition, some experience of applying related software development tools, J2EE and especially web services, in the context of RNIF implementation was gained.

Some areas that would benefit from further research have been identified in the course of this work. Research on system architecture models used in practice in business-to-business integration would be beneficial, to establish generic models of what the roles of various software components could be when performing integration using e-business frameworks. Further research is needed on RNIF software implementation using currently available tools, with emphasis on the practical usability of the entire software solution instead of just demonstrating a partial solution covering a single aspect of the problem. Experience of applying these implementations in real-life e-business scenarios would be especially valuable. In addition, as various commercial RNIF implementations exist, a study of commercial software implementations, covering the approaches they take on various issues and their relative benefits, could be useful to determine the current state of the art in e-business integration and how it corresponds to ideas found in the literature.

References

Ajalin, P., Briongos, U., Kesäniemi, A., Tarvainen, O., 2004, SAP R/3 Integration to RosettaNet Processes Using Web Service Interfaces. http://www.soberit.hut.fi/T-86/T-86.161/2004/

BEA, 2003, Introducing BEA WebLogic Integration™ 8.1. http://www.bea.com/content/news_events/white_papers/BEA_WLI_81_Overview_wp.pdf

Björkander, M., Kobryn, C., 2003, Architecting Systems with UML 2.0. IEEE Software, 20 (4), 57–61.

Booch, G., Rumbaugh, J., Jacobson, I., 1999, The Unified Modeling Language User Guide (USA: Addison-Wesley).

Bussler, C., 2002, B2B Integration Technology Architecture. Fourth IEEE International Workshop on Advanced Issues of E-Commerce and Web-Based Information Systems, Newport Beach, California, USA, June 26–28 2002.

Buxmann, P., Díaz, L., Wüstner, E., 2002, XML-Based Supply Chain Management - as SIMPLEX as It Is. Proceedings of the 35th Annual Hawaii International Conference on System Sciences, Hawaii, USA, January 7–10 2002.

Chan, S., Dillon, T., Siu, A., 2002, Applying a Mediator Architecture Employing XML to Retailing Inventory Control. The Journal of Systems and Software, 60 (3), 239–248.

Cohen, F., 2003, Discover SOAP Encoding’s Impact on Web Service Performance. ftp://www6.software.ibm.com/software/developer/library/ws-soapenc.pdf

Damiani, E., De Capitani di Vimercati, S., Samarati, P., 2002, Towards Securing XML Web Services. Proceedings of the 2002 ACM Workshop on XML Security, Fairfax, Virginia, USA, November 22 2002.

ebXML, 2001, ebXML Technical Architecture Specification 1.0.4. http://www.ebxml.org/specs/ebTA.pdf

ebXML, 2002, Message Service Specification version 2.0. http://www.ebxml.org/specs/ebMS2.pdf

ebXML, 2004, ebXML. http://www.ebxml.org

Emmerich, W., 2002, Distributed Component Technologies and Their Software Engineering Implications. Proceedings of the 24th International Conference on Software Engineering, Orlando, Florida, USA, May 19–25 2002.

Engels, G., Groenewegen, L., 2000, Object-Oriented Modeling: a Roadmap. Proceedings of the Conference on the Future of Software Engineering, Limerick, Ireland, June 4–11 2000.

Fujitsu, 2003, Interstage Integration Manager. http://www.fujitsu.com/downloads/INTSTG/datasheets/v6/IM6.0DS_1003_FINAL.pdf

Garlan, D., Shaw, M., 1994, An Introduction to Software Architecture. Carnegie Mellon University Technical Report CMU-CS-94-166.

Hollingsworth, D., 1995, The Workflow Reference Model issue 1.1. Workflow Management Coalition Specification TC00-1003.

IBM, 2002a, Web Services Security (WS-Security) version 1.0. ftp://www6.software.ibm.com/software/developer/library/ws-secure.pdf

IBM, 2002b, WS-Attachments. ftp://www6.software.ibm.com/software/developer/library/wsattach.pdf

IBM, 2003a, Business Process Execution Language for Web Services version 1.1. ftp://www6.software.ibm.com/software/developer/library/ws-bpel.pdf

IBM, 2003b, IBM WebSphere Business Integration Connect Enterprise and Advanced Editions Product Overview Version 4.2.1. http://www306.ibm.com/software/integration/wbiconnect/library/doc/wbic421/pdf/overview.pdf

IBM, 2004, WebSphere Application Server. http://www306.ibm.com/software/webservers/appserv/was/

IETF, 1996, Multipurpose Internet Mail Extensions (MIME). Request for Comments 2045–2049.

IETF, 1998a, S/MIME Version 2. Request for Comments 2311–2312.

IETF, 1998b, Uniform Resource Identifiers (URI): Generic Syntax. Request for Comments 2396.

IETF, 1999, Hypertext Transfer Protocol -- HTTP/1.1. Request for Comments 2616.

Kasanen, E., Lukka, K., Siitonen, A., 1993, The Constructive Approach in Management Accounting Research. Journal of Management Accounting Research, 5, 243–264.

Kotinurmi, P., Laesvuori, H., Jokinen, K., Soininen, T., 2004, Integrating Design Document Management Systems Using the RosettaNet E-Business Framework. 6th International Conference on Enterprise Information Systems, Porto, Portugal, April 14–17 2004.

Laesvuori, H., 2003, RosettaNet Implementation Using Microsoft BizTalk with Emphasis on Exception Handling. Master’s thesis, Helsinki University of Technology.

Lee, D., Chu, W., 2000, Comparative Analysis of Six XML Schema Languages. ACM SIGMOD Record, 29 (3), 76–87.

Medjahed, B., Benatallah, B., Bouguettaya, A., Ngu, A., Elmagarmid, A., 2003, Business-to-Business Interactions: Issues and Enabling Technologies. The VLDB Journal, 12 (1), 59–85.

Medvidovic, N., Egyed, A., Rosenblum, D., 1999, Round-Trip Software Engineering Using UML: from Architecture to Design and Back. Workshop on Object-Oriented Reengineering, Toulouse, France, September 6 1999.

Meyer, B., 2001, .NET Is Coming. IEEE Computer, 34 (8), 92–97.

Microsoft, 2004, BizTalk Accelerator for Rosettanet. http://www.microsoft.com/biztalk/evaluation/rosettanet/default.asp

Newcomer, E., 2002, Understanding Web Services (USA: Addison-Wesley).

Nurmilaakso, J., Kettunen, J., Seilonen, I., 2002, XML-Based Supply Chain Integration: a Case Study. Integrated Manufacturing Systems, 33 (8), 586–595.

Nurmilaakso, J., 2003, XML-Based Supply Chain Integration: a Review and a Case Study. Licentiate thesis, Helsinki University of Technology.

Nurmilaakso, J., Kotinurmi, P., 2004, A Review of XML-Based Supply Chain Integration. (to appear) Production Planning & Control.

OAG, 2001, Implementing OAGIS within the RosettaNet Implementation Framework Version 2.0. http://www.openapplications.org/downloads/whitepapers/frameworks/OAGI_RN_WhitePaper_20.pdf

OASIS, 2004, Technical Overview of the OASIS Security Assertion Markup Language (SAML) 1.1. http://www.oasis-open.org/committees/documents.php?wg_abbrev=security

Oracle, 2003, Web Services Reliability (WS-Reliability) 1.0. http://otn.oracle.com/tech/webservices/htdocs/spec/WS-ReliabilityV1.0.pdf

Pendyala, V., Shim, S., Gao, J., 2003, An XML-Based Framework for Enterprise Application Integration. IEEE International Conference on E-Commerce, Newport Beach, California, USA, June 24–27 2003.

Rawlins, M., 2003, Using XML with Legacy Business Applications (USA: Addison-Wesley).

Rivest, R., Shamir, A., Adleman, L., 1978, A Method for Obtaining Digital Signatures and Public-Key Cryptosystems. Communications of the ACM, 21 (2), 120–126.

Rodgers, J., Yen, D., Chou, D., 2002, Developing E-Business: a Strategic Approach. Information Management & Computer Security, 10 (4), 184–192.

RosettaNet, 2002a, Measuring Business Benefits of RosettaNet Standards. A Co-adoption Model Conducted by the University of Illinois. http://www.rosettanet.org/roistudies/

RosettaNet, 2002b, RosettaNet Implementation Framework: Core Specification 2.00.01. http://www.rosettanet.org/rnif/

RosettaNet, 2004, RosettaNet Corporate Brochure. http://www.rosettanet.org/

SAP, 2004, Interface Repository. http://ifr.sap.com/catalog/

Sayal, M., Casati, F., Dayal, U., Shan, M., 2001, Integrating Workflow Management Systems with Business-to-Business Interaction Standards. HP Labs Technical Report HPL-2001-167.

Shim, S., Pendyala, V., Sundaram, M., Gao, J., 2000, Business-to-Business E-Commerce Frameworks. IEEE Computer, 33 (10), 40–47.

Shim, S., Zeng, Z., Gao, J., 2002, Automatic Generation of RosettaNet Based on Generic Templates and Components. Fourth IEEE International Workshop on Advanced Issues of E-Commerce and Web-Based Information Systems, Newport Beach, California, USA, June 26–28 2002.

Sun Microsystems, 2000, The Java™ Language Specification, Second Edition. ftp://ftp.javasoft.com/docs/specs/langspec-2.0.pdf

Sun Microsystems, 2001a, Java™ 2 Platform Enterprise Edition Specification, version 1.3. http://java.sun.com/j2ee/1.3/docs/index.html

Sun Microsystems, 2001b, Java Message Service. http://java.sun.com/products/jms/docs.html

Sun Microsystems, 2002a, Java API for XML-based RPC (JAX-RPC) 1.1. http://java.sun.com/xml/jaxrpc/docs.html

Sun Microsystems, 2002b, Java™ API for XML Messaging (JAXM) 1.1. http://java.sun.com/xml/downloads/jaxm.html

Sun Microsystems, 2003a, Java™ Platform Enterprise Edition Specification, version 1.4. http://java.sun.com/j2ee/1.4/docs/index.html

Sun Microsystems, 2003b, SOAP with Attachments API for Java™ (SAAJ) 1.2. http://java.sun.com/j2ee/1.4/docs/index.html

Sun Microsystems, 2004, Java Technology. http://www.java.sun.com

Sundaram, M., Shim, S., 2001, Infrastructure for B2B Exchanges with RosettaNet. Third International Workshop on Advanced Issues of E-Commerce and Web-Based Information Systems, San Jose, California, USA, June 21–22 2001.

webMethods, 2003, webMethods Integration Platform. http://www.webmethods.com/PDF/Corporate_Brochures/Integration_Platform_Brochure.pdf

Westarp, F., Weitzel, T., Buxmann, P., König, W., 1999, The Status Quo and the Future of EDI - Results of an Empirical Study. Proceedings of the Seventh European Conference on Information Systems, Copenhagen, Denmark, June 23–25 1999.

Wiederhold, G., 1992, Mediators in the Architecture of Future Information Systems. IEEE Computer, 25 (3), 38–49.

Wileden, J., Wolf, A., Rosenblatt, W., Tarr, P., 1991, Specification-Level Interoperability. Communications of the ACM, 34 (5), 72–87.

Wilson, P., 1992, Uniprocessor Garbage Collection Techniques. Proceedings of the International Workshop on Memory Management, St. Malo, France, September 17–19 1992. Lecture Notes in Computer Science, 637, 1–42.

W3C, 1999a, Namespaces in XML. http://www.w3.org/TR/1999/REC-xml-names-19990114/

W3C, 1999b, XML Path Language (XPath) 1.0. http://www.w3.org/TR/1999/REC-xpath-19991116

W3C, 1999c, XSL Transformations 1.0. http://www.w3.org/TR/1999/REC-xslt-19991116

W3C, 2000a, Simple Object Access Protocol (SOAP) 1.1. http://www.w3.org/TR/2000/NOTE-SOAP-20000508/

W3C, 2000b, SOAP Messages with Attachments. http://www.w3.org/TR/2000/NOTE-SOAP-attachments-20001211

W3C, 2001a, XML Schema. http://www.w3.org/TR/2001/REC-xmlschema-0-20010502/

W3C, 2001b, Web Services Description Language (WSDL) 1.1. http://www.w3.org/TR/2001/NOTE-wsdl-20010315

W3C, 2004, Extensible Markup Language (XML) 1.0 (Third Edition). http://www.w3.org/TR/2004/REC-xml-20040204/

WS-I, 2003, Basic Profile Version 1.0a. http://www.ws-i.org/Profiles/Basic/200308/BasicProfile-1.0a.html

xCBL, 2004, XML Common Business Library. http://www.xcbl.org/

Zisman, A., 2000, An Overview of XML. Computing & Control Engineering Journal, 11 (4), 165–167.

Appendix A

This is a brief summary of the UML notation used in the diagrams of this report.

UML activity diagram notation:

UML component diagram notation:

UML deployment diagram notation:

UML class diagram notation:

UML sequence diagram notation:

UML statechart diagram notation:
