A Domain-Specific Language for Interoperability Between Object-Oriented and Mainframe Systems

Marcos Rodrigo Sol Souza
Unisys Brazil, Belo Horizonte, Brazil
[email protected]

Maria Augusta Vieira Nelson
Institute of Informatics, Pontifical Catholic University of Minas Gerais, Belo Horizonte, Brazil
[email protected]
Abstract

This work presents a domain-specific language (DSL) for integrating object-oriented applications with legacy systems running on mainframes. The DSL offers abstractions that solve recurring problems encountered when integrating these kinds of systems. It gives the developer a semantic interface that allows him to concentrate on implementing the business process requirements without having to worry about the complex computational details of establishing the interaction between systems. The DSL is implemented using the resources of Ruby, a dynamic object-oriented programming language. In this paper we show the syntax of the DSL, its implementation, and a case study of its application.

Keywords: interoperability, domain-specific language, mainframe systems
1. Introduction
Information technology is a key factor in the success of many organizations. The dynamic and competitive environments in which they operate demand ever-greater agility in building and updating systems. Domain-specific languages (DSLs) can help increase development productivity and keep the focus on the elements that really add value.

Many organizations keep important business processes in legacy systems. These systems were often built on closed mainframe architectures. Migrating them to more current platforms is not always of interest (Knight 1994). Such migrations can be costly and demand professionals who understand the legacy system's implementation and who are not always available. In other cases, the mainframe systems simply still fulfill the majority of functional and non-functional requirements. These legacy systems usually live in heterogeneous environments and must coexist with other software systems.

There are several approaches to interoperability between legacy systems and new object-oriented systems. Service-oriented architectures have been used successfully in recent years, unlike less successful earlier attempts such as CORBA (Common Object Request Broker Architecture) and the JCA (Java Connector Architecture) specification. Much of the work of achieving interoperability is related to mapping data and communicating across networks. Specifically, to achieve interoperability between legacy systems and object-oriented systems, one must map untyped text messages that contain the inputs and outputs of programs in the legacy system to classes in the object-oriented system. This activity involves many steps related to type conversion between the two systems.

Building applications that must deal with interoperability issues takes a long time. The tasks are usually repetitive and draw the developers' attention away from the problem domain. Using a DSL to abstract away the complexity of achieving interoperability between object-oriented systems and legacy systems is a good approach for faster and more productive development.

Many legacy systems on mainframes were implemented in COBOL. In the case of Unisys mainframes, interoperability with other systems happens through the exchange of messages. These messages represent a data structure that the COBOL program can understand. The messages must be manually converted to the corresponding object-oriented model. This manual mapping reduces the flexibility to deal with changes in the legacy system: a new process or even a new attribute can have an impact on the object-oriented system that depends on the legacy system.
1.1 SOA approach
Service-oriented architectures (SOA) are becoming a standard for achieving interoperability among complex systems. However, the DSL presented in this work differs from SOA in some aspects. In many computational ecosystems, service buses help with integration and allow great scalability for the environment (Krafzig et al. 2004). SOA usually implies some sort of adaptation in the software systems that wish to publish and consume services.
2008/10/3
These adaptations can be aided by middleware or by direct intervention in the application (Oberle 2006). The use of SOA requires a clear mapping of the business rules and knowledge of where they can be found in the system. This is often not available in legacy systems, because the rules can be scattered across several programs. The approach proposed in this work differs from the service-oriented approach because it sees the legacy system as a black box that has a set of working inputs and outputs, without concerning itself with the business rules behind it (Hurwitz et al. 2006).
2. Providing abstractions for interoperability

A large part of the difficulty in building applications that need to communicate with legacy systems lies in the lack of good interoperability abstractions. Developers spend a long time writing code that performs simple tasks, such as mapping input/output messages to object-oriented concepts, instead of dedicating time to solving the real problems of the application domain. Implementing communication is also an important aspect.

No matter what technologies and standards are used to achieve interoperability, the mapping of data and interfaces between the systems is usually static. This leads to replication of the data representation in both systems, even if transient. The replication has a negative impact on the maintainability of the application, because a change that includes new data in one of the system's interfaces must be replicated across the entire code (Bisbal et al. 1999).

2.1 Message exchange

Communication between programs is done through asynchronous message exchange, as required by the mainframe. These messages carry data in a previously defined format and represent the actions that must be executed in the legacy system. Messages also contain execution parameters.

The main body of the message is composed of a sequence of elements. First comes the program identifier, which in the case of programs managed by the Unisys Transaction Server (COMS) is called TRANCODE, a unique attribute that identifies each program managed by the mainframe central controller. Next comes the action. The action determines what the program must execute when receiving the message. Actions can be of many kinds, from the common CRUD (create-read-update-delete) actions to complex actions such as emitting reports or performing simulation computations. The remainder of the message follows the layout that the program expects to receive. In our case, this is a COBOL segment.

The complexity of this kind of data structure must be handled by the object-oriented program that needs to communicate. This implies that compaction algorithms must be implemented, and that vector or matrix field groups must be interpreted. In COBOL, these field groups are called occurs (Unisys 1999; Unisys 1995).

2.2 Communication protocol

Interoperability demands the definition of a communication protocol between the platforms. Though mainframes have their own communication protocols, such as the Burroughs Network Architecture (BNA), they currently support the TCP/IP protocol natively, which makes it the most natural choice (IBM 1999). In the case of Unisys mainframes, it is also possible to combine TCP with the Custom Connection Facility (CCF). The CCF provides a way to direct packets that arrive at a specific TCP port to a central program. This program directs the message to its destination and provides a reply to the caller.

3. Mapping domain expertise

The main goal of this DSL is to provide an interface to integrate programs without the need for the developer to know about the issues involved in interoperability. He must be capable of integrating a legacy system with his object-oriented system with nothing but the syntax offered by the DSL.

3.1 Abstracting the interoperability complexity

The first step towards arriving at a syntax for the DSL was to describe a common code fragment that performs a simple integration between a legacy program and its representation in an object-oriented system. The object model representing the domain of the program that will be used (Krishnamurthi 2003) needs to be described. Listing 1 shows a class that represents the domain of debits in loan contracts. The programming language used in the listings is Ruby (Ruby doc 2000). The name of an instance variable starts with @ in Ruby.
Listing 1. Problem domain class

    # Debits class with legacy system attributes
    class Debits
      attr_accessor :trancode, :action, :contract_id,
                    :customer_id, :name, :year, :fine,
                    :location_code, :status

      # Instance variables are set in initialize; assignments in the
      # class body would not give each instance its default values.
      def initialize
        @trancode = nil
        @action = nil
        @contract_id = 0
        @customer_id = ""
        @name = ""
        @year = 0
        @fine = 0.00
        @location_code = 0
        @status = []
      end
    end
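As an illustration of the message layout described in Section 2.1, the sketch below assembles and parses a fixed-width text message whose fields follow some of the attributes of Listing 1. It is only a sketch under stated assumptions: the field widths, the `LAYOUT` table and the helper names `build_message` and `parse_message` are hypothetical and do not come from the legacy system, and no compaction is applied. A repeated field group (an occurs in COBOL) is modeled here as a width plus a repetition count.

```ruby
# Hypothetical layout: [field name, width in characters, repetitions].
# The widths are illustrative assumptions, not the real COBOL segment.
LAYOUT = [
  [:trancode, 10],
  [:action, 3],
  [:contract_id, 12],
  [:name, 36],
  [:year, 4],
  [:status, 3, 4] # a field group that occurs 4 times
].freeze

# Builds the String#unpack template, e.g. "A10A3A12A36A4A3A3A3A3".
def unpack_template(layout)
  layout.map { |_name, width, occurs| "A#{width}" * (occurs || 1) }.join
end

# Splits an untyped text message into a hash keyed by field name.
def parse_message(message, layout = LAYOUT)
  values = message.unpack(unpack_template(layout))
  layout.each_with_object({}) do |(name, _width, occurs), record|
    record[name] = occurs ? values.shift(occurs) : values.shift
  end
end

# Assembles a message, padding or truncating each value to its width.
def build_message(record, layout = LAYOUT)
  layout.map do |name, width, occurs|
    slots = occurs || 1
    values = Array(record[name]).first(slots)
    values += [""] * (slots - values.size) # pad missing occurrences
    values.map { |v| v.to_s.ljust(width)[0, width] }.join
  end.join
end

record = { trancode: "V034741001", action: "QRY", contract_id: 1234,
           name: "MARIA", year: 2008, status: %w[A01 B02] }
message = build_message(record)
p parse_message(message)
```

Every parsed value comes back as a string, including `contract_id` and `year`, which is why a typed mapping step is still needed on the object-oriented side.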
Another class would be necessary to map the messages from each direction into the class defined above. Listing 2 shows how this function would be written.

Listing 2. A class to query the legacy system

    class DebitsQuery
      def findByContract(contract_id)

        trancode = "V034741001"
        action = "QRY"
        # to_s allows a numeric contract identifier to be concatenated
        msg_in = trancode + action + contract_id.to_s

        compress_utility = CompressUtility.new

        msg_in = compress_utility.pack(msg_in, 13, 20)
        conn = Mainframe.connection.new :host, :port
        msg_out = conn.send_receive(msg_in)
        mappedObject = Debits.new

        msg_out = compress_utility.unpack(msg_out, 13, 20)

        msg_out = compress_utility.unpack(msg_out, 66, 71)

        # Ranges select the characters between two offsets
        mappedObject.trancode = msg_out[0...10]
        mappedObject.action = msg_out[10...13]
        mappedObject.contract_id = msg_out[13...25]
        mappedObject.name = msg_out[25...61]
        mappedObject.year = msg_out[61...65]
        mappedObject.fine = msg_out[65...66]
        mappedObject.location_code = msg_out[66...71]
        mappedObject.status = Array.new(4)
        mappedObject.status[0] = msg_out[71...74]
        mappedObject.status[1] = msg_out[74...77]
        mappedObject.status[2] = msg_out[77...80]
        mappedObject.status[3] = msg_out[80...83]

        return mappedObject
      end
    end

Finally, a new class could be written to achieve interoperability using the classes defined above. Listing 3 shows these classes being used.

Listing 3. Execution of a query in the legacy system

    class Foo
      def main
        debitsQuery = DebitsQuery.new
        obj = debitsQuery.findByContract(1234)
        puts obj.name
      end
    end

Except for the 7 lines of code in Listing 3, all of the code was written solely for the purpose of interoperability. There is code to create a class representing the interface of the legacy system, code to create connections with the mainframe, code to compress and uncompress messages, and a great deal of code to map data from the mainframe to the object. One of the goals of the DSL is precisely to abstract away most of the complexity of interoperability so that the programmer can focus on implementing the business rules (Fowler 2005). Given this premise, the DSL must provide a way for the programmer to access any program in the legacy system by using only the last code fragment; all the rest must be the responsibility of the DSL.

3.2 Defining a grammar

The next step is the definition of a grammar so that the DSL can provide the data required for interoperability without burdening the programmer (Fernandez 2007). The code fragment that resulted from the previous analysis was examined again to find the functions that the DSL should support, as annotated in Listing 4:

Listing 4. Functionalities of the analyzed code

    class Foo
      def main
        # Creates an instance of the class DebitsQuery
        debits = DebitsQuery.new

        # Executes an action in the legacy system
        # passing an input attribute
        obj = debits.findByContract(1234)

        # Gets and manipulates the attributes returned
        # by the legacy system
        puts obj.name
      end
    end

For the programmer, the DSL should support the following actions:

• Create an instance of the object
• Execute the actions supported by the legacy system
• Inform input parameters for the actions
• Obtain and use the results of executing an action

Based on this analysis, a syntax was defined to simplify the process. First, we define a key that indicates which program must respond to the requests in the legacy system. This key was called trancode. By convention, any method with a three-character name represents the execution of an action in the legacy system. This decision is based on the practices of the work environment where the study was performed, in which it is common to use three letters to name actions. Finally, it was defined that the input and output attributes are represented by their own names. When used on the left-hand side of an assignment, the attribute represents data that must be sent to the mainframe. The same attribute, when used on the right-hand side of an assignment, represents data to be read from the mainframe. Based on this analysis, Listing 4 was rewritten, establishing the syntax of the DSL as shown in Listing 5:

Listing 5. DSL syntax

    # Defines the program in the legacy environment
    trancode :V034741001

    # Defines an input parameter
    contract_id = 1234

    # Executes an action named qry (to query)
    # in the program
    qry

    # Writes the field name, the result of the query
    puts name

4. Implementation strategy

4.1 Formal definition of the DSL syntax

The DSL used in this work was written in the Ruby programming language. Ruby was selected because it is a dynamic language that allows behaviors to be changed by overriding methods of its kernel. Ruby also offers excellent support for metaprogramming. The implementation of the DSL follows the approach known as Class macro. In this approach, the DSL is built from methods in some base classes. Subclasses can use these methods to modify their own behavior or that of their own subclasses (Fernandez 2007). Another benefit of this approach that is used by the DSL is the ability to change an object's behavior dynamically by executing a command.

To use the DSL, the programmer must create a class that inherits from the DSL class. This class contains the grammar defined by the DSL. When an instance of a class that inherits from the DSL is created, the constructor initializes the necessary data structures and creates a connection with the mainframe. Next, it creates the attributes that are required to store the data during the execution of the DSL, which include the current program on the legacy system that the DSL refers to.

The following step is to call the private method that updates the input and output data structures according to the specification found in the domain repository. This repository contains the descriptions of the communication interface of each program from the legacy system. The repository is created automatically by reading reports about the programs available in the mainframe. The repository also keeps information about the typing of the attributes in the legacy system. The DSL uses this information to cast the data to its equivalent representation in the legacy system. The update_internal_structures method is shown in Listing 6.

Listing 6. Updating internal structures

    def update_internal_structures
      @in = @specs
      @out = @specs
      @sorted_in = @in.sort { |x, y| x[1][0] <=> y[1][0] }

      @sorted_in.delete_if { |e| e == nil }
      @sorted_out = @out.sort { |x, y| x[1][0] <=> y[1][0] }

      @sorted_out.delete_if { |e| e == nil }
    end

This method is also called every time the system executes a method that substitutes the current DSL program for another. After all message specifications are assigned, arrays are created with the fields in the order in which they will be sent to and received from the mainframe. Every time the trancode command is executed, the program name received as a parameter is used in a query to the knowledge repository. If there is information about the program in the repository, a specification is returned and used to update the data structures, configuring the current instance to the context of the new program. This command is defined in the method shown in Listing 7.

Listing 7. Macro to identify the context

    def trancode(trancode)
      @specs = Repository.trancodes[trancode.to_sym]
      @trancode = trancode
      update_internal_structures
    end

Once the current program is defined, the programmer can use the other commands from the DSL syntax, which are assignment of a value to an attribute, reading an attribute, and execution of an action. This is implemented by overriding the method Kernel.method_missing. This method belongs to the Ruby language kernel. Once overridden, it intercepts every call to a nonexistent method on the object that inherited from the DSL (Ruby doc 2000).

When method_missing is called, the DSL checks in the current specifications whether the method that failed represents one of the program's attributes. If so, it checks whether the failure was caused by an attempt to assign a value to, or read a value from, a nonexistent attribute. If it is an attempt to read, the value corresponding to the attribute in the data structure that stores the output values is returned dynamically to the caller. If the call is an assignment, the value is assigned to the data structure that stores the input values so that it can be sent later to the legacy system.

This implementation also checks whether the name of the method has only three letters, according to the DSL convention. In this case, it sends the corresponding action to the program in the legacy system, passing as argument a message containing all the data that were assigned by the programmer. When a reply message is received from the legacy system, the message is parsed and the values are dynamically assigned to the corresponding attributes. Overriding Kernel.method_missing is what makes this approach possible and flexible. Listing 8 shows the implementation of this method for this DSL (Ruby doc 2000).

Listing 8. Implementation of method_missing

    def method_missing(methId, *args)
      # verifies if it is a setter (the method name ends with '=')
      if methId.id2name =~ /=$/ &&
         @in[methId.id2name[0, methId.id2name.size - 1].to_sym]

        # stores the value that will be sent to the legacy system
        return @data_in[methId.id2name[0, methId.id2name.size - 1].to_sym] = args[0]

      # verifies if it is a getter
      elsif @out[methId.id2name.to_sym] || @in[methId.id2name.to_sym]
        if @data_out[methId.id2name.to_sym] &&
           @data_out[methId.id2name.to_sym] != nil
          return @data_out[methId.id2name.to_sym]
        else
          if @in[methId.id2name.to_sym]
            return @data_in[methId.id2name.to_sym]
          elsif @out[methId.id2name.to_sym]
            return @data_out[methId.id2name.to_sym]
          else
            raise NoMethodError
          end
        end
      else
        # verifies if it is an action
        raise NoMethodError unless methId.id2name =~ /^(\w{7,10}_\w{1,4}|\w{1,4})$/
        params = {}
        params = args[0] if args.length > 0
        execute(methId.id2name, params)
      end
    end

The overridden method receives as parameters the name of the method that was not found, which represents the attempt to use some DSL command, plus the list of arguments to this command. The system first checks whether a write attempt was made and whether the appropriate attribute exists in the input specification. To detect this, a regular expression tests whether the last character of the received method name is '=', which the Ruby language appends by convention to the name of an assignment method. If so, the value of the first argument to the method is associated with the key of the hash that stores the input attributes. If it is not a write attempt, the system checks whether a read attempt was made. If there is a compatible attribute, its current value is returned. If both checks fail, the system checks whether the method name follows the convention for actions (three letters), in which case it calls a method that sends an execution request to the legacy system. If none of the above conditions is true, the system considers it a syntax error and raises an exception that informs the user that the attempted command is not valid in the current DSL context.

The method responsible for sending the executed action to the legacy system assembles an input message for that system according to the specifications. This message is sent through the connection created by the constructor of the parent object. When a reply to this message is received from the legacy system, it is parsed and the values it contains are assigned to the corresponding attributes.

5. Applying the DSL

We performed a case study consisting of rewriting a real application with the proposed DSL. The chosen application is responsible for simulating real estate loan financing for a financial institution. All business rules for this application already existed in a legacy system, but the client required a web-based interface with high usability, which would allow the institution's customers to do their own loan simulations. This application was originally implemented in the Java programming language, and a large portion of its code was written with the sole purpose of achieving interoperability with the legacy system.

When using this application, the user provides some data such as birth date, income, and some characteristics of the property for which he wishes to acquire a loan. This information is sent to the legacy system, which selects the best type of loan for the client's profile from among the many available types. At first, the legacy system assumes that the conditions for the loan are those that maximize the value of the loan. The results of the simulation calculations are displayed to the user. The user can change some of the conditions for the loan, such as duration and initial value. In this case, he is taken to a screen where a new form for data entry is shown. The changes are sent to the legacy system and new values are returned to the user.

When using the DSL, all of the code related to interoperability was eliminated, making the implementation fully oriented to the problem domain. Listing 9 shows how the DSL can be used to represent the entire flow described above.
Listing 9. Modeling the simulator flow

    class SimulatorFlow < DSL
      # Identifies the program in the
      # legacy environment
      trancode :simulator

      # Initializes attributes
      birth_date = '22/02/1983'
      income = 2000.00
      property_value = 50000.00

      # Executes the action (si1) that corresponds
      # to the first step in the simulation flow
      si1

      # Chooses the first product as the
      # simulation target
      product_code = products[0]

      # Executes the action (si2) that corresponds
      # to the second step in the simulation flow
      si2

      # Writes the installment value of the loan
      puts "Installment value: #{installment}"

      # Changes the duration of the loan
      term = 180

      # Executes the action (si2) that corresponds
      # to the second step in the simulation flow
      si2

      # Writes the installment value after the change
      # of the loan duration
      puts "Installment value: #{installment}"
    end
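Listing 9 depends on the mainframe and on the domain repository, so it cannot be run in isolation. The following sketch is not the implementation described in Section 4; it only reproduces the same dispatch idea against stand-ins so that the flow can be tried locally. `REPOSITORY`, `FakeMainframe`, `MiniDSL` and the attributes `loan_value`, `term` and `installment` are all hypothetical names chosen for illustration, and the installment formula is a placeholder. One assumption deserves attention: in plain Ruby a bare `term = 180` is parsed as a local-variable assignment and never reaches `method_missing`, so the sketch writes attributes with an explicit `self.` receiver.

```ruby
# Hypothetical stand-in for the domain repository: the input and
# output attributes of each legacy program (trancode).
REPOSITORY = {
  simulator: { in: [:loan_value, :term], out: [:installment] }
}.freeze

# Hypothetical stand-in for the mainframe connection. It answers every
# action with a placeholder computation instead of a real message exchange.
class FakeMainframe
  def send_receive(_trancode, _action, data_in)
    { installment: data_in[:loan_value] / data_in[:term] }
  end
end

class MiniDSL
  def initialize(connection)
    @connection = connection
    @specs = { in: [], out: [] }
    @data_in = {}
    @data_out = {}
  end

  # Selects the current legacy program and loads its specification.
  def trancode(name)
    @trancode = name.to_sym
    @specs = REPOSITORY.fetch(@trancode)
  end

  def method_missing(meth_id, *args)
    name = meth_id.to_s
    attribute = name.chomp("=").to_sym
    if name.end_with?("=") && @specs[:in].include?(attribute)
      # Write attempt: keep the value to be sent to the legacy system.
      @data_in[attribute] = args.first
    elsif @specs[:out].include?(meth_id) || @specs[:in].include?(meth_id)
      # Read attempt: prefer the value returned by the legacy system.
      @data_out.fetch(meth_id) { @data_in[meth_id] }
    elsif name =~ /\A\w{3}\z/
      # Three-character name: execute it as an action.
      @data_out = @connection.send_receive(@trancode, name, @data_in)
    else
      super # unknown command: raises NoMethodError
    end
  end

  def respond_to_missing?(meth_id, include_private = false)
    attribute = meth_id.to_s.chomp("=").to_sym
    (@specs[:in] + @specs[:out]).include?(attribute) ||
      meth_id.to_s.match?(/\A\w{3}\z/) || super
  end
end

flow = MiniDSL.new(FakeMainframe.new)
flow.instance_eval do
  trancode :simulator
  self.loan_value = 36_000.0
  self.term = 360
  si2
  puts "Installment value: #{installment}"
  self.term = 180
  si2
  puts "Installment value: #{installment}"
end
```

The script prints an installment of 100.0 for the 360-month term and 200.0 after the term is changed to 180. Unlike the class macro approach of Section 4, the sketch evaluates the flow with `instance_eval` on an instance, which keeps it short at the cost of not matching the DSL syntax exactly.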
6. Conclusion

This work presented a language specific to the domain of interoperability between object-oriented systems running on microcomputers and legacy systems running on mainframes. The legacy applications are important to organizations, because they contain systems that are essential to their businesses. However, the information in these systems needs to be accessed by users through varied interfaces, including the web. This creates a constant need for new applications that must communicate with the legacy systems.

We conclude that using a DSL for interoperability between legacy and object-oriented systems is a good approach to increase development speed and improve the maintainability of this kind of application. Our results encourage the use of the proposed approach, because it was possible to abstract away a great portion of the complexity inherent to interoperability, such as communication, message composition and object assembly. Besides productivity, another important aspect of this approach is the ability given to the developer to describe the transformations in his business processes with greater semantics, using the DSL syntax.

There are a few limitations to the DSL approach presented here. First, the focus of this work was legacy systems developed in COBOL that run on Unisys mainframes. It may be necessary to adapt the DSL to deal with systems written in other languages or running on hardware platforms from other manufacturers. The second limitation is that this DSL is internal to Ruby, so its syntax can never abandon the principles of its host language. However, the gains in productivity, the simple syntax, the transparency of the interoperability and the shift in focus to business rules are significant advances.

References

Ruby doc. 2000. URL http://ruby-doc.org.
Jesús Bisbal, Deirdre Lawless, Bing Wu, and Jane Grimson. Legacy information systems: Issues and directions. IEEE Software, 16(5):103–111, 1999.
Obie Fernandez. Agile DSL development in Ruby. 2007. URL http://tinyurl.com/69ajec.
Martin Fowler. Language workbenches: The killer-app for domain specific languages? June 2005. URL http://tinyurl.com/7ppah.
Judith Hurwitz, Robin Bloor, Carol Baroudi, and Marcia Kaufman. Service Oriented Architecture For Dummies. 2006.
IBM. SNA APPN Architecture Reference. 1999.
Robert Knight. Don't shoot the mainframe. Software Magazine, April 1994.
Dirk Krafzig, Karl Banke, and Dirk Slama. Enterprise SOA: Service-Oriented Architecture Best. Prentice Hall PTR, 2004.
Shriram Krishnamurthi. Programming Languages: Application and Interpretation. Brown University Press, 2003.
Daniel Oberle. Semantic Management of Middleware. Springer, 2006.
Unisys. COBOL ANSI-85 Programming Reference Manual - Basic Implementation. 1995.
Unisys. Transact Services Programming Reference Manual. 1999.