Domain-Specific Modeling Tools as Client Applications Providing the Production of Documents
, Vladimir Dimitrieski2
1 Djukic Software Solutions, Germany, [email protected]
2 Faculty of Technical Sciences, {ivan, dimitrieski}@uns.ac.rs
3 Faculty of Science, University of Montenegro, [email protected]
Abstract. Domain-Specific Modeling (DSM) and the tools supporting it have so far been intended for the construction and use of Domain-Specific Languages (DSLs) in narrow business domains, and for code generation for embedded systems. In this paper we present some of the problems found in the complex domain of document engineering, together with solutions based on DSM. We discuss the usefulness of DSM in document engineering with respect to: (i) simple construction and integration of DSLs used for the automation of various business activities; (ii) fast transformation of abstract models to an arbitrary target language; and (iii) the ability to use modeling tools as client applications for business process management. We also outline some shortcomings of existing DSM tools that prevent their wider application in document and template modeling. The most important of these shortcomings are: (i) the lack of support for the specification of the document layout; and (ii) their inadequacy for the specification of frequent model variations.
1 Introduction
Domain-Specific Modeling (DSM) provides high development productivity, quality and simple maintenance of the source code used to implement embedded system functionalities [13]. Nowadays, hardware is no longer an obstacle for embedded systems to perform complex control functions. Therefore, the DSM methodology has the task of constant self-testing and self-improvement, so as to be able to respond to the challenges of more complex business domains. The following questions need to be addressed in order to improve DSM in an application domain: Is it possible to apply DSM in big and complex systems? Are new methodology steps needed, or do existing ones just need to be modified? How can existing modeling tools be extended so that they can be applied in complex domains? Are there any advantages in using Domain-Specific Languages (DSLs) over using the Unified Modeling Language (UML)? If so, what are those advantages? Can a user of a complex system recognize its full functionality and its ease of use in the early phases of specification? There are many papers answering these questions, mainly in the context of DSM application in narrow business domains [3, 9].
adfa, p. 1, 2011. © Springer-Verlag Berlin Heidelberg 2011
Therefore, we present our experience from complex domains in using existing DSM tools, and in developing new ones, for document engineering. The presentation is directed towards the improvement of document modeling and production processes. Detailed descriptions of research results and corresponding tools are given in [7, 8, 15-24]. This paper is organized in seven sections, including the Introduction. In Section 2 we describe the most important document production problems in directory publishing. Next, we present our experience in the construction, integration and application of DSLs in the modeling of static (Section 3) and dynamic (Section 4) document characteristics. In Section 5 we present a way of extending existing DSM tools so that they can specify the document layout and synchronize with arbitrary human-machine interface (HMI) components and client applications. Throughout the paper we compare the applicability of existing UML and DSM tools with that of custom-built DSM tools used for document engineering.
2 Document Production in Directory Publishing
In some application domains, documents are at the same time base products and a means for verifying the progress of business activities. Those domains are related to so-called document-centric systems, where all five document dimensions are equally important. These dimensions are: content, structure, layout, dynamic characteristics (lifecycle) and meta-data [1]. In those systems, by the notion of a document we denote a formal structure of well-formatted content units (CU), implemented in an electronic format suitable for printing, such as PDF. Directory publishing, whose main activity is the production of electronic and printed advertisements (ads) and telephone books with business ads, i.e. White Pages or Yellow Pages, is an example of a document-centric system. Document content and structure vary from small ads with one line of text to complex books with more than a thousand pages. Ad layout rules may vary depending on the branches, regions and editions of those ads. The production process depends on a large number of parameters. Therefore, a document's lifecycle often changes, and the change affects instances which are not in any of the trivial states (initial and final) [6, 18, 19, 20]. Documents which are not self-identifiable and which do not contain meta-data about all dimensions are less usable, even if used with good document management systems (DMS). When a document is at the same time a base product and a means for verifying the progress of activities, its role is completely different from its role in systems in which reports are created only in the final states of activities. In such systems, the production of documents based on XSL-FO, appropriate tools and general purpose document generators (renderers) is mainly satisfactory [2, 6, 11]. However, modeling documents using general instead of domain-specific concepts makes their production more difficult. In practice, that drawback of general purpose languages is commonly addressed by programming many client applications. This simplifies the production process, but at the same time makes the development and maintenance of applications more difficult. A detailed comparative analysis of XSL-FO and DVDocLang, which is used by our DVDoc approach, may be found in [8].
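The five document dimensions and the notion of a self-identifiable document can be made concrete with a small data structure. The following Python sketch is only an illustration under stated assumptions: the class `Document`, its field names and the function `is_self_identifiable` are hypothetical and are not part of DVDocLang or of any tool mentioned in this paper.

```python
from dataclasses import dataclass, field

# The five document dimensions named in the text [1].
DIMENSIONS = ("content", "structure", "layout", "lifecycle", "meta_data")


@dataclass
class Document:
    """Hypothetical container; field names are illustrative, not DVDocLang."""
    content: dict = field(default_factory=dict)    # content unit id -> value
    structure: list = field(default_factory=list)  # ordered content unit ids
    layout: dict = field(default_factory=dict)     # content unit id -> layout rule
    lifecycle: list = field(default_factory=list)  # states reached so far
    meta_data: dict = field(default_factory=dict)  # dimension name -> description


def is_self_identifiable(doc: Document) -> bool:
    """Return True when the document carries meta-data about every dimension."""
    return all(dim in doc.meta_data for dim in DIMENSIONS)


# An "Offer" that describes all five dimensions, and a report that does not.
offer = Document(
    content={"title": "Offer"},
    structure=["title"],
    layout={"title": "center"},
    lifecycle=["initial"],
    meta_data={dim: "described" for dim in DIMENSIONS},
)
report = Document(content={"title": "Report"}, meta_data={"content": "described"})

print(is_self_identifiable(offer), is_self_identifiable(report))
```

In this reading, a document that lacks meta-data for even one dimension is treated as not self-identifiable, which matches the remark that such documents are less usable even with a good DMS.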
Fig. 1. DVDocIDE for a document template modeling and meta-modeling
Our approach to document production in directory publishing is illustrated through the example depicted in Fig. 1. Fig. 1 presents the main form of an integrated development environment (DVDocIDE) for document production, which is also shown in the video [20]. A graphical editor, named DVDocEditor, is displayed on the right hand side of Fig. 1. Using DVDocEditor, we specified an abstract template model of an "Offer" document. The "Offer" is a complex document that contains text, tables, pictures, footnotes etc. The XML describing the content of a typical instance is shown in the top-left part of the figure. The bottom-left part of the figure shows the DSL script in textual notation. From the "Offer" we will consider three content units, which are linked to specific, separate activities, marked A1-A3: A1) Verification of contractors' data; A2) Production of small ads, which are printed and published on the web; and A3) Calculation of a price according to business rules, with respect to the consumption of resources for the production, printing and publishing of ads.
To keep things simple, we render documents only after an activity is finished, and we consider only some states as relevant. Those states are: S1) Contract participants verified; S2) Small ads produced; and S3) Price calculated. In order to automate and simplify the production process, the following requirements, denoted R1-R7, are to be considered: R1) To have offers generated automatically, without any human intervention. That requires automated space calculation based on precise layout rules for small ads and for the "Offer" document; R2) To use the same content units in different document types. That requires establishing semantic relations between specific content units, whether in the same or in different document types. It also requires inheritance from a type or from an instance; R3) To provide mass production of typical small ads and single production of untypical ones using the same tools and languages; R4) To have very precise and fast document rendering, resulting in a good layout which is usable for printing; R5) To specify business, price and layout rules using software models in a way which is understandable to managers, producers, controllers and end users; R6) To be able to easily change production models and layout rules without having to stop the production itself. That requires control of versions and model variations, as well as different visual and textual representations of the same language concepts; and R7) To have a simple specification language which is still powerful and generic enough to be usable for the specification of small ads, different types of financial and advertising documents, and books that can have thousands of pages. This is only a selected list of requirements that has been verified in practice. The complete list is much longer. These requirements lead to the consideration of a practical use of DSM in a language construction process and in production management based on software models.
A typical lifecycle of the "Offer" document contains an initial state, a final state and three nontrivial states (S1, S2 and S3). The initial state may be verified by rendering an empty PDF document containing only the string Offer. In the final state a PDF is rendered in which the complete content is displayed, including the footnote with payment instructions. A considerable amount of time passes between the production of the initial and the final state of the document, during which business, layout and even specific content unit production models may change, as presented in more detail in Section 4. A complete formal specification and automation of the contractor verification (A1) can be achieved with the majority of UML and DSL tools. The production of small ads (A2) is the most complex activity, and in order to simplify it, we constructed two DSLs. The first one is for specifying static characteristics and small ad layouts (DVAdLang). The second one is for specifying dynamic characteristics of ads and documents (DVDocFlow). These two languages, DVDPriceLang for price rule specification (which automates A3) and some other languages have been integrated in one meta-model, named DVDocLang [17]. The advantages of using DVAdLang to describe static and DVDocFlow to describe dynamic document characteristics, in comparison to GPLs, are described in Sections 3 and 4. The price rule specification activity (A3) is an example that is helpful in understanding model variations. However, we do not present it in more detail in this paper, due to limited space. More information about model variations may be found in [4, 26]. We have used the DVDoc approach in order to fulfill requirements R1-R7. This approach contains the following steps: (i) construction of different DSLs for the automation of various activities; (ii) integration of separate DSLs at the meta-model level in a way which guarantees a high level of integration of production activities; and (iii) production control using software models and sets of applications which interpret meta-models and models, instead of generating application source code. The problem of having a high number of untypical document instances is treated as a problem of frequent model variations. It has been solved by introducing the concept of modifiers [15, 26], which combines language syntax features for the description of model variations [4] with multi-level modeling [14]. In the next two sections, we introduce languages for describing static and dynamic document dimensions.
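The lifecycle of the "Offer" described above can be read as a small state machine in which a document is rendered each time an activity finishes. The Python sketch below is a minimal illustration under stated assumptions: the class `OfferLifecycle` and its methods are hypothetical, rendering is replaced by recording the state name, and the strict A1, A2, A3 ordering is an assumption made for the example rather than a rule stated by the DVDoc approach.

```python
# Activities A1-A3 and the states S1-S3 they lead to, as listed in the text.
ACTIVITY_TO_STATE = {"A1": "S1", "A2": "S2", "A3": "S3"}

# Assumed linear order of states: initial, three nontrivial states, final.
LIFECYCLE = ["initial", "S1", "S2", "S3", "final"]


class OfferLifecycle:
    """Hypothetical lifecycle tracker for one "Offer" instance."""

    def __init__(self):
        self.state = "initial"
        # Stands in for the empty PDF that verifies the initial state.
        self.rendered = ["initial"]

    def _advance(self, target):
        position = LIFECYCLE.index(self.state)
        if position + 1 >= len(LIFECYCLE) or LIFECYCLE[position + 1] != target:
            raise ValueError(f"cannot move from {self.state} to {target}")
        self.state = target
        self.rendered.append(target)  # a real system would render a PDF here
        return target

    def finish_activity(self, activity):
        """Record that an activity finished and render the resulting state."""
        return self._advance(ACTIVITY_TO_STATE[activity])

    def close(self):
        """Move to the final state, in which the complete content is rendered."""
        return self._advance("final")


doc = OfferLifecycle()
for activity in ("A1", "A2", "A3"):
    doc.finish_activity(activity)
doc.close()
print(doc.rendered)
```

Keeping the list of rendered states next to the current state mirrors the idea that each nontrivial state is verified by a well-formatted document, not only the final one.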
3 Modeling Document Structure, Content and Layout
Due to the specific role of documents in the production of small ads (A2), we have decided to construct, integrate and use separate DSLs, as well as to develop specific tools for DSM. This role defines a document as a base product and also implies the requirement of incremental verification of activity progress using well-formatted content units. Based on our experience, UML tools and the Eclipse Modeling Framework (EMF) [5] have limited usability for document production, for the following reasons: (i) limited capacities for constructing visual representations of language concepts; (ii) complex integration of different meta-models (DSLs); (iii) lack of flexibility in model transformations to a suitable target language; and (iv) unsuitability for the specification of a larger number of model variations. For these reasons, and in order to construct and test DSLs quickly and to gain a fast insight into the possible level of automation of activities A1-A3, we have used the commercial tool MetaEdit+ Workbench (ME+) [10].
Fig. 2. Ad modeling using graphical DSL and DSL scripts
Small ad examples (Ad_1, Ad_2, Ad_3), representing possible outputs of the production of small ads (A2), are shown on the left hand side of Fig. 2. An abstract model of Ad_3, specified using a DSL named DVAdLang, is shown in the top-right part of the figure. Its textual representation, specified using a DSL script, is shown in the bottom-right part of the figure. Modeling the structure and content of ads requires the specification of various content unit types, e.g. key, name, phone number, etc. Those content unit types are to be connected using gray ci idated against the meta-model, which is done automatically by the DSM tool. DSM and UML tools do not support a layout specification in the way WYSIWYG editors do. Therefore, to achieve a layout rule specification we have developed DVDocEditor, whose main form is presented in Fig. 1. The advantages of modeling by means of DSL tools are: (i) each described instance of an ad is syntactically correct and contains meta-data; (ii) validation of the structure is done by the DSM tool promptly and automatically; (iii) the abstract model gives an approximate layout of a document; and (iv) from one abstract model it is possible to generate multiple DSL script variants using code generators. The fourth advantage also means that from one model it is possible to get all of the ads Ad_1-Ad_3, using parameterized script generators. Even though DSM tools are not used for spatial arrangement, topological relations between content units may be represented like any other relations, and the position of an element of a specific type can be expressed using languages for the specification of model constraints [10]. An ad model is not the ad as it is seen by the end user. However, an ad model is very useful to its creator. To make the ad model useful for the end user, precise and fast transformations of abstract into concrete models (model-to-text, M2T) and a fast and precise interpretation of the DSL script are required. For fast M2T transformations we have developed the navigation language DVDocRepLang [16, 18] and an interpreter, which is similar to MERL [10]. In order to further speed up ad modeling, a text editor is used instead of a graphical editor. In that case, abstract ad models do not have an explicitly specified visual representation. By analyzing the DSL script and bringing it into context with language concepts, a satisfactory default visual representation for usage in the DSM tool can be obtained. A textual editor is controlled using meta-models (DSL definitions), which are stored in the repository and on the template server [15]. Therefore, it can be said that the textual editor is a context- and document-state-sensitive interpreter of meta-models. The DSL script syntax that is used in textual editors is simple: each entry consists of a CuType, an optional CuMode and a CuValue. CuType is an identifier of the object type. CuMode is an optional identifier of a type variation or a layout variation, and CuValue is a value that can represent text, a path to an image, a logo identifier, etc. A simplified document production process using the DSM approach and tools contains the following steps: (i) DSL construction for specific document types (ads); (ii) modeling of documents using graphical or textual editors that interpret the DSL meta-model; and (iii) passing DSL scripts to the target interpreter, that is, to the document renderer. Finally, the ad is produced in the form of a picture, a PDF or an HTML document. In practice, the end user normally repeats the second and the third step every couple of seconds. In production, with up to 200 users and thousands of templates, one server with a 2 GHz dual-core processor is usually enough to support such a document production process. Compared to production of the same scope based on XSL-FO, the use of human and computer resources is reduced tenfold [23].
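The fourth advantage above, obtaining several script variants from one abstract model with a parameterized generator, can be sketched as follows. This Python example is a simplified illustration and makes several assumptions: the concrete line syntax `CuType[:CuMode] = CuValue` is invented for the example (the actual DVAdLang notation is defined in [17]), the content units and their values are made up, and the choice of which content units appear in Ad_1, Ad_2 and Ad_3 is hypothetical.

```python
from typing import NamedTuple, Optional


class ContentUnit(NamedTuple):
    cu_type: str             # identifier of the object type
    cu_mode: Optional[str]   # optional type or layout variation
    cu_value: str            # text, path to an image, logo identifier, ...


# One abstract ad model (all values are invented for illustration).
AD_MODEL = [
    ContentUnit("key", None, "Plumber"),
    ContentUnit("name", "bold", "Smith & Sons"),
    ContentUnit("phone", None, "555 0100"),
    ContentUnit("logo", "left", "logos/smith.png"),
]


def generate_script(model, include_types):
    """Emit one script line per selected content unit (hypothetical syntax)."""
    lines = []
    for cu in model:
        if cu.cu_type not in include_types:
            continue
        head = cu.cu_type if cu.cu_mode is None else f"{cu.cu_type}:{cu.cu_mode}"
        lines.append(f"{head} = {cu.cu_value}")
    return "\n".join(lines)


# Generator parameters: which content unit types each ad variant contains.
VARIANTS = {
    "Ad_1": {"name", "phone"},
    "Ad_2": {"key", "name", "phone"},
    "Ad_3": {"key", "name", "phone", "logo"},
}

scripts = {ad: generate_script(AD_MODEL, types) for ad, types in VARIANTS.items()}
for ad, script in scripts.items():
    print(f"--- {ad} ---\n{script}")
```

The model is written once and only the generator parameters change between variants, which is the property the text attributes to parameterized script generators.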
4 DSM Tools as Client Applications for a Business Process Modeling
For the purpose of modeling the dynamic characteristics of a document, we have developed a new DSL, named DVDocFlow [18, 19, 21, 22]. There are several reasons why we used MetaEdit+ in the construction of DVDocFlow. The main reason is its simple integration with DVDocIDE [20, 21, 22], our custom-built integrated development environment. For integration purposes, we have developed a plug-in using the C# programming language. A detailed description and the source code of the plug-in may be found in [21, 22]. DVDocFlow is similar to UML activity and state diagrams. It allows relations to be established between activities, states, content units and their visual representations (layout definitions). The second reason is the ability of MetaEdit+ to generate code from abstract models in any of the target languages that we constructed: DVDocLang, DVDocFlow and DVAdLang. In addition, MetaEdit+ generators can be used for the generation of resources, semantic actions towards framework services, and images. Images and other resources can be used as elements of client applications managing the production of documents. The third reason is that the graphical editor of a DSM tool can also be used as a client application managing the production of documents. Finally, using DSM tools, we have provided the construction and integration of languages for the specification of all five document dimensions.
Fig. 3. Dynamic system characteristics specified using the DVDocFlow language
In Fig. 3 we illustrate a simplified example of two model variations in a document production process, which are described using the DVDocFlow language. Rounded rectangles represent the activities A1-A3 that we introduced in Section 2. On the right hand side of the diagrams in Fig. 3, we present textual editors with DSL scripts for the structure, content and states of documents. The S* command defines the relevant state of a document after the end of an activity. The activity can be simple or composite. A place of the S* co ity-staterated in Fig. 1 and shown in [18, 19]. The textual editors are driven by production models and templates, and they enable the specification of instance content and structure. On the left hand side of Fig. 3 a model of the document production process is depicted. That model consists of three activities. A new activity, logo production (A2_L), is introduced. This activity can be automated by introducing a new DSL for drawing logos, which would support basic vector graphics concepts. A2_L may also be accomplished by using general purpose graphical tools. Among its most notable consequences, the new activity also introduces at least one new document state. In this example it is the logo produced state. In an approach which allows incremental specification and rendering of documents [15] in a model-driven way, every change in the production model is automatically propagated to the DSL script of the document. This DSL script is shown on the right hand side of Fig. 3. New documents may have an increment of the script: S2_L 7927, center, 15. The command represents a synchronization point between activities and reports using well-formatted PDF documents. The integration of DVDocIDE and DSL tools through the repository also allows a fast synchronization of specifications. Meta-data of each document instance, regardles lifecycle and the upholding of those modifications in the next document state. A video that illustrates the synchronization between tools and the usage of the plug-in is given in [21].
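The propagation of a production model change to document scripts can be sketched in a few lines. The Python example below is an illustration under stated assumptions: the class `ProductionModel` and its methods are hypothetical, scripts are represented as plain lists of lines, and the arguments of the state command (taken verbatim from the increment quoted above) are treated as an opaque string because their meaning is not specified here.

```python
class ProductionModel:
    """Hypothetical production model that keeps document scripts in sync."""

    def __init__(self):
        self.state_commands = []  # ordered state commands known to the model
        self.scripts = {}         # document id -> list of script lines

    def register_document(self, doc_id, script_lines):
        """Attach a document script so that later model changes reach it."""
        self.scripts[doc_id] = list(script_lines)

    def add_state_command(self, state, arguments=""):
        """Add a state command and append the increment to every script."""
        command = f"{state} {arguments}".strip()
        if command not in self.state_commands:
            self.state_commands.append(command)
        for lines in self.scripts.values():
            if command not in lines:
                lines.append(command)
        return command


model = ProductionModel()
model.register_document("offer-1", ["S1", "S2"])
model.register_document("offer-2", ["S1"])

# Introducing the logo production activity (A2_L) adds the logo produced state.
model.add_state_command("S2_L", "7927, center, 15")

for doc_id, lines in model.scripts.items():
    print(doc_id, lines)
```

Appending the command only when it is missing keeps the propagation idempotent, so repeating the same model change does not duplicate the increment in a script.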
5 Connecting DSM Modeling Tools With HMI Components
A part of the academic community and software industry have completely differ- DVDoc approach considers as an equal dimension to the other five. A part of the academic community argues that, with the introduction of the XSL-FO language, this problem has been solved, whereas people from the software industry argue that it is yet to be solved in the next ten years [4]. According to our experience, that part of the academic community overlooks the disadvantages of applying general solutions in highly specialized production environments. The industry expects that different console devices will need different languages for document layout specification, as well as different interpreters for those languages. Our approach to document layout specification, based on DSM, uses domain-specific rather than general purpose languages (GPLs). As most existing DSM tools are general purpose tools, we have developed a component that extends the area of their application to document layout specification. This component is named DVDocRepLangCtrl, and the report language is named DVDocRepLang [18]. To support a simple integration of our solution with the MetaEdit+ environment, our starting points were the functionalities and the syntax of the MERL language and code generator [10]. In Fig. 4 we present an integration model of DSM tools and HMI components using action reports on the generator level. Action reports are described in more detail in [25]. They allow navigation, setting and exchange of property values between arbitrary user components of the HMI and DSM tools. The main purpose is the extension of the functionalities of DSM tools for document layout modeling using domain-specific components for visualization.
Fig. 4. Integrating DSM tools and HMI components
Every concept of an arbitrary DSL that is constructed using DSM tools has at least one visual representation independent of user components. Just as semantics is assigned to DSL properties in order to model domain-specific problems, semantics is also assigned to layout definition properties.
A user HMI component is nothing but an additional visual representation of some DSL concepts, specified without using DSM tools. Therefore, properties need to be directly linked in order to equalize their semantics. This is presented as property linking. Through HMI action specification eports represent the composition of operations and definitions of actions performed during the exchange of property values between modeling tools and user components. The state of a modeling instance is changed using actions called through API functions, queries (commands) over the repository, or by passing functions and their parameters to the target interpreter. Connecting DSM tools with an arbitrary HMI or user control for document layout specification, using DVDocRepLangCtrl, may be done as follows: Using MERL, write action reports containing commands for setting properties. This is possible due to the similar syntax of DVDocRepLang and MERL. Those reports are then interpreted by an arbitrary client application that uses DVDocRepLangCtrl, by setting the properties of controls. Linking (mapping) of properties is provided on the generator level, in reports. A DSL script may be generated both on the DSM tool side and on the client application side. The script is then passed to the document generator, in this case to the DVDocLang interpreter (DVDocRender). This allows us to debug the documents. In other words, this allows us to execute abstract models representing template or document specifications. Our present solution is applicable only to HMI components written for the Microsoft .NET Framework.
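Property linking through an interpreted report can be illustrated with a small sketch. The Python example below is not MERL or DVDocRepLang, and it does not reproduce the interface of DVDocRepLangCtrl: the report is modeled as a plain list of `set` commands, and the names `Control`, `interpret_report`, the property names and the values are all hypothetical. It only shows the idea that the mapping between model properties and control properties lives in the report, while the client application merely interprets it.

```python
class Control:
    """Hypothetical stand-in for an HMI component with settable properties."""

    def __init__(self, name):
        self.name = name
        self.properties = {}


def interpret_report(report, model_object, controls):
    """Apply each ("set", control, control_property, model_property) command.

    The value is read from the model object and written to the linked
    control property; the list of applied assignments is returned.
    """
    applied = []
    for operation, control_name, control_property, model_property in report:
        if operation != "set":
            raise ValueError(f"unsupported operation: {operation}")
        value = model_object[model_property]
        controls[control_name].properties[control_property] = value
        applied.append((control_name, control_property, value))
    return applied


# A modeled content unit with layout properties (values are illustrative).
content_unit = {"cu_type": "logo", "align": "center", "height_mm": 15}

# The report links model properties to properties of a user control.
report = [
    ("set", "logo_box", "Alignment", "align"),
    ("set", "logo_box", "Height", "height_mm"),
]

controls = {"logo_box": Control("logo_box")}
applied = interpret_report(report, content_unit, controls)
print(controls["logo_box"].properties)
```

Because the links are data rather than code, changing which control property mirrors which model property requires editing only the report, which corresponds to providing the linking on the generator level.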
6 Related Work
In [12], the authors present a Model Driven Engineering framework for graphical components. They also present an editor that supports the specification of three document dimensions. The solutions presented in that paper are oriented towards code generation. Because of model variations, we focus instead on textual editors that interpret models. Our approach to the integration of DSLs describing various document dimensions is based on the model integration example shown in [3]. This way of integration, with full control over meta-models, provides a significant simplification of document production.
7 Conclusions
We developed domain-specific languages mainly to speed up the production of small ads, as well as to make the production easier. It turned out that further user categories, such as template designers, sellers, marketing personnel, programmers and application designers, were much more productive in doing their jobs when using domain-specific terminology and concepts. We conclude that our application of DSM in directory publishing (document engineering) was successful. DSM tools can also be extended in order to be used as client applications. The most important precondition is that DSM tools provide generators as logically independent units that use a model navigation language instead of fixed patterns.
8 References
1. Angelo Di Iorio, Luca Furini, Fabio Vitali, "Higher-level Layout through Topological Abstraction", Proceedings of the Eighth ACM Symposium on Document Engineering, Sao Paulo, Brazil, September 16-19, 2008.
2. Apache Software Foundation: "FOP", http://xmlgraphics.apache.org/fop/0.95/index.html
3. Atzmon Hen-Tov, David H. Lorenz, Assaf Pinhasi, Lior Schachter: "ModelTalk: When Everything Is a Domain-Specific Language". IEEE Software 26(4), 2009, pp. 39-46
4. Common Variability Language (CVL), CVL 1.2 User Guide, http://www.omgwiki.org/variability/doku.php
5. Eclipse Modeling Framework Project (EMF), http://www.eclipse.org/modeling/emf/
6. Extensible Stylesheet Language, Formatting Objects (XSL-FO), Reference Manual, http://www.w3.org/TR/xsl/
7. Ivan Luković, Verislav Djukić, "DVQL Language Specification", www.dvdocgen.com/Framework/DVQL.pdf, Accessed: March, 2012
8. Ivan Luković, Verislav Djukić, DVDocLang vs. XSL-FO, www.dvdocgen.com/Framework/DVDocLang_XSL-FO.pdf
9. Juha-Pekka Tolvanen, Steven Kelly, "Integrating Models with Domain-Specific Modeling Languages". Proceedings of 10th Workshop on DSM, Reno, Nevada, USA, Helsinki Business School, 2010.
10. MetaEdit+ Workbench, MetaCase, www.metacase.com
11. Nathan Hurst, Wilmot Li, Kim Marriott: "Review of Automatic Document Formatting", Proceedings of the 9th ACM Symposium on Document Engineering, Munich, Germany, September 16-18, 2009.
12. Olivier Beaudoux, Arnaud Blouin, Jean-Marc Jézéquel, "Using Model Driven Engineering technologies for building authoring applications", ACM Symposium on Document Engineering, Munich, Germany, 2010, pp. 279-282
13. Steven Kelly, Juha-Pekka Tolvanen, "Domain-Specific Modeling: Enabling Full Code Generation", ISBN: 978-0-470-03666-2, Wiley-IEEE Computer Society Press, March 2008
14. Thomas Kühne, Daniel Schreiber, "Can Programming be Liberated from the Two-Level Style? Multi-Level Programming with DeepJava", Proceedings of the 22nd Annual ACM SIGPLAN Conference on Object-Oriented Programming Systems and Applications, Montreal, Quebec, Canada, October 21-25, 2007.
15. -Specific Modeling in Document Engineering", Proceedings of the Federated Conference on Computer Science and Information Systems, Poland, 2011
16. Verislav D www.dvdocgen.com/Framework/DVDocRepLang.pdf, Accessed: March, 2012
17. Verislav Djukić, "DVDocLang Language Reference", www.dvdocgen.com/Framework/DVDocLang.pdf, Accessed: March, 2012
18. , DVDocRepLang demo, video, http://www.dvdocgen.com/Framework/ModelTransformation.wmv, Accessed: March, 2012
19. owLang demo, video, http://www.dvdocgen.com/Framework/DVDocFlow.wmv, Accessed: March, 2012
20. Verislav Djukić, Using DVDocIDE, video, http://www.dvdocgen.com/Framework/UsingDVDocIDE.wmv, Accessed: March, 2012
21. Verislav Djukić, Using MetaEdit+ from DVDocIDE, video, http://www.dvdocgen.com/Framework/DVDocIDEMetaEditCtrl.wmv, Accessed: March, 2012
22. Verislav Djukić, MetaEdit plugin for DVDocIDE, source code, http://www.dvdocgen.com/Framework/DVDocIDEMetaCase.rar, Accessed: March, 2012
23. Verislav Djukić, "DVDoc Renderer Benchmak", http://www.dvdocgen.com/Framework/DVDocRenderBench.pdf, Accessed: March, 2012
24. Verislav Djukić, DVDocGen Framework, application interface, http://www.dvdocgen.com/Framework/DVDocFramework.pdf, Accessed: March, 2012
25. Action Reports for Testing Meta-models, Models, Generators and Target Interpreter in Domain-Specific Modeling", Proceedings of the Federated Conference on Computer Science and Information Systems, Poland, 2012, http://www.dvdocgen.com/Framework/ActionReports.pdf, Accessed: Sep., 2012
26. Untypical Document Instances using Domain-Specific Languages", Internal Report, http://www.dvdocgen.com/Framework/UntypicalDocInstances.pdf