Secure adaptation of software services

Thesis written by José Antonio Martín Baena
and supervised by Prof. Ernesto Pimentel Sánchez

June 23, 2012

Dr. Ernesto Pimentel Sánchez, Full Professor in the Area of Languages and Computer Systems at the E.T.S. de Ingeniería Informática of the Universidad de Málaga, certifies that Mr. José Antonio Martín Baena, Computer Engineer, has carried out in the Departamento de Lenguajes y Ciencias de la Computación of the Universidad de Málaga, under my supervision, the research work corresponding to his doctoral thesis entitled:

Secure adaptation of software services

Having reviewed this work, I consider that it may be presented to the committee that is to judge it, and I authorise the submission of this doctoral thesis at the Universidad de Málaga.

Málaga, January 2012

Signed: Ernesto Pimentel Sánchez, Full Professor, Area of Languages and Computer Systems

Acknowledgements

I would like to thank...

First of all, you, the reader, for trying to bear with me. The best of luck. My advisor, for having borne with me. My temporary hosts, for having borne with me, a little.

Also all the reviewers who rejected my articles; they helped me to work harder and better, sometimes randomly. The reviewers who accepted my articles; thanks to them I never threw in the towel, often randomly, again. And the reviewers who did not read my articles, for they taught me that, regardless of my efforts, my articles are not a pleasure to read.

A special thanks to Gwen; I learnt from him how to write well and how to work hard on an article. Double thanks to Ernesto; I am still trying to learn how to multi-task like him and to manage all the stress from so much writing and hard work. And another thank you to Antonio, since there is life outside of work, and you can enjoy yourself and socialise at work.

A warm hug to all my lab mates, wherever they may be now; they really are the essence and spice of research. To Mayca, for helping me put things into perspective and recognise what really matters. To my family, for they have learnt with me what it is to be a PhD and they were always there for me.

I learned from all of you (well, maybe not from the reader, yet) and from whoever else I may have forgotten to mention. I have learnt how to work, to research, to write, to present, to travel, to laugh, to get stressed, to speak and to eat. You shaped me.

Thank you.

Contents

1 Introduction
  1.1 A bit of history
  1.2 A bumpy road to success
  1.3 The wild side
  1.4 Overview
  1.5 References

I The essence

2 Main concepts
  2.1 Adaptation contracts
  2.2 Intensional semantics of adaptation contracts
    2.2.1 Eager choices
    2.2.2 Data dependencies
  2.3 Behavioural adaptors
    2.3.1 Synchronisation between services and adaptors
    2.3.2 Deadlocks and livelocks
    2.3.3 Adaptor synthesis

3 State of the art
  3.1 Incompatibility levels
  3.2 Restrictive approaches
  3.3 Generative approaches
  3.4 Security adaptation by contracts
  3.5 Comparison of adaptation approaches

II Behavioural adaptation

4 Dynamic adaptation
  4.1 A WSAN example: SPIN and TinyDiffusion
  4.2 Learning adaptors
  4.3 Learning policies
    4.3.1 Bounded learning
    4.3.2 Prefix-driven absorption
    4.3.3 Reset on empty adaptors
  4.4 Evaluation and tool support: ITACA
  4.5 Zero-knowledge adaptation

5 Web service discovery
  5.1 Feature-based behavioural adaptation
  5.2 Abstracting adaptable behaviour
    5.2.1 Adaptation-dependency trees
    5.2.2 Feature-dependency rules
  5.3 Search tree of adaptable Web services
  5.4 Conclusion and future work

6 Automatic generation of adaptation contracts
  6.1 Motivating example in abstract BPEL
    6.1.1 From abstract BPEL to behavioural interfaces
    6.1.2 Case study: a file exchange system
    6.1.3 An alternative notation for adaptation contracts
  6.2 Generating adaptation contracts
    6.2.1 Graph search with A*
    6.2.2 The expert system
    6.2.3 Prototype tool: Dinapter
  6.3 Conclusion

III QoS adaptation: security

7 Security adaptation
  7.1 Motivational example
    7.1.1 Methodology for security adaptation
  7.2 Web services with security QoS
    7.2.1 WS-Security
    7.2.2 Secure service behaviour
    7.2.3 Crypto-CCS
  7.3 Security adaptation contracts
    7.3.1 Security contract terms
    7.3.2 Validation of contract terms
    7.3.3 SAC
  7.4 Summary

8 Synthesis of secure security adaptors
  8.1 Overview of the synthesis of security adaptors
    8.1.1 Functionally-correct adaptors
    8.1.2 Secure adaptors
    8.1.3 Motivational example
  8.2 Security adaptors
  8.3 From security adaptors to Crypto-CCS
  8.4 Synthesis of functionally-correct security adaptors
    8.4.1 Data dependencies
    8.4.2 Control dependencies
  8.5 Verification
    8.5.1 The attacker
    8.5.2 Verifying security adaptors
    8.5.3 A language to describe protocol properties
    8.5.4 The most general attacker
  8.6 Securing adaptors through refinement
    8.6.1 Refinement
    8.6.2 Synthesis and refinement overview
  8.7 Conclusion and future work

IV Final remarks

9 Future work
  9.1 Sound generation of adaptation contracts
  9.2 Encoding adaptors into executable languages
  9.3 Application and evaluation over a real-world project

10 Conclusion
  10.1 Adaptation drawbacks
  10.2 Proposed solutions
  10.3 Advantages
  10.4 Summary
  10.5 Epilogue

V Appendix

A Proofs

B Tool support
  B.1 STS-XML
    B.1.1 Service interfaces
    B.1.2 Adaptation contracts
  B.2 Dinapter
    B.2.1 Requirements and installation
    B.2.2 Within ACIDE
    B.2.3 Standalone
  B.3 Dynamic adaptors

C Secure synthesis using ITACA
  C.1 Convert SACs into equivalent adaptation contracts
  C.2 Recovering security adaptors

D Adaptation of software services
  D.1 Thesis summary
  D.2 Adaptation contracts
  D.3 Intensional semantics of adaptation contracts
    D.3.1 Eager choices
    D.3.2 Data dependencies
  D.4 Behavioural adaptors
    D.4.1 Synchronisation between services and adaptors
    D.4.2 Deadlocks
    D.4.3 Adaptor synthesis
  D.5 Conclusions
    D.5.1 Epilogue

Official acknowledgements

Bibliography

Glossary

Acronyms

List of symbols

If you've heard this story before, don't stop me, because I'd like to hear it again.
G. Marx

1 Introduction

1.1 A bit of history

Web services are everywhere these days. The advent of early Web applications such as Web mail taught users that the Internet could be interactive. In addition, services that were traditionally provided in the physical world are being absorbed by the Internet every day. It started with email, followed by digital music (e.g., Napster), retail stores (Amazon), auctions (eBay), video (YouTube), news (Reddit), telephony (Skype), weather (weather.com), maps (Google Maps) and a myriad more.

Then came the Web 2.0, and users could manipulate the Web and easily generate new content. Wikipedia pages were created, edited and moderated; news items were ranked and highlighted based on the preferences of the users; blogs and video blogs were being created every minute on Blogger and YouTube. Although this content creation was easy and straightforward, it required a certain amount of time and effort, and hence not everyone wanted to participate in the Web 2.0 as a content producer.

This changed with the appearance of social networks. The Internet was now customised to your neighbourhood, favourite groups and friends. Everyone was interested in consuming personal information coming from friends and felt rewarded by posting their own thoughts. Creating information had never been easier.

Then came the hype of the smartphones. Everyone craved the experience of their social networks, their email and their office at their fingertips 24 hours a day. You are always connected, always accessible, with a huge amount of personal information in your pocket. You could now post on Twitter what you were buying, share where you were shopping with Latitude and post how rude the shop assistant was using Foursquare. The Internet was becoming truly pervasive.

With so many people relying on smartphones, tablets and other handheld devices, a natural step was to shift the power to the clouds. Data came first. Music, pictures, books and general files needed to be stored in iTunes, Flickr and Amazon, where they are automatically synchronised between personal devices. Documents are not only stored but also collaboratively edited and commented on online with Google Docs. Also, for smaller projects, traditional task-specific tools such as version control systems (e.g., Subversion or Git) are being replaced by less intrusive, general-purpose applications such as Dropbox. And, for the most demanding enterprise-level needs, several cloud storage solutions are offered, such as Amazon S3.

Now that all our data were in the clouds, it was time to take advantage of the huge computational horsepower offered by the server farms and data centres developed by the main Internet service providers. Google, Amazon and Microsoft used all the expertise they had gained from managing their world-wide services to offer both the services and the infrastructure that powered them. Exploiting the same investment, they are now both service and infrastructure providers. Google App Engine, Amazon EC2 and Windows Azure were born. This was good news for Small and Medium Enterprises (SMEs), which could have a competitive infrastructure at a low cost. This was also good news for users, because they could now do things on their smartphones, tablets and netbooks that could not be done before. New services such as OnLive offer cross-device gaming where the AI computing and expensive 3D rendering are done in the clouds and the result is streamed as video to the device with state-of-the-art low-latency, bandwidth-efficient protocols. You can play Assassin's Creed on your smartphone, provided you can bear with the controls. Of course, this was also a breakthrough for the cloud providers themselves, because they make double the profit from the same infrastructure.

1.2 A bumpy road to success

We do not know what new wonders this digital race holds for us, but we can participate in its progress. What do all these services and platforms have in common? Some of them were not considered useful when they were initially launched. For instance, Twitter struggled (and still does) to gain world-wide popularity, but social unrest and revolutions such as those in Tunisia or Egypt were a baptism of fire on its way to global recognition [And11]. Some of the most successful Web services (WSs) were focussed on sales, others on social information, others on advertising, and some others on providing added value to support complementary services. Some of the biggest online services were built by huge companies and nurtured and polished over the years; others were crazy ideas embodied in startups with billions in revenue. However, they all share the following characteristics:

Ease of use. Whether they are a native application for a handheld device, an online site, a Web service or an Application Programming Interface (API), all these services present a low learning curve. Their target audience may vary (some are oriented to developers, others to enterprises, others to the general public) but, regardless of their target, being transparent, non-intrusive, automatic or assisted, and low-frustration are key elements of a successful service.

They usually exploit unforeseen synergies to provide added value. Search engines with relevant advertising, retailing with low distribution and shipping costs, and social information for CRM are good examples of this concept. For instance, Google learned very well how to monetise its search engine with AdSense, Apple made a ground-shaking entrance into the music industry with its iPod and iTunes, Amazon did the same to book publishers with its Kindle, and Valve became an important addition to the game distribution industry with Steam.

Interoperability. Just as hyperlinks are fundamental for the Web, these online services thrive when put together. Social networks plus picture storage are a great hit. Being able to call a telephone number right from your email client increases the value of both services. You can discover a new trendy restaurant, obtain directions to it and finally rate it with three clicks. All these examples interconnect different services, homogenising, filtering and highlighting the most relevant information, and they allow you to react to this information to buy, book, watch, read or visit with just one click. Services are hybridised in unexpected and sometimes ephemeral ways so that you make the most of them.
In this manner, services share the scarce attention of the user, they promote valuable clicks from and to partner services and, together, they provide an integrated experience to the user that no single service can offer. This thesis focuses on interoperability, without forgetting about the other two features.

Some commonly known examples of interoperability are feed aggregators, geo-localised information services and the new layer of social interaction that is being given to almost every service nowadays. Feed aggregators such as Google Reader, although somewhat relegated to second place due to their monetisation problems, were quite popular with the rise of blogs. Their goal was to put together information from different sources under a single, homogenised interface. Moreover, feed aggregators took advantage of this higher level of control to provide ranking, recommendation and sharing services. They exemplify the advantages of data interoperability and the new functionalities that can be envisioned with it. The main technical concepts behind aggregators are Atom and RSS. These are standard formats in which the information is published so that it can be easily merged and sorted with other sources. However, standardisation is the result of a long maturation process which is not always suited to trying out innovative services and new business models.

Another example of service interoperability is the new layer of social information embedded in every other service: shopping with recommendations from family, email that knows the latest posts from your friends, and dynamic collaboration on a document with your working group. In this way, the other services are more relevant, tailored to your interests and directly recommended to you by people you know and trust. It is worth highlighting three cases. On the one hand, Twitter is the extreme of public availability and third-party interoperability. In fact, the success of Twitter comes from how well integrated it is, first, with mobile phones (by SMS) and then with general Web pages and other social networks. A tweet takes less than a minute to write from any device and it is instantly published in all your main communication channels. On the other hand, Facebook represents isolation: a social network that attracts Web sites, games, pictures and discussions to it but rarely lets the information out. The information in Facebook can be locked in but, nevertheless, Facebook also fosters interoperability in the sense that you can "like" virtually any site, directly from the site, and it provides comprehensive APIs to the applications and games that want to run within Facebook. We will devote a whole section to the third case.

1.3 The wild side

Social networks such as Twitter or Facebook represent interoperability centred around a major service. These usually offer APIs based on JavaScript (with JSON), RESTful services (where operations are mapped to HTTP requests to specific URLs) or traditional WS interfaces and protocols based on WSDL, SOAP and XML-RPC. In this way, consolidated and popular services provide an interface to attract visitors and attention from other sites.

Mashups represent the other side of the coin. Mashups exemplify how heterogeneous services from different providers can be put together with little effort, sometimes without an official API or standard, and achieve unexpected results. Examples include a map which shows the zones most affected by a tsunami or where to buy the cheapest gas, or a Web page which lets you listen on Last.fm to the songs recommended by your friends in Google+ and gives you the option to buy the album from Jamendo. Mashups focus on quick results and therefore they take advantage of the interfaces provided by major services, but they also resort to more fragile techniques such as Web scraping, i.e., extracting information from a Web site by simulating a browser. Nonetheless, service compositions are limited by incompatible interfaces and data formats, the services to compose might use different authentication schemes (and security policies, in general) and, normally, these compositions access stateless operations instead of being involved in persistent sessions.

Example 1.3.1 Let us imagine that we want to create a new service that shows our tweets on a map side by side with relevant pictures. For instance, you might try to use the REST (plus JSON) interface of Twitter and obtain tweets using an HTTP request:

    GET https://api.twitter.com/1/statuses/public_timeline.json

Then you want to use the JavaScript API of Google Maps to show the tweets on a map, so you have to call:

    map.addOverlay(new GMarker(new GLatLng(-33.9417, 150.9473)));

Now, to top it off with some relevant photos, you need to send the following SOAP message to Flickr:

    flickr.collections.getTree 12345 12345 555

By now you have noticed that you have to deal with the different APIs and data formats of the different services. However, when you dig into your private tweets and your own personal pictures in Flickr, you discover that the services also use different formats to authenticate you as a valid user, i.e., they have different security schemes. In Flickr you have to use your developer key to obtain a Frob, i.e., a token that identifies the login session. Then you need to generate another Token using that Frob, with both communications transported through an HTTPS channel. With Twitter, instead, you might want to use the Open Authorization (OAuth) protocol to identify yourself. Thus you need to deal with different security schemes covering encryption, integrity, authentication and so on.

Once you have done all of that, you think that the worst is over and you only have to drop all of it into an HTML page with some PHP and JavaScript to glue it all together. But then you realise that the different authentication operations need to be called in a specific order (first get the Frob, then the Token), that you need to obtain the list of collections from Flickr before accessing their pictures, and so on. You have to comply with the protocol, i.e., the behaviour of the different services, and adapt them to the behaviour you desire from the composite service you are developing. You are designing a central service which is in charge of coordinating the rest of the services so that the client only needs to connect to a single point, i.e., the orchestrator service.

After some struggle with the documentation and some trial and error, you manage to design an orchestrator able to use the different APIs, to successfully connect the different security schemes and to call every operation at its appropriate time. Now you are content with the current performance but want to include more functionality. Specifically, you want to show weather icons on the map beside each of the tweets. You have to discover a new service to obtain weather forecast information and rework all the interface incompatibilities, security issues and behaviour accordingly. But what if any of the orchestrated services goes offline? Maybe you want to be prepared to replace Flickr with Picasa dynamically just in case Flickr becomes unavailable for a while. And what if you want to discover and replace services on the fly? You would need dynamic reconfiguration.

This thesis tries to answer these questions. We go through the whole service-oriented development process, from service discovery to dynamic reconfiguration, automating the whole adaptation process. Incompatibilities in the operations, security and behaviour will be sorted out and an orchestrator, called an adaptor, will be automatically generated.

1.4 Overview

The key concept of this thesis is adaptation. The adaptor needs to adapt the operations of the services, i.e., the format in which the operations are called, their names, arguments and data encoding. We will call this the signature of the service.

Example 1.4.1 We receive the following extract of a JSON message from Twitter

    {
      "coordinates": [37.780, -122.396],
      "created_at": "Sat Sep 10 22:23:38 +0000 2011",
      "id_str": "112652479837110273",
      ...

whereas Google Maps expects the following JavaScript call

    map.addOverlay(new GMarker(new GLatLng(37.780, -122.396)));
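The signature mismatch of Example 1.4.1 can be illustrated with a minimal Python sketch. It is not the adaptation technique developed in this thesis: the helper name `tweet_to_marker_call` is hypothetical, the JSON extract is closed with a brace (and the elided fields dropped) only so that it parses, and the output is simply the text of the JavaScript call that Google Maps expects.

```python
import json

# Extract of the Twitter message from Example 1.4.1.
# Assumption: the elided fields are dropped and the object is closed so it parses.
TWEET_JSON = """
{
  "coordinates": [37.780, -122.396],
  "created_at": "Sat Sep 10 22:23:38 +0000 2011",
  "id_str": "112652479837110273"
}
"""


def tweet_to_marker_call(raw: str) -> str:
    """Hypothetical helper: rewrite a tweet's coordinates as a Google Maps call."""
    tweet = json.loads(raw)
    lat, lng = tweet["coordinates"]
    # Three decimals, matching the precision used in the example.
    return f"map.addOverlay(new GMarker(new GLatLng({lat:.3f}, {lng:.3f})));"


print(tweet_to_marker_call(TWEET_JSON))
```

Only names and data encoding are adapted here; the order in which operations may be called is a separate concern.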

On top of signature incompatibilities, the adaptor must call the service operations when those services are prepared to respond. In addition, the adaptor has to attend to the requests coming from the services. The client of the adaptor is considered another service in the orchestration. The order in which operations are requested/invoked or offered/received may include loops (e.g., while or for), internal choices (e.g., if) and external choices (e.g., pick). This is called the behaviour of the service.

Example 1.4.2 We get the tweets in a single RESTful request but, in order to find relevant pictures in Flickr, we need to call the sequence of operations shown in Fig. 1.1.

[Figure 1.1: Mashup behaviour to orchestrate Flickr and Twitter. Nodes: Get tweets (public_timeline.json); Get Frob (flickr.auth.getFrob); Get Token (flickr.auth.getToken); Search pics (flickr.photos.search); Show pic (flickr.photos.getInfo).]

You cannot call getToken before getFrob, and search needs to be invoked for every tweet in public_timeline.json.

In addition, the requirements of the adaptation and the services themselves may require and offer different Quality of Service (QoS) parameters. For instance, a service may promise to answer every request in less than a millisecond and another service may require encrypted information.

Example 1.4.3 The public timeline requested from Twitter comes through an HTTP channel whereas the communication with Flickr is encrypted.

[Figure 1.2: Overview of the adaptor-centred development process introduced in this thesis. Labels in the figure: Discovery, Adaptation, Orchestration design, Contract design, Adaptor synthesis, Adaptor refinement, Open system, Dynamic adaptation design, Deployment, Monitoring, Reconfiguration. Annotations: "Find services that can be adapted to fulfill the requirements"; "Ensure that the adapted system behaves properly".]

In this document we will cover the adaptation of signature, behaviour and security QoS. Figure 1.2 shows the cycle of the adaptation-centred development process. It shows the different ways in which adaptation can be applied over the lifespan of a system.

If we are building a new system from scratch (for us, a system is an orchestration), the first step is service discovery. Chapter 5 is devoted to this step. In general, service discovery is aimed at finding services based on a query which represents the requirements over their functionality, signature, QoS, provider and associated cost. Our contribution to service discovery is centred on three aspects: considering service behaviour (not only signature), filtering the services by whether they can be adapted to the query or not (they do not need to perfectly match the query, just to be adaptable to it), and scalability. Other proposals for service discovery that consider behaviour usually make one-to-one comparisons between the request and the services. These comparisons are both numerous (one per service in the registry) and computationally expensive (they require the exploration of the service and query behaviour). The solution presented in this thesis promotes the use of search trees based on service adaptability, and therefore it requires fewer and less expensive comparisons.

Once we have found a set of suitable services for our system under design, we proceed to devise how to orchestrate all those services in a meaningful way. In order to design the orchestration, we are going to adapt the services to one another. The adaptation process proposed in this thesis is automatic and it requires as inputs the services to adapt and an adaptation contract (or contract, for short), which specifies what we want to achieve from the adaptation. This contract specifies in an abstract way a mapping between the operations of the services (e.g., when a service calls "buy" it is referring to the operation "purchase" in another service) and some high-level restrictions over the adaptation (e.g., the system must make at least one purchase and the credit card number must remain secret to third parties). Although adaptation contracts are usually written by hand, in Chapter 6 we propose a technique to automatically generate them by matching the data exchanged between the services. Then, an adaptor able to orchestrate the services is automatically synthesised to be compliant with the mapping of the adaptation contract, and it is later analysed and refined to meet the high-level restrictions (Chapter 8). We will discuss the main concepts behind adaptation in Chapter 2. In this work we always assume that all the needed services are available beforehand; otherwise, part of the system would be left open (without connected services) and we would have to go back to discover additional services.

We are now in a position to deploy the synthesised adaptor. Since the adaptor is an orchestrator which mediates between all the other services, it is especially well placed to monitor the system. Every message is intercepted, manipulated and forwarded by the adaptor, so it can verify that the messages are received at their appropriate time and with correct content. This is particularly important when we are adapting services with different security policies. In this case, security adaptors (Chapter 7) need to verify the authenticity and integrity of certain messages and ensure that sensitive information remains secret. Security adaptors are declaratively specified by security adaptation contracts (SACs).

In this document we will present two different contributions in the monitoring stage: security adaptors that close further communications when they detect a possible attack, and learning adaptors (Chapter 4). The latter are dynamic adaptors which are not synthesised. Instead, they are directly deployed and they learn, at run time, how to adapt the services based on the monitored successful and failed interactions. Another advantage of learning adaptors is that they are able to react to drastic changes in the behaviour of the services, or even to services which appear in (or disappear from) the system. This leads to the reconfiguration of the system.
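The idea of a contract as a mapping between operation names, applied by an adaptor that sits between the services and can therefore monitor them, can be illustrated with a small Python sketch. It is hand-written, whereas the adaptors of this thesis are synthesised automatically from the contract, and it ignores the high-level restrictions. The names `CONTRACT`, `Shop` and `Adaptor` are hypothetical; only the "buy"/"purchase" pair comes from the text.

```python
# Hypothetical contract: operation requested by the client -> operation
# offered by the service (the "buy"/"purchase" pair is taken from the text).
CONTRACT = {"buy": "purchase"}


class Shop:
    """Stand-in service that only understands 'purchase'."""

    def purchase(self, item):
        return f"purchased {item}"


class Adaptor:
    """Forwards each request under the name the contract maps it to and logs it."""

    def __init__(self, service, contract):
        self.service = service
        self.contract = contract
        self.log = []  # monitored trace of (requested, forwarded) operation names

    def call(self, operation, *args):
        if operation not in self.contract:
            raise ValueError(f"operation not covered by the contract: {operation}")
        target = self.contract[operation]
        self.log.append((operation, target))
        return getattr(self.service, target)(*args)


adaptor = Adaptor(Shop(), CONTRACT)
print(adaptor.call("buy", "book"))
print(adaptor.log)
```

Because every request passes through `call`, the log is a simple stand-in for the monitoring role described above.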

The main contributions of this thesis concerning the topics above are concisely listed in Table 1.1.

Table 1.1: Research highlights

Section 2.2    It formally characterises the intensional semantics of adaptation contracts.
Section 4.2    It describes how to dynamically synthesise a correct adaptor compliant to the intensional semantics by learning from interaction failures.
Section 5.3    It defines a scalable search tree able to discover adaptable services, i.e., services which can be adapted to fulfil the discovery query.
Section 6.2    Once we have several services which can be adapted to cooperate, this section defines an algorithm to automatically generate adaptation contracts to support such adaptation.
Section 7.3    Secure services require security adaptation; this section describes security adaptation contracts and how to validate them.
Section 8.6    Synthesis of secure adaptors through verification and refinement based on security adaptation contracts with regard to a given secrecy property.

1.5 References

The main published contributions that justify this thesis are the following:

[MP11a], published in the Journal of Logic and Algebraic Programming, was the first major publication indexed in the Journal Citation Reports (JCR). This work presented SACs for the first time (Chapter 7), analysed their implications in traditional behavioural adaptation and sketched the preliminary ideas to synthesise security adaptors which prevent attacks on the secrecy of sensitive data.

[MMP12] was published in the same journal and it developed the foundations presented in the previous paper. Concretely, it elaborated on the synthesis process both directly (taking security into account) and indirectly (converting the problem into deadlock-equivalent models fit for traditional synthesis techniques). However, the main contribution of this work was the integration of CryptoCCS, which allowed the sound verification and refinement of the synthesised adaptor so as to ensure the secrecy of the information exchanged. This work mainly corresponds to Chapter 8.

[MP10b] was presented at the European Conference on Web Services in 2010, where it was accepted with an acceptance rate of 19%. This work introduced the scalable discovery of adaptable services presented in Chapter 5.

Other published results:

[MP11b]. This is a presentation of an already published high-quality paper [MP11a] in PROLE'11. This presentation unified SACs and the synthesis of security adaptors.

[CMS+09a, CMS+09b, CMS+10b, CMS+10a]. These papers present the Integrated Toolbox for Automatic Composition and Adaptation (ITACA [CMS+09a]). The original contribution of this thesis to ITACA was Dinapter, the automatic generation of adaptation contracts (Chapter 6). After these results were published, learning adaptors were also integrated within ITACA and the synthesis of secure adaptors was also built on top of this toolbox. It is worth highlighting [CMS+09a], which has 34 citations as of January 2012 (Google Scholar). Additionally, [CMS+09b] included a work-in-progress that resulted in the definition of SACs, finally published in [MP11a].

[MP09a, MP09b]. The former presents the formal foundation of Dinapter and the latter is a tool paper which describes its interface and the details of its functionality. In part due to the presentation of ITACA in ICSE'09 [CMS+09a], Dinapter obtained some visibility which granted [MP09a] 27 citations as of January 2012 (Google Scholar).

[MBP11]. This paper contains the description of learning adaptors (Chapter 4).

[MP10a, MP10c]. The former is a work-in-progress which resulted in the latter publication. These papers developed the feature-based discovery of adaptable services presented in Chapter 5.

[MP10d]. This work is an extended abstract which later evolved into the final paper on the synthesis of secure adaptors [MMP12].

[CNMF11]. An extended abstract which addresses contract-oriented discovery in Sensor Web.

Part I

The essence

Intelligence is the ability to adapt to change.
S. Hawking

In the struggle for survival, the fittest win out at the expense of their rivals because they succeed in adapting themselves best to their environment.
C. Darwin

It is not clear that intelligence has any long-term survival value.
S. Hawking

2

Main concepts

Service Oriented Architectures (SOAs) are composed of interoperable WSs. However, WSs are not always compatible, a fact which hinders both their reusability and the development and maintenance of SOA systems. This is particularly important in stateful services with complex behaviour, such as those described in Business Process Execution Language (BPEL [A+05]) or Windows Workflow (WF [Scr07]), where any mismatch in the sequence of the messages exchanged may lead the composition to a deadlock situation. For instance, a missing operation in a service, a mismatch in the operation name or arguments, or an unexpected sequence of messages prevents the correct termination of the services involved.

Software adaptation [YS97] is a sound solution which enables WSs to interoperate despite their initial incompatibilities. The deployment of suitable "adaptors-in-the-middle" has proven to be an effective way to overcome signature and behaviour incompatibilities between services [BP06, CPS08]. Intuitively speaking, such adaptors intercept, collect, and modify the messages exchanged by two parties so as to overcome their incompatibilities.

Adaptor design is a difficult task where the developer must take into account the behaviour of all the services and their possible interactions. In this process, subtle details may be missed, resulting in an erroneous adaptation. We propose adaptation contracts as an abstract specification of the adaptation. These contracts will allow us to state clearly and concisely how to solve the incompatibilities between the services.

The work in this thesis has been implemented as part of ITACA, which provides a set of tools that enable the workflow for adaptor synthesis shown in Fig. 2.1. This figure shows how, once the services to be adapted are known, adaptation contracts are generated either in an assisted fashion using a CASE tool called ACIDE, or automatically using Dinapter (Chapter 6).
Only then is it possible to synthesise the adaptor taking as input the generated contract and the services. Let us proceed to formally define these contracts and their implications over adaptors.

[Figure 2.1: Adaptation stage: contract design and adaptor synthesis process. Diagram labels: Designer; Service Interfaces (Abstract BPEL+WSDL); Service Interfaces (Abstract WF+WSDL); Service Protocol+Signature Extraction (WSDL2SIG+ABPEL2STS/AWF2STS); Service Interface Models (Signature + Protocol STS); Interactive Contract Specification + Simulation and Verification (ACIDE); Similarity Computation (SIM); Automatic Contract Specification (DINAPTER); Adaptation Contract; Adaptor Protocol / Service Wrapper Protocols Generation ((D)COMPOSITOR); Adaptor Protocol; Adaptor Protocol Filtering + Service Deployment (STS2BPEL); Deployed System (BPEL Adaptor + Original Service Implementations).]

2.1 Adaptation contracts

An adaptor is specified by an adaptation contract which defines a set of vectors between actions, called adaptation vectors (or vectors, for short), and (optionally) some constraints on the use of said vectors.

Definition 2.1.1 (Adaptation contract) An adaptation contract C is a Finite State Machine (FSM) $\langle \Sigma^c, S^c, s^c_0, F^c, T^c \rangle$ where $\Sigma^c$ is a set of vectors, $S^c$ is a set of states, $s^c_0 \in S^c$ is the initial state, $F^c \subseteq S^c$ is the set of final states, and $T^c \subseteq (S^c \times \Sigma^c \times S^c)$ is a set of labelled transitions. Vectors in $\Sigma^c$ have the form a ♦ b where:

• $a, b \in \Sigma^a$ are input or output communication actions,
• one side of the vector can be empty (viz., a ♦ or ♦ b),
• if both a and b are present, then one is an input action and the other is an output action.
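Definition 2.1.1 can be illustrated with a small data structure. The following Python sketch is only one possible encoding and is not the representation used by ITACA; the class names `Action`, `Vector` and `Contract`, the method `is_well_formed`, and the requirement that a vector has at least one non-empty side are assumptions made here for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Action:
    """A communication action: name, direction and symbolic arguments."""
    name: str
    is_output: bool              # True for '!', False for '?'
    args: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Vector:
    """An adaptation vector 'left <> right'; None models an empty side."""
    left: Optional[Action] = None
    right: Optional[Action] = None

    def is_well_formed(self) -> bool:
        # Assumption of this sketch: a vector with two empty sides is rejected.
        if self.left is None and self.right is None:
            return False
        # If both sides are present, one is an input and the other an output.
        if self.left is not None and self.right is not None:
            return self.left.is_output != self.right.is_output
        return True


@dataclass(frozen=True)
class Contract:
    """FSM <vectors, states, initial, finals, transitions> over vectors."""
    vectors: FrozenSet[Vector]
    states: FrozenSet[str]
    initial: str
    finals: FrozenSet[str]
    transitions: FrozenSet[Tuple[str, Vector, str]]

    def is_well_formed(self) -> bool:
        return (
            self.initial in self.states
            and self.finals <= self.states
            and all(v.is_well_formed() for v in self.vectors)
            and all(
                s in self.states and v in self.vectors and t in self.states
                for (s, v, t) in self.transitions
            )
        )


# A hypothetical one-vector contract: a!X <> b?X, looping on state s0.
vector = Vector(Action("a", True, ("X",)), Action("b", False, ("X",)))
contract = Contract(
    vectors=frozenset({vector}),
    states=frozenset({"s0"}),
    initial="s0",
    finals=frozenset({"s0"}),
    transitions=frozenset({("s0", vector, "s0")}),
)
```

Frozen dataclasses are used so that vectors are hashable and can serve directly as transition labels.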

Actions contain an operation name or channel followed by a question mark for input actions or an exclamation mark for output actions and a (possibly empty) list of arguments.

Adaptors act as mediators between two sides. Any communication between those sides must be intercepted and handled by the adaptor. An action on one side of a vector denotes the complementary action that the adaptor will perform towards the service on that side. For instance, a vector such as a!args ♦ b?args′ (where args and args′ are lists of symbolic parameters) states that, if the adaptor receives a request to operation a whose content matches args from a service on the left-hand side, then it will have to (eventually) call operation b using args′ on a service on the right-hand side. Every message received by the adaptor is matched against a vector, and such matching possibly updates the state of stored parameters maintained by the adaptor. As an example, once vector a!args ♦ b?args′ is triggered by a reception on a, the received content of args is stored, args′ is instantiated and b?args′ is inserted in a queue of messages to be eventually sent. When the target service becomes ready to receive, the first matching message in the queue can be delivered.

The transition relation $T^c$ can impose restrictions on the order in which vectors can be triggered. In this way, $T^c$ permits enforcing high-level policies on the communication such as "do not perform more than three requests" or "after every request there must be an acknowledgment".

Throughout this document we will apply the usual notation for generic transitions and traces as described in Table 2.1. We will change the arrow (in symbol or subscript) when it becomes important to distinguish different transition systems. Otherwise, the default will be the standard arrow '$\rightarrow$'. We will denote by O[M] the set of possible transition traces starting from the initial state of the state machine M. We will represent a trace of transitions either as $\tilde{t}$ or $t_1 \cdot \ldots \cdot t_n$, and $t \in \tilde{t}$ will denote that transition t is in the trace $\tilde{t}$.

Table 2.1: Notation for generic transitions and traces

$$
\begin{array}{rcl}
(a, l, b) \in T & \equiv & a \xrightarrow{l} b \\
\{(a_0, \tau, a_1), (a_1, \tau, a_2), \ldots, (a_{n-1}, \tau, a_n)\} \subseteq T,\ n \geq 0 & \equiv & a_0 \xrightarrow{\tau}{}^{*}\, a_n \\
\{(a_0, \tau, a_1), (a_1, \tau, a_2), \ldots, (a_{n-1}, \tau, a_n)\} \subseteq T,\ n > 0 & \equiv & a_0 \xrightarrow{\tau}{}^{+}\, a_n
\end{array}
$$

Example 2.1.1 Our running example for this chapter is based on a simplified meteorology system.
We have three incompatible services with complementary functionality: a) a temperature sensor service, which could be deployed in a sink of a temperature sensor network; b) a monitoring service which registers the information and could be located in a laptop; and c) a humidity service which might be deployed in the same infrastructure as the temperature sensor network or elsewhere. The signatures of the services (i.e., their operation names and arguments) are listed in Table 2.2.

The temperature service (service a) has output operations user!usr and pass!psw to authenticate with its user name (argument typed usr) and password (psw); an operation to notify of the current temperature, i.e., upload!temp; two input operations for the upload to be either denied? or answered with a new interval of time prior to the next notification (delay?time); and finally, an output operation to notify that it has finished its current session, end!. Intuitively speaking, input actions (e.g., denied?) represent the availability of service operations while output actions represent service requests (e.g., upload!temp), both followed by the types of their arguments (temp in this case).

The monitoring service (service b) might be a new version or come from a different vendor so that it has operations with similar functionalities but an incompatible signature. Instead of operations user?usr and pass?psw, expected by service a, it has a single authentication operation login?usr,psw. The authentication can be rejected! or connected!. It receives the temperature notifications with an operation register?temp and it always sends the answer through answer!time. This service can receive a quit? request and it notifies of the finished session with end!.

The monitoring service requires humidity information (typed humid) before deciding how long to wait for the next temperature update. For this reason, it requests the humidity information from the humidity service (service c) through the request and response getHumid! and getHumid?humid. The latter is understood by service c but, instead of the former, service c needs the temperature information to do some calibration via getHumid?temp and it finally ends its session with finish!.

Table 2.2: The signature of the running example

Service a: user!usr, pass!psw, upload!temp, denied?, delay?time, end!

Service b: login?usr,psw, connected!, register?temp, rejected!, getHumid!, getHumid?humid, answer!time, end!

Service c: getHumid?temp, getHumid!humid, finish!

Figure 2.2 illustrates a possible adaptation contract for these services. Rule $v_u$ enables the adaptor to receive action user and refers to its argument as U. Rule $v_l$ first receives the password (in P) with action pass and, as a consequence, it eventually sends a login message with both the user U and password P previously received. The rest of the vectors behave accordingly. The automaton of the contract states that the goal of the system is that, if service b sends connected ($v_c$), then the temperature update must be eventually replied to with an answer and sent as a delay message to service a through vector $v_a$. In addition, the automaton states that the session should finish (through $v_e$, $v_{e'}$ and $v_{e''}$) either at this point or before connecting (i.e., before $v_c$).

As we have seen in the example, services can employ different alphabets of actions (different names of actions as well as different names, number or order of parameters). The vectors of the contract ($\Sigma^c$) tell us how to solve these signature incompatibilities. In addition, services might also lock due to behavioural incompatibilities between them. Behavioural incompatibilities arise because the sequence in which the operations of a service are offered and requested (its behaviour) is not complemented by another service. For example, maybe the temperature service first requests the upload!temp and then provides its credentials (via user!usr and pass!psw) whereas the monitoring service expects its respective operations in the opposite order (login?usr,psw followed by an eventual register?data), so their adaptation would deadlock despite the signature adaptation. Adaptation contracts (concretely, their FSMs) do not necessarily say how to solve behavioural incompatibilities because the behaviour of the services might be unknown or change. In general, we define the behaviour of a service, i.e., its behavioural interface, as an FSM.

Definition 2.1.2 (Service interface) A service interface is formally defined as an FSM $\langle \Sigma^i, S, s_0, F, T \rangle$ where: $\Sigma^i$ is the set of labels associated to transitions, S is a set of states, $s_0 \in S$ is the initial state, $F \subseteq S$ are final (or stable) states, and $T \subseteq (S \times \Sigma^i \times S)$ are the transitions.

There are several languages with their own semantics in the literature to describe Web service orchestrations [BBDN+06, CSC+07, SBS06]. However, we have chosen this formalisation because it is simple and abstract enough to cover our needs for behavioural matching.

(a) Adaptation vectors

$$
\begin{array}{rll}
\Sigma^c = \{ & \text{user!}U\ \blacklozenge\ , & (v_u) \\
 & \text{pass!}P\ \blacklozenge\ \text{login?}U,P, & (v_l) \\
 & \blacklozenge\ \text{connected!}, & (v_c) \\
 & \text{upload!}D\ \blacklozenge\ , & (v_p) \\
 & \blacklozenge\ \text{register?}D, & (v_r) \\
 & \text{getHumid?}D\ \blacklozenge\ \text{getHumid!}, & (v_g) \\
 & \text{delay?}T\ \blacklozenge\ \text{answer!}T, & (v_a) \\
 & \text{getHumid!}H\ \blacklozenge\ \text{getHumid?}H, & (v_t) \\
 & \text{denied?}\ \blacklozenge\ \text{rejected!}, & (v_d) \\
 & \blacklozenge\ \text{quit?}, & (v_q) \\
 & \text{end!}\ \blacklozenge\ , & (v_e) \\
 & \text{finish!}\ \blacklozenge\ , & (v_{e'}) \\
 & \blacklozenge\ \text{end!}\ \} & (v_{e''})
\end{array}
$$

(b) Contract FSM [transition labels: $\Sigma^c \setminus \{v_a, v_e, v_{e'}, v_{e''}\}$; $\Sigma^c \setminus \{v_c, v_e, v_{e'}, v_{e''}\}$; $\Sigma^c \setminus \{v_e, v_{e'}, v_{e''}\}$; $\{v_a\}$; $\{v_c\}$; $\{v_e\}$; $\{v_e\}$; $\{v_{e'}\}$; $\{v_{e''}\}$]

Figure 2.2: An adaptation contract

The labels in $\Sigma^i$ are either internal transitions (τ) or communication transitions which start with the operation name followed by an '!' or '?' if they are output or input actions, respectively. Labels also contain an expression which symbolically describes the content of the message. This content is offered in output actions and required in input actions. In general, this expression is a list of argument types which represents the list of arguments of the action. If needed, operations can be renamed and prefixed by the identification of their services in order to distinguish them between services. For instance, we can rename the operations of our running example so that we can differentiate a:end! and b:end! between services a and b.

[Figure 2.3: A possible behaviour for the services of our running example: (a) service a, temperature sensor; (b) service b, monitoring station; (c) service c, humidity sensor. Underlined letters will be used to abbreviate the labels.]

Example 2.1.2 Figure 2.3 shows a possible behaviour for the services of our running example. The initial state is marked with an incoming arrow without source and final states are filled. Internal choices (e.g., if-then-else or switch conditionals) are modelled by τ actions as usual, while external choices (e.g., pick in BPEL) are modelled by input-action-labelled transitions leaving from the same state. In particular, we can see that service b starts with an external choice in which it is ready to receive a login request or to send an end message. After the login, service b can internally decide to proceed to the left branch and reject the session or to the right branch and send the notification connected.

The intensional semantics of the contract specifies the desired interactions between the services to be adapted. In other words, the intensional semantics of a contract describes the necessary conditions for an adaptor to be compliant with a given adaptation contract.

2.2 Intensional semantics of adaptation contracts

The intensional semantics of an adaptation contract provides the interactions between the services and the adaptor allowed by the contract. Formally, the intensional semantics of an adaptation contract c is defined by a labelled transition system $\xrightarrow{x}_c$ over configurations of the form $\langle s, \Delta \rangle$ where s is the current state of the contract and $\Delta$ is a multiset of pending actions that the adaptor will have to eventually perform. A transition $\langle s, \Delta \rangle \xrightarrow{x}_c \langle s', \Delta' \rangle$ indicates that an adaptor could, by contract c, execute action x in state s with pending actions $\Delta$. The transition system $\xrightarrow{x}_c$ is defined by the following inference rules:

$$
\frac{(s,\ a \blacklozenge b,\ s') \in T^c}{\langle s, \Delta \rangle \xrightarrow{|\bar{a}}_c \langle s', \Delta \cup \{\bar{b}|\} \rangle}\ \text{(I1)}
\qquad
\frac{(s,\ a \blacklozenge b,\ s') \in T^c}{\langle s, \Delta \rangle \xrightarrow{\bar{b}|}_c \langle s', \Delta \cup \{|\bar{a}\} \rangle}\ \text{(I2)}
$$

$$
\frac{}{\langle s, \Delta \cup \{x\} \rangle \xrightarrow{x}_c \langle s, \Delta \rangle}\ \text{(I3)}
$$

where the complementary action of a non-internal action a is denoted by $\bar{a}$ (e.g., if a = do! then $\bar{a}$ = do?, and vice versa). Note that the labels denoting the actions of the adaptor are annotated with a left-hand or right-hand bar to explicitly represent whether they are communication actions performed by the adaptor towards a service on the left-hand side ($|\bar{a}$) or towards a service on the right-hand side ($\bar{b}|$), respectively. Note also that an ordered semantics of pending actions is assumed, namely, in rule (I3) we assume that if there is more than one x in the multiset $\Delta$, then the emitted x is the oldest in $\Delta$. Finally, since in a vector a ♦ b of an adaptation contract either a or b may be absent, the definition of $\xrightarrow{x}_c$ also includes the following rules:

$$
\frac{(s,\ a \blacklozenge\ ,\ s') \in T^c}{\langle s, \Delta \rangle \xrightarrow{|\bar{a}}_c \langle s', \Delta \rangle}\ \text{(I4)}
\qquad
\frac{(s,\ \blacklozenge\ b,\ s') \in T^c}{\langle s, \Delta \rangle \xrightarrow{\bar{b}|}_c \langle s', \Delta \rangle}\ \text{(I5)}
$$
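Rules (I1) to (I5) can be read operationally as a successor function over configurations. The sketch below is a minimal illustration under several assumptions that are not in the original text: actions are plain strings such as `"a!"`, a label `("L", x)` stands for an action performed towards the left-hand side and `("R", x)` towards the right-hand side, and the multiset of pending actions is encoded as a tuple ordered from oldest to newest. The names `complement` and `steps` are hypothetical.

```python
from typing import Iterator, List, Optional, Tuple

Action = str                                   # e.g. "a!" or "b?"
Vector = Tuple[Optional[Action], Optional[Action]]
Label = Tuple[str, Action]                     # ("L", x) or ("R", x)
Config = Tuple[str, Tuple[Label, ...]]         # (contract state, pending actions)
Transition = Tuple[str, Vector, str]


def complement(action: Action) -> Action:
    """Swap input and output marks: do! <-> do?."""
    if "!" in action:
        return action.replace("!", "?")
    return action.replace("?", "!")


def steps(transitions: List[Transition], config: Config) -> Iterator[Tuple[Label, Config]]:
    """Enumerate the labelled successors of a configuration."""
    state, pending = config
    for (src, (a, b), dst) in transitions:
        if src != state:
            continue
        if a is not None and b is not None:
            left, right = ("L", complement(a)), ("R", complement(b))
            yield left, (dst, pending + (right,))        # rule (I1)
            yield right, (dst, pending + (left,))        # rule (I2)
        elif a is not None:
            yield ("L", complement(a)), (dst, pending)   # rule (I4)
        elif b is not None:
            yield ("R", complement(b)), (dst, pending)   # rule (I5)
    # Rule (I3): emit a pending action, removing its oldest occurrence.
    for x in dict.fromkeys(pending):
        i = pending.index(x)
        yield x, (state, pending[:i] + pending[i + 1:])


# A hypothetical contract with the single vector a! <> b? looping on s0.
T = [("s0", ("a!", "b?"), "s0")]
result = list(steps(T, ("s0", ())))
```

In this example, `result` contains the two successors produced by (I1) and (I2); calling `steps` again on either successor additionally yields the (I3) transition that consumes the pending action.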

The State Machine (SM) resulting from rules (I1) to (I5) will be denoted as $A^c_0$ and represents the (possibly non-deterministic) intensional semantics of an adaptation contract. If we denote as $\Sigma^a$ the actions on the sides of an adaptation contract, i.e., $(\Sigma^a \times \Sigma^a) \cup \Sigma^a = \Sigma^c$, then the alphabet of $A^c_0$ is $\{|\cdot, \cdot|\} \times \Sigma^a$. This vertical-bar notation has been introduced for the sake of clarity but, when the alphabets of actions are disjoint between the services, such vertical bars are not needed to differentiate services. Indeed, this is a particular case of renaming and therefore, as an alternative to the vertical-bar notation, we will often rename the operations and prefix them with a unique identifier of their corresponding services when we have more than two services and the concept of sides does not exist. Using renaming instead of the vertical-bar notation, we keep the alphabet of $A^c_0$ in $\Sigma^a$.

2.2.1 Eager choices

It is worth noting that the intensional semantics defined by rules (I1) to (I5) may force eager choices. Such eager choices may occur, for instance, when an adaptation contract contains more than one vector for an action a.

Example 2.2.1 Consider a simple contract $c = \langle \Sigma^c, S^c, s^c_0, F^c, T^c \rangle$ where

$$
\begin{array}{ll}
\Sigma^c = \{a \blacklozenge b,\ a \blacklozenge c\}; & S^c = \{s_0, s_1\}; \\
s^c_0 = s_0; & F^c = \{s_1\}; \\
T^c = \{(s_0,\ a \blacklozenge b,\ s_1),\ (s_0,\ a \blacklozenge c,\ s_1)\} &
\end{array}
$$

Then, the following two transitions, which start from the initial state, would force an eager choice of the adaptor. The adaptor must arbitrarily decide to execute either $\bar{b}|$ or $\bar{c}|$ when it receives $|\bar{a}$ due to:

$$
\langle s_0, \emptyset \rangle \xrightarrow{|\bar{a}}_c \langle s_1, \{\bar{b}|\} \rangle
\quad \text{and} \quad
\langle s_0, \emptyset \rangle \xrightarrow{|\bar{a}}_c \langle s_1, \{\bar{c}|\} \rangle
$$

Intuitively, such an unnecessary eager choice may lead the adaptor to fail to adapt some interactions. We could force contracts to be deterministic and say that contracts such as the one in Example 2.2.1 are invalid. Instead, we allow such flexibility by providing a lazy choice alternative which processes those eager choices to obtain deterministic adaptors. Lazy choice is modelled by lifting the transition system $\xrightarrow{x}_c$ so as to deal with sets of pairs $\langle s, \Delta \rangle$ using the powerset construction [Sip96]:

$$
\frac{A' = \{\langle s', \Delta' \rangle \mid \exists \langle s, \Delta \rangle \in A \,.\, \langle s, \Delta \rangle \xrightarrow{x}_c \langle s', \Delta' \rangle\} \neq \emptyset}{A \overset{x}{\hookrightarrow}_c A'}\ \text{(L)}
$$

The SM resulting from this lifted transition system will be denoted as $A^c$. It is trivial that $A^c$ accepts the same traces as $A^c_0$ and that $A^c_0 \preceq A^c$, where $\preceq$ is the simulation relation of Definition 2.3.1.
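Rule (L) is an instance of the classical subset construction and can be sketched in a few lines. The code below is an illustration only: the function `lift`, the step-function interface and the string labels `"|a"`, `"b|"` and `"c|"` are assumptions of this sketch, and the example hard-codes just the two transitions quoted in Example 2.2.1 instead of deriving them from rules (I1) to (I5).

```python
from collections import defaultdict
from typing import Callable, Dict, FrozenSet, Hashable, Iterable, Tuple

Config = Hashable
Label = Hashable
Step = Callable[[Config], Iterable[Tuple[Label, Config]]]


def lift(step: Step, configs: FrozenSet[Config]) -> Dict[Label, FrozenSet[Config]]:
    """Rule (L): for each label x, collect every x-successor of the set A."""
    successors = defaultdict(set)
    for config in configs:
        for label, target in step(config):
            successors[label].add(target)
    # Only labels with a non-empty successor set A' appear as keys.
    return {label: frozenset(targets) for label, targets in successors.items()}


def example_step(config):
    """The two eager transitions of Example 2.2.1, written out by hand."""
    if config == ("s0", ()):
        return [("|a", ("s1", ("b|",))), ("|a", ("s1", ("c|",)))]
    return []


lifted = lift(example_step, frozenset({("s0", ())}))
```

After lifting, the label `"|a"` leads to a single set containing both configurations, so the choice between the two pending actions is deferred instead of being taken when the message is received.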


2.2.2 Data dependencies

So far, the intensional semantics of a contract represented by the transition system ($\overset{x}{\hookrightarrow}_c$) does not take into account the symbolic parameters that correspond to the arguments received and sent by the adaptor. These symbolic parameters represent the way in which the adaptor receives and sends data, and hence they impose a restriction on the sequence in which vectors can be applied. For instance, we know that an adaptor should receive the message corresponding to a!D ♦ before being able to send ♦ b?D since it needs the former to know the value of D required by the latter. In a similar fashion as we did with eager choices and rule (L), we will also automatically refine the intensional semantics of adaptation contracts so as to incorporate these data restrictions in Section 8.4.1 but, for the time being, we can consider that those restrictions are explicitly encoded in the transition system of the contract, $T^c$. We will denote as $A^c$ the transition system ($\overset{x}{\hookrightarrow}_c$) that represents the deterministic intensional semantics of contract C.
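The data restriction described above can be checked on a single trace with a simple scan. This sketch is not the refinement of Section 8.4.1; it only illustrates the constraint itself. The encoding of a trace as `("recv", params)` and `("send", params)` steps and the function name `respects_data_dependencies` are assumptions made for this example.

```python
from typing import Iterable, Sequence, Tuple

Step = Tuple[str, Sequence[str]]   # ("recv" | "send", symbolic parameters)


def respects_data_dependencies(trace: Iterable[Step]) -> bool:
    """True if every symbolic parameter is received before it is sent."""
    known = set()
    for kind, params in trace:
        if kind == "recv":
            known.update(params)           # the adaptor now stores these values
        elif not known.issuperset(params):
            return False                   # a value would be sent before it is known
    return True


# a!D <> followed by <> b?D: the adaptor receives D and later sends it.
valid = respects_data_dependencies([("recv", ["D"]), ("send", ["D"])])
# The opposite order would require sending D before knowing its value.
invalid = respects_data_dependencies([("send", ["D"]), ("recv", ["D"])])
```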

2.3 Behavioural adaptors

An adaptor is a special service which complies with the intensional semantics of a given adaptation contract. In order to formally define an adaptor we first need to define a compliance relation between an FSM and an adaptation contract. For this we will use the notion of simulation ($\preceq$) from process algebras but redefined to deal with the final states of the FSMs.

Definition 2.3.1 (Simulation) A simulation relation between the states of two given state machines $A_1 = \langle \Sigma_1, S_1, s^1_0, F_1, T_1 \rangle$ and $A_2 = \langle \Sigma_2, S_2, s^2_0, F_2, T_2 \rangle$ is defined as $R_{A_1,A_2} \subseteq S_1 \times S_2$ such that the following conditions hold for all $(s_1, s_2) \in R_{A_1,A_2}$:

1. For every transition $(s_1 \xrightarrow{\alpha} s'_1) \in T_1$ there must exist a transition $(s_2 \xrightarrow{\alpha} s'_2) \in T_2$ such that $(s'_1, s'_2) \in R_{A_1,A_2}$.
2. If $s_1 \in F_1$ then $s_2 \in F_2$.

From the previous notion, a simulation relation can be naturally derived for state machines: $A_1 \preceq A_2$ iff there exists an $R_{A_1,A_2}$ such that $(s^1_0, s^2_0) \in R_{A_1,A_2}$ where $s^1_0$ and $s^2_0$ are the initial states of $A_1$ and $A_2$, respectively.
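For finite state machines, the simulation of Definition 2.3.1 can be decided by computing the largest relation that satisfies both conditions. The following sketch uses a naive fixpoint iteration chosen for readability, not efficiency; the encoding of a machine as a triple (initial state, final states, transitions) and the names `states_of` and `simulates` are assumptions of this example.

```python
from typing import Set, Tuple

# A machine is (initial state, final states, set of (source, label, target)).
Machine = Tuple[str, Set[str], Set[Tuple[str, str, str]]]


def states_of(machine: Machine) -> Set[str]:
    initial, finals, transitions = machine
    sources = {s for (s, _, _) in transitions}
    targets = {t for (_, _, t) in transitions}
    return {initial} | set(finals) | sources | targets


def simulates(m1: Machine, m2: Machine) -> bool:
    """True if m2 simulates m1, preserving final states (condition 2)."""
    init1, finals1, trans1 = m1
    init2, finals2, trans2 = m2
    # Start from every pair that satisfies condition 2.
    relation = {
        (s1, s2)
        for s1 in states_of(m1)
        for s2 in states_of(m2)
        if s1 not in finals1 or s2 in finals2
    }
    changed = True
    while changed:
        changed = False
        for (s1, s2) in list(relation):
            for (src, label, dst) in trans1:
                if src != s1:
                    continue
                # Condition 1: s2 needs an equally labelled move into the relation.
                matched = any(
                    src2 == s2 and label2 == label and (dst, dst2) in relation
                    for (src2, label2, dst2) in trans2
                )
                if not matched:
                    relation.discard((s1, s2))
                    changed = True
                    break
    return (init1, init2) in relation


small = ("p0", {"p1"}, {("p0", "a", "p1")})
large = ("q0", {"q1"}, {("q0", "a", "q1"), ("q0", "b", "q2")})
```

Here `simulates(small, large)` holds because every move of `small` can be matched by `large`, while `simulates(large, small)` fails on the `b` transition.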

Now it is trivial to define an adaptor compliant with a given adaptation contract.

Definition 2.3.2 (Adaptor) An adaptor A compliant with a contract C is a deterministic state machine such that $A \preceq A^c$. Determinism must be understood as the restriction that all the transitions outgoing from the same state must present different labels and none of them can be internal transitions (τ):

$$
\forall \{s_0 \xrightarrow{l_1} s_1,\ s_0 \xrightarrow{l_2} s_2\} \subseteq T \quad \text{then} \quad \tau \neq l_1 \neq l_2 \neq \tau
$$

2.3.1 Synchronisation between services and adaptors

The transitions of an adaptor (Definition 2.3.2) and the transitions of service interfaces (Definition 2.1.2) have different kinds of labels. The difference is that the arguments of the labels in an adaptor are represented by symbolic parameters whereas arguments in service interfaces are represented by their types. We will consider that every symbolic parameter has a data type associated to it. Therefore, we can obtain the service interface corresponding to an adaptor by substituting every occurrence of a symbolic parameter in the alphabet of the adaptor by its corresponding type. Again, the vertical bars denoting the side of the communication can be omitted if the actions are renamed accordingly among the service interfaces.

Example 2.3.1 For our running example, the contract (Fig. 2.2) requires the following substitution to obtain an adaptor interface compatible with the services shown in Fig. 2.3.

$$
\theta = \{usr/U,\ psw/P,\ temp/D,\ humid/H,\ time/T\}
$$

Now we can model the possible synchronisations between interfaces (either adaptor or service interfaces) by their parallel composition ($\otimes$).

$$
\frac{s_1 \xrightarrow{\alpha} s'_1 \qquad s_2 \xrightarrow{\bar{\alpha}} s'_2}{s_1 \otimes s_2 \xrightarrow{\tau} s'_1 \otimes s'_2}
\qquad
\frac{s \xrightarrow{\alpha} s'}{s \otimes s_1 \xrightarrow{\alpha} s' \otimes s_1}
$$

Communications are synchronous. We have omitted the symmetric rule for '$\otimes$'. Given the initial states of two interfaces, $s^a_0$ and $s^b_0$, the initial state of their composition is $s^a_0 \otimes s^b_0$ and we know that

$$
\{(s^a_i, s^b_i) \mid s^a_0 \otimes s^b_0 \xrightarrow{\tau}{}^{*}\, s^a_i \otimes s^b_i\}
$$

are all the possible reachable states during their interaction. It is worth highlighting that, by the associativity of $\otimes$, this includes the parallel composition of more than two services since, for instance, the initial state of the composition $s_0$ can be equal to the parallel composition of n services, i.e.:

$$
s_0 = (s^1_0 \otimes \ldots \otimes s^n_0)
$$

Similarly, the composite state is considered final or stable if all the states which compose the current state are also final w.r.t. their respective interfaces. Operations can be renamed to force synchronisations only through the adaptor. This parallel composition operator will be revisited to support the dynamic adaptation of services (Chapter 4) and the communication of cryptographic data (Chapter 7).
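The two composition rules (plus the omitted symmetric one) can be turned into a small procedure that builds the reachable part of the product of two interfaces. This is a sketch under assumed encodings: labels are strings ending in `!` or `?`, the internal action is the string `"tau"`, and the names `compose` and `tau_reachable` are hypothetical.

```python
from typing import Set, Tuple

Transitions = Set[Tuple[str, str, str]]   # (source, label, target)
TAU = "tau"


def complement(label: str) -> str:
    """Swap input and output marks: a! <-> a?."""
    return label.replace("!", "?") if "!" in label else label.replace("?", "!")


def compose(init1: str, trans1: Transitions, init2: str, trans2: Transitions):
    """Build the reachable product transitions of two interfaces."""
    initial = (init1, init2)
    seen, todo, product = {initial}, [initial], set()
    while todo:
        s1, s2 = todo.pop()
        moves = []
        for (src, label, dst) in trans1:
            if src != s1:
                continue
            moves.append((label, (dst, s2)))                  # left side moves alone
            if label != TAU:
                for (src2, label2, dst2) in trans2:
                    if src2 == s2 and label2 == complement(label):
                        moves.append((TAU, (dst, dst2)))      # synchronisation
        for (src2, label2, dst2) in trans2:
            if src2 == s2:
                moves.append((label2, (s1, dst2)))            # symmetric rule
        for label, target in moves:
            product.add(((s1, s2), label, target))
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return initial, product


def tau_reachable(initial, product):
    """States reachable from the initial state through tau transitions only."""
    seen, todo = {initial}, [initial]
    while todo:
        state = todo.pop()
        for (src, label, dst) in product:
            if src == state and label == TAU and dst not in seen:
                seen.add(dst)
                todo.append(dst)
    return seen


sender = {("p0", "fly!", "p1")}
receiver = {("q0", "fly?", "q1")}
initial, product = compose("p0", sender, "q0", receiver)
reachable = tau_reachable(initial, product)
```

In this two-state example the only synchronisation is on `fly`, so `reachable` contains the initial pair and the pair reached after the exchange.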

2.3.2 Deadlocks and livelocks

As we saw in Section 2.2, it is not enough just to be compliant with a contract since undesirable situations might arise. Depending on the internal choices of the services, they can lead the adaptor to a deadlock state. This is caused by the behavioural incompatibilities between the services, which must therefore be solved by the adaptor as well.

Definition 2.3.3 (Deadlock) An interface presents a deadlock situation if it arrives at a state from which it cannot reach a final state. More formally, there exists a deadlock state s such that

$$
s_0 \xrightarrow{\tau}{}^{*}\, s \quad \text{and} \quad \nexists s' \in F \ \text{s.t.}\ s \xrightarrow{\tau}{}^{*}\, s'
$$

The previous definition covers both deadlocks and livelocks because its absence guarantees that every reachable state can eventually reach a final state through τ-labelled transitions. For the rest of this document, we will only mention deadlocks as a general term including livelocks as well.

Example 2.3.2 Consider the behaviour of the services of our running example depicted in Fig. 2.3. Assuming that service b internally decides to connect (right-hand side τ), the intensional semantics of the contract in Fig. 2.2 allows the sequence of rules $v_u \cdot v_l \cdot v_c \cdot v_p \cdot v_q$ (which, among others, corresponds to trace |?u · |?s · !l| · ?c| · |?p · ?q| where actions are represented by their underlined characters and '·' is the append operator). This sequence would lead the system to a deadlock because, at that point, service b cannot participate in the rules needed for service a to reach a final state (i.e., $v_d$ and $v_a$, at least).
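Definition 2.3.3 translates directly into two reachability computations: first the states reachable from the initial state, then, for each of them, whether a final state is still reachable. The sketch below assumes the same string encoding of τ as `"tau"`; the names `tau_closure` and `deadlock_states` are introduced only for this illustration.

```python
from typing import Set, Tuple

TAU = "tau"
Transitions = Set[Tuple[str, str, str]]   # (source, label, target)


def tau_closure(state: str, transitions: Transitions) -> Set[str]:
    """States reachable from 'state' in zero or more tau steps."""
    seen, todo = {state}, [state]
    while todo:
        current = todo.pop()
        for (src, label, dst) in transitions:
            if src == current and label == TAU and dst not in seen:
                seen.add(dst)
                todo.append(dst)
    return seen


def deadlock_states(initial: str, finals: Set[str], transitions: Transitions) -> Set[str]:
    """Reachable states from which no final state can be reached."""
    reachable = tau_closure(initial, transitions)
    return {s for s in reachable if not (tau_closure(s, transitions) & finals)}


# s1 is final; s2 only loops on itself, which is a livelock.
example = {("s0", TAU, "s1"), ("s0", TAU, "s2"), ("s2", TAU, "s2")}
locked = deadlock_states("s0", {"s1"}, example)
```

The looping state `s2` is reported even though it always has an enabled transition, which reflects why the definition covers livelocks as well as deadlocks.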

Figure 2.4: Static most-general adaptor compliant with the contract and services shown in Fig. 2.2 and Fig. 2.3, respectively. Actions have been replaced by their underlined letter and prefixed by their corresponding service identifier

2.3.3 Adaptor synthesis

The basic definition of an adaptor (Definition 2.3.2) only depends on the given adaptation contract. In contrast, deadlock-free adaptors depend on the actual services that are communicating through the adaptor. Therefore, we will call these deadlock-free adaptors service adaptors, because they depend on the actual behaviour of the services. The final purpose of service adaptors is to control the services to lead them to successful states while avoiding locks. Service adaptors are yet another refinement of the intensional semantics of adaptation contracts. This refinement is the key concept of traditional adaptor synthesis proposals [AINT07, BP06, KNB+09, MPS08]. These related works are focused on design time and they require knowing in advance the behaviour of the services. Alternatively, we show how such an adaptor is either: a) dynamically generated without knowing the actual behaviour of the services in Chapter 4; or b) synthesised, verified and refined in Chapter 8 if the services present incompatible security policies.

Example 2.3.3 The most general adaptor compliant with the contract in Fig. 2.2 and the services shown in Fig. 2.3 is depicted in Fig. 2.4. Actions in Fig. 2.4 have been reduced to their underlined letters in the contract and have been prefixed with the identification of the communicating service.

Tact is the art of making a point without making an enemy. I. Newton

3

State of the art

The incremental construction of future Internet applications through the adaptation of reusable software services specified through their behavioural interfaces is an error-prone task. It is possible to assist developers with the automatic procedures and tools supplied by model-based software adaptation. Unfortunately, in most cases it is impossible to modify services in order to adapt them, since their internal implementation cannot be inspected or modified. Thus, due to the black-box nature of services, they must be equipped with external interfaces giving information about their functionality. In particular, the interfaces of the constituent services of a system do not always fit one another and some features of these services may change at run-time, therefore they require a certain degree of adaptation in order to avoid mismatching during the composition.

3.1 Incompatibility levels

Incompatibilities between services make their composition impossible due to mismatches that disable the communication and lead the services to deadlock situations. Software adaptation [YS97, CPS08] is conceived as an approach to allow the proper communication of services despite these incompatibilities, which can occur at the following levels:

Signature incompatibilities happen when services present different operation signatures than expected. At this level, Interface Description Languages (IDLs) (e.g., interfaces described in CORBA [COR], Java or WSDL [WSD01] descriptions in the case of WSs) provide operation names, types of arguments and return values, as well as exception types. Adaptation at this level implies solving syntactical differences between signatures, such as different operation names, different argument orders, missing arguments, or different argument types. As an example, we could think of two services that, in spite of presenting the same functionality, offer incompatible WSDL descriptions:

helloWorld! vs holaMundo!

The behavioural level takes into account not only the signature of the operations but also the particular sequence in which those operations must be called or offered. Loops, internal choices (i.e., if-statements) and external choices (pick-statements) are also covered by the behavioural specification of services. These kinds of incompatibilities are more evident in stateful services such as those described as WF or BPEL processes. However, REST and JavaScript APIs (ubiquitous nowadays) can also present these kinds of incompatibilities, as they have implicit control dependencies in the form of cookies or authentication tokens that require certain actions (typically, a login) to happen before some other (privileged) actions. Incompatibilities at this level may lead the composition to undesirable states where one service expects a call from another service which is not able to comply, and therefore the system reaches a deadlock. Examples of behavioural IDLs are BPEL and WF, used to describe Web service orchestrations, or high-level Message Sequence Charts (hMSC [IT99]). Example: A BPEL process which waits for an invocation to operation A and then B would deadlock if composed with a service that first invokes B and then A, even though they have compatible WSDL signatures.

a --reimburse?conf--> · --fly!conf--> b   vs   x --fly?conf--> · --reimburse!conf--> y
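The deadlock in the example above can be checked mechanically. The following sketch, written under simplifying assumptions, explores the synchronous composition of two services given as small transition systems and reports the states where neither can move and at least one is not final. The intermediate state names (`a1`, `x1`), the omission of the `conf` argument and the function names are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# A service is a map: state -> list of (action, next_state).
# Actions end with "!" (emission) or "?" (reception).
LTS = Dict[str, List[Tuple[str, str]]]

def complementary(x: str, y: str) -> bool:
    """True when x and y are the emission and reception of the same message."""
    return x[:-1] == y[:-1] and {x[-1], y[-1]} == {"!", "?"}

def deadlocks(p: LTS, p0: str, p_final: Set[str],
              q: LTS, q0: str, q_final: Set[str]) -> Set[Tuple[str, str]]:
    """Return the reachable joint states that are stuck and not final."""
    seen = {(p0, q0)}
    stack = [(p0, q0)]
    stuck: Set[Tuple[str, str]] = set()
    while stack:
        s, t = stack.pop()
        successors = [(s2, t2)
                      for a, s2 in p.get(s, [])
                      for b, t2 in q.get(t, [])
                      if complementary(a, b)]
        if not successors and not (s in p_final and t in q_final):
            stuck.add((s, t))
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return stuck

# First service: waits for reimburse, then emits fly.
first: LTS = {"a": [("reimburse?", "a1")], "a1": [("fly!", "b")]}
# Second service: waits for fly, then emits reimburse (both wait first).
second: LTS = {"x": [("fly?", "x1")], "x1": [("reimburse!", "y")]}
# A compatible partner, for contrast: emits reimburse, then waits for fly.
partner: LTS = {"x": [("reimburse!", "x1")], "x1": [("fly?", "y")]}
```

With these definitions, `deadlocks(first, "a", {"b"}, second, "x", {"y"})` reports the initial joint state, since both services start by waiting for a reception that the other never offers.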

Non-functional incompatibilities are those beyond the signature and behaviour of the services. These can be further classified into QoS incompatibilities, e.g., when the throughput of a service is expected to be higher than it is (this could be adapted, for instance, using an adaptor to perform load-balancing with two instances of that service running in parallel), and security incompatibilities, e.g., when a service expects in clear text what another service is sending encrypted. On top of that, we have semantic incompatibilities like, for instance, implementing a list from a stack or using a search engine as a spell checker. Ontology-based languages such as Web Ontology Language for Services (OWL-S [OWL04]) or Web Services Modeling Ontology (WSMO [WSM05]) are often required to solve semantic incompatibilities.

Current industrial platforms only provide means to describe components or services at their signature level. However, mismatches may occur at any of the interoperability levels. For instance, incompatibilities at the behavioural level are common due to mismatches in the order of the messages exchanged between components or services, which can lead to deadlock situations. Therefore, it would be desirable to address all the levels together. Software adaptation [CMP06, YS97] is a technique to compose services with mismatching interfaces in a non-intrusive manner by automatically generating mediating adaptor services. Deriving adaptors is a complicated task since, for instance, in order to avoid undesirable behaviour, the different behavioural and security constraints of the composition must be respected.

3.2 Restrictive approaches

A first class of existing approaches can be referred to as restrictive approaches [BP06, AINT07, NBM+07, NRXB10]. These approaches favour the full automation of the process, trying to solve interoperability issues by pruning the behaviour that may lead to mismatches, thus restricting the functionality of the services involved.

Brogi and Popescu [BP06] propose an adaptation methodology where they encode BPEL processes as Yet Another Workflow Language (YAWL [vdAtH05]) workflows. They perform a reachability analysis on these workflows to obtain a tree representing the possible execution states of the services, then they automatically obtain the complementary adaptation workflow for these trees and, finally, they perform a lock analysis to check whether there is any adaptation path which might lead the services to a deadlock situation. Their approach does not use adaptation contracts since the interfaces of the services are assumed to be complementary. This decision has the benefit of avoiding the task of designing such a contract but, even though it allows message-reordering adaptation (where the adapter in the middle acts as a buffer which reorders the exchanged messages so that they are sent when the destination service is ready to receive them), it does not support any adaptation at the signature level, where operations might present mismatches in their names or arguments. Finally, this automated generation of BPEL adapters does not provide the means to enforce any further property beyond deadlock freedom.

A similar approach is presented by Autili et al. [AINT07] with their tool SYNTHESIS. They model the behaviour of the services with hMSC. Unlike the previous approach, SYNTHESIS supports the specification of the desired behaviour expected from the interaction among the adapted services, expressed with a Labelled Transition System (LTS [Plo81]). Using the hMSC descriptions of the services and the LTS of the desired behaviour, the tool generates a distributed set of adaptation wrappers which, deployed around the services, provide the expected behaviour without deadlocks. This generation is done by means of a graph unification algorithm followed by a deadlock analysis. SYNTHESIS can enforce other properties (the desired behaviour LTS) beyond deadlock freedom. However, it still lacks support for signature adaptation.

Nezhad et al. [NBM+07] presented some techniques to provide semi-automatic support for the identification and resolution of mismatches between services at their signature and protocol levels. First, the authors describe some techniques for signature matching based on XML schema matching [RB01] (1-1 correspondences). Then, the authors use the protocol definitions expressed using FSMs to find mismatch situations at the protocol level. Deadlock resolution is tackled through the generation of mismatch trees, which present to the developer potential execution scenarios where the services deadlock. This approach deals with some kinds of mismatch automatically, but requires user input to overcome others. Therefore, this work can be considered as a hybrid solution that shows some characteristics similar to generative approaches. The situations which can be adapted are quite limited when compared with other generative approaches. In particular, correspondences between operations are static, and 1-0 correspondences (operations with no match on the counterpart interface) are not supported. This approach does not enable the user to write a contract representing an abstract specification of the adaptation; instead, it presents to the user, in a mismatch tree, each mismatch case between two interfaces which is not automatically solvable. In [NRXB10], the authors extend the static matching presented in [NBM+07] to support 1-N correspondences.
Furthermore, they improve the protocol matching proposed previously using depth-based comparison and iterative reference-based interface matching. These techniques are limited since they are not able to fix subtle incompatibilities between service protocols by renaming, splitting, merging, remembering and reordering messages and their parameters when necessary.
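The pruning idea behind restrictive approaches can be illustrated with a generic sketch. It is not the algorithm of any of the tools cited above: it simply keeps the states of a composition from which a final state is still reachable and removes every transition leading elsewhere. The state and action names are invented, and the sketch assumes that every transition can be blocked by the adaptor, which does not hold for internal choices of the services.

```python
from typing import Dict, List, Set, Tuple

# A composition is a map: state -> list of (action, next_state).
Graph = Dict[str, List[Tuple[str, str]]]

def prune(graph: Graph, finals: Set[str]) -> Graph:
    """Keep only states that can still reach a final state."""
    alive = set(finals)
    changed = True
    # Backward fixpoint: a state is alive if some successor is alive.
    while changed:
        changed = False
        for state, edges in graph.items():
            if state not in alive and any(t in alive for _, t in edges):
                alive.add(state)
                changed = True
    # Drop dead states and the transitions that lead into them.
    return {s: [(a, t) for a, t in edges if t in alive]
            for s, edges in graph.items() if s in alive}

# Hypothetical composition: buying before logging in leads to a dead end.
composition: Graph = {
    "s0": [("login", "s1"), ("buy", "dead")],
    "s1": [("buy", "s2")],
    "s2": [],
    "dead": [],
}
pruned = prune(composition, {"s2"})
```

The pruned composition no longer offers `buy` in the initial state, which shows both the benefit (no deadlock) and the cost (restricted functionality) of this family of approaches.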

3.3 Generative approaches

A second class of solutions can be referred to as generative approaches [BBC05, CPS08, CCP10, MPS08]. These approaches generate new adapter behaviour in order to avoid the aggressive pruning of restrictive approaches. Therefore, they support the specification of more advanced adaptation scenarios. Generative approaches build adaptors automatically from an abstract specification of how the different mismatch situations can be solved, often referred to as an adaptation contract.

Bracciali et al. [BBC05] present a methodology for component adaptation where interfaces are extended with a specification of the behaviour of the component expressed in a subset of the π-calculus [Mil99]. Moreover, interfaces can be specified as a set of roles, and the parameters on the different operation signatures are divided into data parameters (data exchanged among components) and link parameters (channel names sent or received by the components). They specify adaptors by means of an interface mapping relating parameters and messages on two interfaces. The process of adaptor generation guarantees that the resulting adaptor will be deadlock-free. Although the authors interestingly address situations where the topology of communication among components is not necessarily static, it is worth observing that the approach is restricted to two-party interactions.

In [CPS08], Canal et al. proposed a model-based adaptation approach which takes as input a description of component interfaces augmented with protocol information, and an adaptation contract giving an abstract description of the composition constraints, in order to produce an adaptor protocol. This approach is based on synchronous product [Arn94] and Petri nets [Mur89] to tackle adaptation. The signature of the interfaces omits the elements related to data exchange, so only operation names are represented. Protocols are specified by using LTSs with operation names on their transitions. Regarding adaptation contracts, the authors rely on synchronous vectors [Arn94], which express correspondences among messages.
In addition, they use vector LTSs, whose transitions are labelled with vectors, in order to specify advanced properties, such as (i) incompatible orders of operations, which are solved by desynchronising message emissions from their corresponding receptions in the other components, or (ii) choices between vectors for advanced adaptation scenarios where correspondences between operations may change over time. The adaptor generation process is supported by a tool capable of loading and visualising the different inputs. However, no tool support is provided to generate the adaptation contract, which is a required input of the process.

Cubo et al. [CCP10] present DAMASCo, a framework for discovery, adaptation and monitoring of context-aware services and components. DAMASCo combines efforts to address the four interoperability levels together, modelling services and components with interfaces constituted by a context and semantic information, a signature, and a behavioural description. Specifically, the authors advocate the extension of traditional signature level interfaces with context information (service level), protocol descriptions maintaining conditions (behavioural level), and a semantic representation, instead of a syntactic one, by means of ontologies (semantic level). They propose software adaptation mechanisms to generate adaptation contracts automatically, which consider all the possible mismatch situations solved in a service discovery process executed previously. This approach can generate a whole adaptor service that addresses the four levels, by considering not only the signature and behaviour levels of the services, but also some mismatch situations related to the quality of service and semantic levels. The authors rely on semantic matching to automatically generate adaptation contracts through ontologies, but there is no service market existing today which produces ontologies in a standard way. Furthermore, they do not tackle other non-functional requirements, such as temporal requirements or security, in addition to context-aware information.

Mateescu et al. [MPS08] build on an extension of the interface model presented in [CPS08] to consider value passing. Thus, in this work, protocols are described by means of Symbolic Transition Systems (STSs) based on Symbolic Transition Graphs (STGs) [HL95]. Using this new service specification, the authors present an approach to generate an adaptor protocol and to deploy it as a BPEL orchestrator. In the generation process, first, the adaptation contract and protocols are encoded into the Language Of Temporal Ordering Specification (LOTOS [ISO89]) process algebra. Once the LOTOS encoding is obtained, the authors carry out deadlock elimination and behavioural reduction (i.e., eliminating internal actions introduced by the encoding of the adaptation constraints) simultaneously. Finally, verification techniques are proposed based on existing model-checking tools in order to verify both general properties (placeholder occurrence, action preserving, etc.)
and specific properties which differ depending on the case at hand (safety and liveness). The overall approach is tool-supported except for the construction of adaptation contracts, which are considered inputs of the process. Although generative approaches result in a more general and satisfactory solution while composing and adapting services, writing the contract is a difficult and error-prone task. Adaptation contracts, which match the operations required by the services with those offered by others, may contain incorrect correspondences between operations in service interfaces or syntactic mistakes, which are particularly common in cases where the contract has to be specified using cumbersome textual notations. Therefore, it is important to provide assisted environments for contract design, analysis and verification and, alternatively, algorithms to generate contracts automatically.
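To make the role of an adaptation contract more concrete, the following sketch shows a heavily simplified adaptor that uses name correspondences (in the spirit of vectors) and a buffer to desynchronise emissions from receptions. The message names, the `VECTORS` table and the way the receiver's readiness is given as a list are assumptions made for illustration; the cited approaches work on full protocol models instead.

```python
from collections import deque
from typing import Deque, Dict, List

# Hypothetical contract: each entry pairs a message emitted by the sender
# with the message that the receiver understands.
VECTORS: Dict[str, str] = {"login": "connect", "query": "search"}

def adapt(emitted: List[str], expected_order: List[str]) -> List[str]:
    """Deliver translated messages in the order the receiver expects them."""
    pending: Deque[str] = deque(emitted)
    buffer: List[str] = []      # translated messages not yet delivered
    delivered: List[str] = []
    for wanted in expected_order:
        # Keep receiving from the sender until the wanted message is buffered.
        while wanted not in buffer:
            if not pending:
                raise RuntimeError(f"deadlock: receiver waits for {wanted!r}")
            buffer.append(VECTORS[pending.popleft()])
        buffer.remove(wanted)
        delivered.append(wanted)
    return delivered

# The sender emits query before login; the receiver needs connect first.
order = adapt(["query", "login"], ["connect", "search"])
```

The sketch also shows why writing the contract matters: a wrong entry in `VECTORS` would silently deliver the wrong message, which is the kind of mistake that assisted contract design aims to prevent.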


3.4 Security adaptation by contracts

Han and Khan [HKK06] proposed a framework for the dynamic adaptation of services based on the negotiation and enactment of security contracts. The security contracts presented in their work contain both security objectives such as “the patient name is confidential” and security properties, which are cryptographic functions such as “the patient name is encrypted with this key”. Based on these contracts, agents are used on behalf of the different services to negotiate the provided and required security of each service while pursuing the global security goal of the system. An iterative and collaborative negotiation is handled through bilateral communications between the agents until a consensus is reached about which services can be used with which particular security schema, while complying with the security goal of the system. Then, model-checking techniques are used to verify that the security goal is preserved, and the composition of services is instantiated and deployed. The services are monitored and, on QoS failures or goal changes, the contracts are re-negotiated and the system is reconfigured. The usage of a common semantic model to express service functionality, the different security properties and goals assumes that there are no signature incompatibilities. In addition, the authors assume services with single input-output actions where no complex behaviour is supported.

Sec-MoSC is a toolset presented by Souza et al. [SSL+09] to assist the security-aware development of service compositions. A key aspect of their proposal is the Business Process Modeling Notation (BPMN) description of the expected orchestration. They extend the official BPMN notation with generic security requirements called NF-Actions. Some of these NF-Actions are UseCryptography, UseAccessControl and CheckDataIntegrity, among others, and several ways are proposed to implement each of these generic actions. This coarse-grained security modelling is oriented to assist throughout the whole development cycle with the agreement (and compatibility) of all the parties involved. Some automation is provided to generate actual BPEL code from the BPMN process but, in this approach, it is not clear how NF-Actions are encoded into BPEL, and the authors claim only platform-specific support for these (Apache ODE Rampart, specifically). The main contribution of this toolset is the inclusion of security concerns throughout the whole development cycle by means of BPMN descriptions. Its main limitation is the loose relationship between security requirements and their actual implementation. The BPMN description of the system can adapt possible incompatibilities between the services, but this is not the main focus of this work.

Martinelli and Matteucci [MM10] developed a framework for the automatic generation of security controllers. They tackle scenarios where part of the system can be manipulated by an attacker with certain internal knowledge. In this setting, a security controller is deployed as a wrapper around the processes susceptible to being compromised, and this controller intercepts every interaction with the rest of the system, only allowing those that are permitted by the security policy. Services are modelled as Crypto-CCS [Mar03] processes (similar to the Calculus for Communicating Systems (CCS [Mil89]) but parametrised with a cryptographic inference system, see Section 7.2.3), the security policy is expressed in equational µ-calculus, and the synthesis is performed via partial model-checking techniques. In this way, the robustness of the system against attacks on the security policy is formally verified by construction. This approach could also be classified as restrictive and it exhibits the same limitation as those approaches, i.e., it is not able to solve incompatibilities, as it directly restricts the system to those execution traces which were already compatible.
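The wrapper behaviour of a security controller can be pictured with a small run-time filter. This is only an illustration of the interception idea: it does not use Crypto-CCS, µ-calculus or partial model checking, and the policy, the action names and the `Controller` class are hypothetical.

```python
from typing import Callable, List

# A policy decides, given the actions allowed so far, whether the next
# action may go through.
Policy = Callable[[List[str], str], bool]

def login_before_transfer(history: List[str], action: str) -> bool:
    """Assumed example policy: a transfer requires a previous login."""
    return action != "transfer" or "login" in history

class Controller:
    """Wrapper that intercepts every action of the process it guards."""

    def __init__(self, policy: Policy) -> None:
        self.policy = policy
        self.history: List[str] = []

    def intercept(self, action: str) -> bool:
        """Forward the action if the policy permits it, suppress it otherwise."""
        if self.policy(self.history, action):
            self.history.append(action)
            return True
        return False

controller = Controller(login_before_transfer)
first_attempt = controller.intercept("transfer")   # suppressed
controller.intercept("login")
second_attempt = controller.intercept("transfer")  # forwarded
```

As the text notes, such a controller only restricts behaviour: it can block a forbidden trace, but it cannot repair a mismatch between the guarded process and its partners.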

3.5 Comparison of adaptation approaches

A classification of the approaches described in the preceding sections is presented in Table 3.1. In order to compare them, we now present different criteria related to mismatch situations at different levels of interoperability.

Signature level

• (NM) Name mismatch: may occur when a service expects a different operation name than what is being offered.

• (AM) Argument mismatch: occurs when arguments have different orders or types in the operations, or a particular argument does not exist in the counterpart operation.

Behavioural level

• (MN) M-N correspondences: when a message on a particular interface corresponds to several others (or none) on the interface of its counterpart.

• (AR) Argument reorder: when the data arguments coming from a single message should be reordered and scattered among none, one or several messages.

• (MR) Message reordering: the order of operation invocations among the different protocols involved in the collaboration is not compatible.

• (DF) Deadlock freedom: this is preserved when it is guaranteed that every service is always capable of reaching a successful or final state.

• (MC) Multiparty collaborations: when operation correspondences are required in more than two different services, which may give rise to deadlock situations in the case that no counterpart for an operation is found in any service.

Security level

• (CC) Context change: corresponds to the service level (considering context information as a non-functional property), and occurs when the context information changes dynamically and the scope of change and the reaction of the system depending on certain situations are not properly controlled.

• (BE) Behaviour enforcement: it is possible to specify certain properties or behaviour which are enforced on part of the system (or the whole of it).

• (CR) Cryptography adaptation: cryptographic functions are used to receive or send messages, infer the knowledge of the attacker, or enact the security properties over the messages.

Semantic level

• (AL) Argument level: is handled whenever an ontology is used to infer semantic relationships and matching between service arguments.

• (OL) Operation level: similarly to the argument level, the semantic operation level deals with the semantic relationships and matching between service operations. In addition, the complementary direction (input, output) of the matched operations must be considered.

• (SL) Security level: this level is covered when the semantics behind the security properties of the system are considered during its composition and adaptation.

Table 3.1: Comparing contract-based service composition and adaptation. * Only in absence of active attackers

[Table body not recoverable: the individual marks (X, ∼, ∼*) have lost their row and column positions. Column groups: Signature (NM, AM); Behaviour (AR, MR, MN, MC, DF); Security (BE, CR, CC); Semantic (AL, OL, SL). Rows (Publication): [BP06], [AINT07], [NBM+07, NRXB10], [BBC05], [CPS08], [CCP10], [MPS08], [HKK06], [SSL+09], [MM10], Dynamic adaptation (Chapter 4), Adaptable discovery (Chapter 5), Contract generation (Chapter 6), Secure adaptation (Part III).]

Part II

Behavioural adaptation

All men make mistakes, but only wise men learn from their mistakes. Winston Churchill

4 Dynamic adaptation

The wide adoption of Web service standards has considerably contributed to simplifying the integration of heterogeneous applications both within and across enterprise boundaries. The languages to describe messaging (SOAP), functionalities (WSDL) and orchestration of services (BPEL) have been standardised, but the actual signatures and interaction protocols of services have not. For this very reason, service adaptation [BBG+06, BBC05, CPS06, YS97] remains one of the core issues for application integration in a variety of situations. Overcoming various types of mismatches between services developed by different parties, customising existing services to different types of clients, adapting legacy systems to meet new business demands, or ensuring backward compatibility of new service versions are typical examples of such situations. Various approaches have been proposed to adapt service signatures [DSW06], process behaviour [BP06], quality of service [HD07], security [MP11a] or service level agreements [NPKR07].

In this chapter we focus on signature and behaviour incompatibilities, whose occurrence can impede the very interoperability of services. Many signature and behaviour incompatibilities can be solved by applying existing (semi-)automated adaptation techniques. However, such techniques present two limitations: i) they require the signature and behaviour of the services to be known before their interaction starts, and ii) they are computationally expensive since they explore the whole interaction space in order to devise adaptors capable of solving any possible behavioural mismatch.

In this chapter we focus on the problem of dynamic adaptation in applications running on limited capacity devices, as in typical pervasive computing scenarios where (unanticipated) connections and disconnections of moving peers continuously occur.
Unfortunately, the limited computing, storage, and energy resources of such devices inhibit the applicability of most existing adaptation approaches. For instance, related work on adaptor synthesis [AINT07, CMS+09a, KNB+09, Pad09, PS07] is focussed on design time and, as a result, carries out an extensive analysis of the adaptor protocol, thus obtaining a complete and correct adaptor. However, these works assume that the behaviour of the services is known beforehand, and the synthesis of complete adaptors presents an exponential complexity with regard to the number and size of the services to be adapted. These requirements are not suitable for dynamic adaptation under our scenario of nodes with restricted capabilities where unforeseen services might appear. Nonetheless, we build on their concepts and show some similarities with their work. For instance, adaptation vectors in this work are similar to the adaptation operators presented in [DSW06] and to the mismatch patterns introduced in [KNB+09], but their approaches are focussed on design time.

There are few related approaches which aim to address both run-time and lightweight behavioural adaptation at the same time. One of them is [CCS09], where an ontology is required to generate a mapping between the operations of the services. Some properties (expressed in a temporal logic) are dynamically verified by performing forward-search analyses on the behaviour of services. While similar properties can be encoded with the FSM of our adaptation contract, [CCS09], contrary to our approach, requires the behaviour of services to be known and it has to bear the cost of the forward-search analysis. Wang et al. [WDOV08] propose the dynamic application of adaptation rules. These rules are triggered by the input actions received by the adaptor and then an output action is generated. Our approach is similar to theirs in the sense that we also apply the adaptation contract dynamically without generating the whole adaptor. However, their rules must specify how to solve both signature and behavioural incompatibilities, hence requiring prior knowledge of the behaviour of the services.
Our contracts, in contrast, only specify how to solve signature incompatibilities and an optional description of the adaptation goal. So, our adaptors dynamically learn how to solve behavioural incompatibilities. Another related approach is [Hol10], where the problem of controlling services with unknown behaviour is discussed. Intuitively, this work shares with our approach the idea of progressively refining an over-approximated controller when failures occur. The authors of [Hol10] perform said refinement by exploiting (bounded) model checking, whose overhead is not bearable in applications running on limited capacity devices. We present a lightweight adaptive approach to the adaptation of services that is capable of overcoming signature and behaviour mismatches that would otherwise impede service interoperation. The approach is lightweight in the sense that it requires low computing and storage capabilities to run.

The adaptation is governed by adaptation contracts that specify in a declarative way the set of interaction traces that are allowed. Adaptation contracts specify how to solve signature incompatibilities, but they do not necessarily require behavioural information (e.g., the partial order with which service operations are offered or requested) to be known a priori. Actually, as we will see, the behaviour of the services to be adapted can even change during the lifespan of an adaptor. The adaptation process is itself adaptive in the sense that an initial (possibly the most liberal) adaptor behaviour is progressively refined at run-time by learning the behaviour of the services from failures that may occur during service interactions. Roughly speaking, the adaptor initially allows all the interactions that satisfy the current adaptation contract. If an interaction session between the services fails w.r.t. the contract, the adaptor memorises the interaction trace that led to the failure in order to inhibit it in subsequent sessions. Intuitively speaking, the adaptor refines its behaviour based on previous failures so as to converge on allowing only deadlock-free interactions between the services. Learning and inhibiting erroneous traces tackles permanent failures, for instance, a behavioural incompatibility which leads the system to a deadlock situation, or a hardware malfunction (maybe due to low battery) which disables part of the functionality. In addition, communications in pervasive computing can be unstable due to changes in the environment. For instance, shadow fading [KqL99, BH05], where messages might be lost due to the presence of possibly moving obstacles, has a profound impact on the reliability of communication channels. We propose several learning policies which tackle these sporadic errors.
Inhibited traces learned by the adaptor are eventually forgotten so that the adaptor can re-adapt itself to drastic changes in service functionality, temporary changes in the environment or sporadic communication failures.

As one may expect, the results of the refinement performed by this adaptive adaptation approach are particularly interesting when the process starts with a non-empty adaptation contract. Nonetheless, the approach can overcome message ordering mismatches [KNB+09] also in the extreme situation in which no such adaptation contract is available, using zero-knowledge adaptation. When compared with the few other existing proposals of lightweight behaviour adaptation of services, such as [CCS09], our approach features the important advantage of requiring just an adaptation contract based on the service signatures; it does not require knowing the interaction behaviour of the services that need adaptation. In other words, the adaptor is not synthesised at design time; instead, it is directly deployed with no other information than an adaptation contract and it will successively learn the behaviour of the services and how to solve their behavioural incompatibilities. As regards the complexity in time and space of learning adaptors, these only depend on the size and structure of the adaptation contract.

[Figure 4.1: A TinyDiffusion node intermediating between two SPIN nodes. (a) SPIN node: may be interested in requesting info. (b) TinyDiffusion node with adaptor (adaptation contract): it either intermediates or consumes the info. (c) SPIN node: advertises and provides info.]

Main contributions: A lightweight and dynamic learning adaptor that, unlike related work, does not need to know beforehand the behaviour of the services to adapt and, indeed, does not need to be synthesised at all. Instead, the adaptation contract which characterises the adaptor is directly deployed and it dynamically learns from service interactions so as to converge on correct adaptors.

The structure of the chapter is the following. We present a real-world example based on Wireless Sensor and Actuator Networks (WSANs) in Section 4.1. Then we present a lightweight adaptive approach to dynamic service adaptation, called learning adaptors, in Section 4.2. We describe several learning policies for these adaptors in Section 4.3. Then we proceed to evaluate the implementation with an example in Section 4.4. Finally, we conclude in Section 4.5 with an alternative without explicit contracts called zero-knowledge adaptation.

4.1 A WSAN example: SPIN and TinyDiffusion

We now present an example inspired by two real-world data-diffusion protocols for sensor networks: TinyDiffusion [MGO+03] and Sensor Protocols for Information via Negotiation (SPIN) [HKB99] (Fig. 4.1).

Table 4.1: Signature of SPIN and TinyDiffusion

SPIN: adv?int, adv!int, req?int, req!int, data!info, data?info, end!

TinyDiffusion: interest!int, interest?int, data?info, data!info, end!

Example 4.1.1 Table 4.1 shows the signature of SPIN and TinyDiffusion. On the one hand, TinyDiffusion can be understood as a pull protocol, which is briefly described as follows. First, a participating node notifies the rest of the network (by message flooding) of its interest in certain information identified by argument int. This interest is stored in each of the nodes which received it and, when any of these nodes generates information matching the interest (data), it sends that information to the node it first received the interest from. Following this chain, the information finally reaches the node which was initially interested in it.

On the other hand, SPIN is a push protocol. It builds on the assumption that control messages are smaller, and hence less expensive, than data messages. SPIN nodes do not notify their interest; instead, they advertise newly generated information via adv!int to their direct neighbours. Each interested neighbour then requests the information via req!int, receives it through data?info and finally re-advertises it to its own neighbours, excluding the one which provided the information the first time. This last re-advertisement step is optional and it depends on the current battery level.

For our example, we assume that some TinyDiffusion nodes have been deployed among SPIN nodes. Concretely, we consider a TinyDiffusion node (service b) which can either request information from, or intermediate between, two SPIN nodes (services a and c).

Example 4.1.2 Knowing the signature and functionality of the services involved, we can design the following adaptation contract (Fig. 4.2). The contract matches operations and data between services with vectors (Fig. 4.2(a)), and it specifies that a successful session is one where all three services reach a final state (Fig. 4.2(b)). For this example we have arbitrarily decided to place the adaptor with service a on the left-hand side and services b and c on the right-hand side.


Σc = {
  a:adv!I ♦               (va)
  a:req?I ♦               (vr)
  a:data!D ♦              (vd)
  a:end! ♦                (vea)
  ♦ b:interest?I          (vb)
  ♦ b:interest!I          (vi)
  ♦ b:data?D              (vt)
  ♦ b:data!D              (vt')
  ♦ b:end!                (veb)
  ♦ c:adv?I               (vc)
  ♦ c:req!I               (ve)
  ♦ c:data?D              (vd')
  ♦ c:end!D               (vec)
}

(a) Adaptation vectors

[FSM with transitions, from the initial to the final state: self-loop Σc \ {vec}, then {vec}; self-loop Σc \ {veb}, then {veb}; self-loop Σc \ {vea}, then {vea}.]

(b) FSM

Figure 4.2: An adaptation contract

We have used a parameter I to receive (and later send) an interest. For instance, vector va receives the interest on I from service a. Parameter D is used to move the data around between the services. It is worth noting that, due to the pull-push contradiction between SPIN and TinyDiffusion and the internal choices of the services, it was not possible to use two-sided vectors such as a:data!D ♦ b:data?D. Now, regarding the automaton of the contract, we have kept it simple and it only states that, for the adaptor to be in a final state, it must have received the end messages of services c, b and a of Fig. 4.3. These messages could be received in any order but, for simplicity, we arbitrarily forced the order c, b, and a.
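One possible encoding of the contract of Fig. 4.2 is sketched below. The data layout, the use of `None` for the ♦ placeholder and the integer-numbered FSM states are assumptions made for this illustration; only the vectors and the c, b, a ordering of the end messages come from the figure and the text.

```python
from typing import Dict, List, Optional, Tuple

# Vector name -> (action on the side of service a, action on the b/c side).
# None stands for the placeholder (the diamond) of a one-sided vector.
VECTORS: Dict[str, Tuple[Optional[str], Optional[str]]] = {
    "va": ("a:adv!I", None), "vr": ("a:req?I", None),
    "vd": ("a:data!D", None), "vea": ("a:end!", None),
    "vb": (None, "b:interest?I"), "vi": (None, "b:interest!I"),
    "vt": (None, "b:data?D"), "vt'": (None, "b:data!D"),
    "veb": (None, "b:end!"),
    "vc": (None, "c:adv?I"), "ve": (None, "c:req!I"),
    "vd'": (None, "c:data?D"),
    "vec": (None, "c:end!D"),  # label as written in Fig. 4.2
}

# The FSM expects the end vectors of services c, b and a, in that order.
END_ORDER: List[str] = ["vec", "veb", "vea"]

def step(state: int, vector: str) -> Optional[int]:
    """FSM transition: advance on the awaited end vector, loop on the rest."""
    if vector not in VECTORS or state >= len(END_ORDER):
        return None  # unknown vector, or nothing is allowed after the end
    return state + 1 if vector == END_ORDER[state] else state

def accepts(trace: List[str]) -> bool:
    """True when the trace of vectors leaves the contract in its final state."""
    state: Optional[int] = 0
    for vector in trace:
        state = step(state, vector)
        if state is None:
            return False
    return state == len(END_ORDER)
```

For example, `accepts(["va", "vb", "vec", "veb", "vea"])` holds, whereas a trace where the end messages arrive in the order a, b, c does not reach the final state.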

In the following sections we will describe how we dynamically learn the concrete behaviour of the adaptor based on a contract (such as the one in Fig. 4.2) without knowing the actual behaviour of the services. Our dynamic adaptors will analyse the messages exchanged at run time and they will use that knowledge to learn the behaviour of the services and, more importantly, their behavioural incompatibilities. Additionally, as soon as these incompatibilities are recognised, learning adaptors control the services so as to avoid such incompatibilities henceforth. In order to better understand the behavioural incompatibilities between the services, we proceed to present their actual behaviour. However, it is worth clarifying that this information is unknown to the dynamic adaptor and will not be used for the learning process.

(a) Service a. A SPIN protocol which expects info

(b) Service b. An intermediary TinyDiffusion protocol

(c) Service c. A SPIN protocol which provides info

Figure 4.3: Instances of SPIN and TinyDiffusion protocols

Example 4.1.3 Figure 4.3(b) shows the behaviour of a TinyDiffusion node which can either start the protocol or intermediate on behalf of some other TinyDiffusion node. The internal decision on which of these actions to perform is represented by a τ transition. Figure 4.3(c) shows the behaviour of a SPIN node which generates and advertises some information, whereas the behaviour of a receiving SPIN node (which does not re-advertise and may or may not be interested, depending on an internal choice) is shown in Fig. 4.3(a). We make explicit the end of the behaviour of each of the services with an output action end!. In addition, all their actions have been prefixed with their identifications (i.e., a, b or c) so that we can distinguish between the actions of the two different SPIN nodes.

If we knew the behaviour of the services beforehand (Fig. 4.3), other approaches could be used at design time to statically synthesise the behaviour of the adaptor. The static synthesis process entails exponential complexity. The approach presented in this chapter is able to learn, after sufficient interactions with the services, how to mimic the behaviour of the adaptors that would have been statically synthesised.

Figure 4.4: Static most-general adaptor compliant with the contract and services in Fig. 4.3

Example 4.1.4 The most general adaptor compliant with the contract (Fig. 4.2) and the services (Fig. 4.3) is shown in Fig. 4.4. This FSM has 70 states and 170 transitions. For the sake of clarity, actions have been reduced to their underlined letters in the contract and have been prefixed with the identification of the communicating service. This is a complete and correct adaptor that could be statically synthesised having as input the behaviour of the services and the adaptation contract. Our learning adaptors only have access to the latter, but they will dynamically converge to behave in the same manner as the adaptor shown in Fig. 4.4.

4.2 Learning adaptors

Our proposal is to avoid the costly synthesis process and to directly deploy an adaptor with no other information than the adaptation contract. Then, the adaptor will dynamically learn the behaviour and the incompatibilities of the services and how best to resolve them. The approach is to initially support every communication allowed by the intensional semantics of the contract without any guarantee of the successful termination of the current session. The adaptor learns which sessions ended correctly and, on failures, it will forbid the last communication which led to the failure. The goal is to make this process converge on the most general adaptor which complies with the adaptation contract and the given services. However, depending on the contract and the services (they might not be controllable due to their internal choices), it is possible that no such adaptor exists. In this case, the process will converge on an empty adaptor (a single initial state with no transitions) where no communication is allowed.

We are going to build on top of the transition system which represents the intensional semantics of an adaptation contract ($\overset{x}{\hookrightarrow}_c$ in Section 2.2) to define the following transition system $\xrightarrow{x}$, which models the way in which an adaptor wraps the service it adapts and interacts with the rest of the environment.

An adaptor wrapping a service according to an adaptation contract $c$ is denoted in the transition system $\xrightarrow{x}$ by a term of the form

\[ \langle A, I, t \rangle_c [P] \]

where $A$ is a set of pairs $\langle s, \Delta \rangle$ ($s$ is a state of the contract and $\Delta$ the multiset of pending actions that it should eventually perform), $I$ is a sequence of inhibited traces that have previously led to unsuccessful interactions according to what the adaptor has learned so far, $t$ is the trace of actions executed so far by the adaptor during the current interaction session, $c$ is the adaptation contract and $P$ is the current state of the service being adapted (which is not known by the adaptor). An adaptor at the beginning of a session is denoted by $\langle A_0, I, \lambda \rangle_c [P]$ where $A_0 = \{\langle s^c_0, \emptyset \rangle\}$ and $\lambda$ is the empty trace. If the adaptor has not learned anything yet, then $I$ is empty.

In general, $I$ can be modelled as a set, a sequence or a tree. Independently of its implementation, we will write $I \in t$ to denote that trace $t$ is inhibited by $I$. When no trace is inhibited we say that $I = \emptyset$. For the sake of simplicity, in this document we will consider $I$ as a sequence of inhibited traces. Each of these traces is a sequence of communication actions which range over $\Sigma_a^*$ ($\Sigma_a$ being the set of labels of $\overset{x}{\hookrightarrow}_c$). We will denote by $t \cdot a$ the sequence obtained by appending element $a$ to sequence $t$, by $a.t$ the sequence obtained by prefixing element $a$ to sequence $t$, and by $t :: t'$ the sequence obtained by concatenating sequences $t$ and $t'$. We will also say that sequence $t$ is a prefix of $t :: t'$, where both $t$ and $t'$ can be empty, with $\lambda$ denoting the empty sequence. A natural way to define the inhibition of traces is given by $I \in t$ iff $\exists I_1, I_2 \,.\, I = (I_1 \cdot t) :: I_2$.

Rules EXT and INT describe the steps that the adaptor can take by offering a communication to the external environment and by interacting with the service it wraps, respectively.

\[
\frac{A \overset{|a}{\hookrightarrow}_c A' \;\wedge\; I \notin t \cdot |a}
     {\langle A, I, t \rangle_c [P] \xrightarrow{a} \langle A', I, t \cdot |a \rangle_c [P]}
\quad (\textsc{Ext})
\]

\[
\frac{A \overset{b|}{\hookrightarrow}_c A' \;\wedge\; P \xrightarrow{b} P' \;\wedge\; I \notin t \cdot b|}
     {\langle A, I, t \rangle_c [P] \xrightarrow{\tau} \langle A', I, t \cdot b| \rangle_c [P']}
\quad (\textsc{Int})
\]
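The side condition shared by both rules (the extended trace must not be inhibited) can be sketched as follows. This is a minimal illustration, not the ITACA code: traces are assumed to be tuples of action labels, `contract_steps` is a hypothetical mapping from the labels enabled by the contract to successor adaptor states, and inhibition is taken to be plain membership of the trace in $I$, following the definition above.

```python
def inhibits(inhibited, trace):
    """I inhibits t iff t occurs in the sequence I (the definition above)."""
    return tuple(trace) in {tuple(u) for u in inhibited}


def offered_actions(contract_steps, inhibited, trace):
    """Keep only the contract steps whose extended trace is not inhibited.

    contract_steps: dict mapping an action label to the successor adaptor
    state allowed by the contract (hypothetical representation).
    """
    trace = tuple(trace)
    return {
        action: successor
        for action, successor in contract_steps.items()
        if not inhibits(inhibited, trace + (action,))
    }
```

With `contract_steps = {"|a": "A1", "|b": "A2"}` and `inhibited = [("|a",)]`, only `"|b"` is offered from the empty trace, while both actions remain available after any other prefix.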

Note that the communications offered by the adaptor only depend on the current state of the adaptor, not on the other services. Rule INT models synchronisations between the adaptor and the service to be adapted as silent actions $\tau$, since such interactions are not visible to the external environment. The internal steps independently taken by the wrapped service are also modelled as silent actions (TAU). Rules SYN and PAR model (commutative) parallel composition between services and adaptors with synchronous communications in the standard way, as introduced in Section 2.3.1:

\[
\frac{P \xrightarrow{\tau} P'}
     {A[P] \xrightarrow{\tau} A[P']}
\quad (\textsc{Tau})
\]

\[
\frac{P \xrightarrow{a} P' \;\wedge\; Q \xrightarrow{a} Q'}
     {P \otimes Q \xrightarrow{\tau} P' \otimes Q'}
\quad (\textsc{Syn})
\]

\[
\frac{P \xrightarrow{a} P'}
     {P \otimes Q \xrightarrow{a} P' \otimes Q}
\quad (\textsc{Par})
\]

By rule OK, an adaptor can consider an interaction session successfully terminated when it is in a final state of the adaptation contract and there are no more pending communications to perform. Let $OK_c = \{\langle s, \emptyset \rangle \mid s \in F_c\}$ be the set of states which are considered successful terminations of the adaptor behaviour.

\[
\frac{A \cap OK_c \neq \emptyset \;\wedge\; A_0 = \{\langle s^c_0, \emptyset \rangle\}}
     {\langle A, I, t \rangle_c [P] \xrightarrow{ok(t)} \langle A_0, I, \lambda \rangle_c [P]}
\quad (\textsc{Ok})
\]

Rule LEARN describes how an adaptor can autonomously decide, after a timed wait, to inhibit the trace corresponding to an interaction session that has not (yet) successfully terminated.

\[
\frac{A \cap OK_c = \emptyset \;\wedge\; A_0 = \{\langle s^c_0, \emptyset \rangle\}}
     {\langle A, I, t \rangle_c [P] \xrightarrow{add(t, I)} \langle A_0, add(t, I), \lambda \rangle_c [P]}
\quad (\textsc{Learn})
\]

Note that rule LEARN does not constrain the way in which timed waits will actually be realised in the underlying implementation. From the viewpoint of the external environment, a learning step made by the adaptor is an internal action of the latter which may take place at virtually any moment. In Section 4.3 we will show different definitions of $add(t, I)$ that can be employed to define different learning policies for rule LEARN. The simplest definition of $add$ consists of appending the new trace to the sequence of previously learned traces, i.e., $add_0(t, I) = I \cdot t$.

Note also that rules OK and LEARN specify that the adaptor will be restarted (to its initial state $A_0$) when it detects the successful termination of an interaction session or when it performs a learning step. Rules OK and LEARN do not enforce an immediate restart of the wrapped service $P$ and of the service $Q$ interacting with $P$ through the adaptor in a configuration $Q \mid \langle A, I, t \rangle_c [P]$. The restart of $P$ and $Q$ can be autonomously performed by $P$ and $Q$ (with a timeout, for instance). Alternatively, it can be triggered by the adaptation contract itself, which can include explicit restart messages.

A natural assumption on the services deployed in limited-capacity devices is that their behaviour is bounded in length. This does not necessarily mean that the services will expire; instead, it means that interactions consist of finite sessions that can be run over and over again. In the sequel we assume bounded services whose behaviour consists of a finite set of finite-length traces.

Informally, we say that a learning function $add$ is monotonic if $add(t, I)$ inhibits (when used in rules EXT and INT) all the traces inhibited by $I$. In order to formalise this notion of monotonicity, we introduce the set of traces prefixed by a sequence $I$ as follows:

\[ prefixed(I) = \{\, t :: u \mid I = (I_1 \cdot t) :: I_2 \;\wedge\; u \in \Sigma_a^* \,\} \]
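The basic policy $add_0$ and membership in $prefixed(I)$ can be sketched in a few lines. This is an illustrative sketch under the assumption that traces and sequences of traces are Python tuples; the helper `is_monotonic_step` is hypothetical and only checks the monotonicity condition on a finite sample of traces, so it can refute monotonicity but not prove it.

```python
def add0(t, inhibited):
    """Basic policy: append the new trace to the learned sequence."""
    return tuple(inhibited) + (tuple(t),)


def in_prefixed(trace, inhibited):
    """True iff some inhibited trace is a prefix of `trace`."""
    trace = tuple(trace)
    return any(trace[:len(u)] == tuple(u) for u in inhibited)


def is_monotonic_step(t, before, after, sample):
    """Check one learning step on a finite sample of traces.

    The step is accepted if every sampled trace covered by `before` is
    still covered by `after`, and the learned trace t is covered as well.
    """
    keeps_old = all(
        in_prefixed(u, after) for u in sample if in_prefixed(u, before)
    )
    return keeps_old and in_prefixed(t, after)
```

For instance, extending `(("a",),)` with `add0(("b", "c"), ...)` passes the check on any sample, whereas replacing the whole sequence by `(("b",),)` fails it as soon as the sample contains `("a",)`.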

Definition 4.2.1 (Monotonic learning function) A learning function $add$ is monotonic if $add(t, I)$ is a monotonic extension of $I$ and $t \in prefixed(add(t, I))$, for each $t$ and $I$. We say that $add(t, I)$ is a monotonic extension of $I$ ($I \sqsubseteq add(t, I)$) if $prefixed(I) \subseteq prefixed(add(t, I))$. We say that $add(t, I)$ is a proper monotonic extension of $I$ ($I \sqsubset add(t, I)$) if $prefixed(I) \subset prefixed(add(t, I))$.

Obviously, the $\sqsubseteq$ relationship defined on sequences of traces is a pre-order. We now establish that the adaptation process converges if a monotonic learning function $add$ is employed in rule LEARN to adapt bounded services.

Proposition 4.2.1 (Convergence) Let $S$ and $P$ be two bounded services, $A_0$ be an adaptor for contract $c$ in its initial state $A_0 = \{\langle s^c_0, \emptyset \rangle\}$, and $I_0$ be a (possibly empty) sequence of inhibited traces. If the adaptor employs a monotonic learning function, then there exists a sequence $I_0, I_1, \ldots, I_n$, with a finite $n \geq 0$, such that:

1. $\forall j \in [0, n) \;\exists S', P' \,.\; S \mid \langle A_0, I_j, \lambda \rangle_c [P] \xrightarrow{\tau}{}^{*} \xrightarrow{I_{j+1}} S' \mid \langle A_0, I_{j+1}, \lambda \rangle_c [P']$ with $I_j \sqsubseteq I_{j+1}$, and

2. $\nexists S', P', I_{n+1} \,.\; S \mid \langle A_0, I_n, \lambda \rangle_c [P] \xrightarrow{\tau}{}^{*} \xrightarrow{I_{n+1}} S' \mid \langle A_0, I_{n+1}, \lambda \rangle_c [P']$ with $I_n \sqsubset I_{n+1}$.

The previous proposition shows that the training process with bounded services is finite and that it always converges on a sequence of inhibited traces $I_n$. We call such a sequence a complete sequence of inhibited traces for $S$ and $P$. Now, to establish the correctness of our proposal, we prove that an adaptor with a complete sequence of inhibited traces $I_n$ always leads the interacting services to successful states of the contract ($OK_c$) while avoiding locks.

Proposition 4.2.2 (Correctness) Let $S$ and $P$ be two bounded services, and $A_0$ be an adaptor for contract $c$ in its initial state $A_0 = \{\langle s^c_0, \emptyset \rangle\}$. If the adaptor employs a monotonic learning function, and $I$ is a complete sequence of inhibited traces, then for every $S'$, $A'$, $t'$ and $P'$ such that

\[ S \mid \langle A_0, I, \lambda \rangle_c [P] \xrightarrow{\tau}{}^{*} S' \mid \langle A', I, t' \rangle_c [P'] \]

where $A' \neq A_0$, there exists a sequence of $\tau$ transitions

\[ S' \mid \langle A', I, t' \rangle_c [P'] \xrightarrow{\tau}{}^{*} S'' \mid \langle A'', I, t'' \rangle_c [P''] \]

such that $A'' \cap OK_c \neq \emptyset$.

Note that Proposition 4.2.2 excludes the particular case of the empty adaptor (since $A' \neq A_0$) for two reasons: i) the adaptor cannot guide the system if it does not participate in its communications; and ii) if there is no correct adaptor for the current services then the learning adaptor converges on the empty adaptor. Proposition 4.2.2 is particularly interesting in those cases where the adaptation contract guarantees that the services have successfully terminated, i.e., those in which $S''$ and $P''$ are also final states of their respective services. This happens in our running example because the contract automaton (Fig. 4.2(b)) is aware of the ending of the services due to vectors $v_{ec}$, $v_{eb}$ and $v_{ea}$.

It is worth noting that the sequence $\{I_i\}_{i \in \{0, \ldots, n\}}$ of inhibited traces derived from Proposition 4.2.1 could be different for each run-time session. In this way, different learning iterations may lead to different complete sequences of inhibited traces. Thus, we need to establish that the learning process is well defined, in the sense that its result does not depend on the execution. The following proposition states this result.

Proposition 4.2.3 (Well-definedness) Let $S$ and $P$ be the initial states of two bounded services. Let us consider an adaptation contract $c$ which corresponds to an adaptor with an initial state $A_0$ and a monotonic learning function. If $I$ and $I'$ are complete sequences of inhibited traces resulting from a learning process starting in $S \mid \langle A_0, I_0, \lambda \rangle_c [P]$, then

\[ I \sqsubseteq I' \quad \text{and} \quad I' \sqsubseteq I \]

4.3 Learning policies

We now show how different definitions of add(t, I) can be employed to define different learning policies for rule LEARN.

4.3.1

Bounded learning

An upper bound on the number of traces that are inhibited by an adaptor at any given time may be set for different reasons. The most common is memory capacity, which may limit the size of the learned information that can be kept in memory. To respect such a limit, adaptors may need to forget some previously inhibited traces when learning a new trace to be inhibited. A simple bounded learning policy is to forget (if needed) the oldest learned trace when learning a new one:

\[
add_1(t, I) =
\begin{cases}
J : t & \text{if } outOfBound(I : t, \beta) \text{ and } I = u.J \\
I : t & \text{otherwise}
\end{cases}
\]

where $outOfBound(I : t, \beta)$ holds if the size of $I : t$ exceeds the maximum allowed size $\beta$.¹ Other types of bounded learning policies can be implemented by defining different $outOfBound$ boundedness conditions (e.g., on the number of traces rather than on their size) and/or by choosing differently which trace(s) to forget (e.g., one of the longest traces rather than the oldest one). For instance:

\[
add_1'(t, I) =
\begin{cases}
del(\{u\}, I) : t & \text{if } outOfBound(I : t, \beta) \text{ and } u \in longest(I) \\
I : t & \text{otherwise}
\end{cases}
\]

where $longest(I) = \{u \in \Sigma_a^* \mid I \in u \text{ and } \nexists t \in \Sigma_a^* \,.\, I \in t \text{ and } |t| > |u|\}$ and $del$ is recursively defined as follows:

\[
del(D, I) =
\begin{cases}
del(D, J) & \text{if } I = u.J \text{ and } u \in D \\
u.del(D, J) & \text{if } I = u.J \text{ and } u \notin D \\
\lambda & \text{if } I = \lambda
\end{cases}
\]

4.3.2

Prefix-driven absorption

The way in which adaptors forget inhibited traces affects the overall performance of learning adaptors as much as the way in which they learn them. While bounded learning policies indirectly define a (boundedness-determined) forget policy, trace prefixing can be exploited to intentionally define a forget policy to shrink the size of the learned information. Intuitively speaking, the inhibition of a trace $t$ which is a prefix of a previously inhibited trace $t :: u$ subsumes (by rules EXT and INT) the inhibition of the latter, which hence does not need to be explicitly stored with the inhibited traces anymore. A learning policy based on prefix-driven absorption can be easily specified by defining the $add$ function as:

\[ add_2(t, I) = del(prefixedBy(t, I), I) : t \]

where $prefixedBy(t, I) = \{u \in \Sigma_a^* \mid I \in u \text{ and } \exists v \in \Sigma_a^* \,.\, u = t :: v\}$ is the set of traces in $I$ that are prefixed by $t$.

¹ Since boundedness conditions are often application- and device-dependent, bounded learning policies are parameterised w.r.t. the maximum allowed size β.

It is worth observing that different learning policies can be combined. For instance, prefix-driven absorption and bounded learning policies can be naturally combined into a single policy as follows.

\[
add_{2+1}(t, I) =
\begin{cases}
J & \text{if } outOfBound(I', \beta) \text{ and } I' = u.J \\
I' & \text{otherwise}
\end{cases}
\]

where $I' = add_2(t, I)$.

It is also worth observing that prefix-driven absorption can be exploited to identify temporary failures not due to service protocol incompatibilities. To do that we must distinguish "simple" prefixes from "non-simple" prefixes. We say that $t$ is a simple prefix of $t :: u$ if $u$ contains only one element. The normal learning process of an adaptor may inhibit a simple prefix $t$ of a previously inhibited trace $t : a$ whenever the adaptor realises that there is no alternative extension of $t$. On the other hand, the inhibition of a trace $t$ which is a non-simple prefix of a previously inhibited trace $t :: u$ might be caused by some temporary failure that intervened (e.g., physical communication problems such as shadow fading or increased physical distance). The detection of temporary failures can be exploited to define refined prefix-driven absorption policies that maintain the set $I_T$ of traces learned from temporary failures separate from the set $I_P$ of traces learned from (supposedly) permanent failures², such as the following definition of $add_3$. Let us define a one-prefixed-by function, $opb(t, I) = \{t \cdot \alpha \mid I \in t \cdot \alpha\}$, as the set of traces $t \cdot \alpha$ in $I$ which are prefixed by $t$ and which are only one element longer than $t$. Then:

\[
add_3(t, \langle I_P, I_T \rangle) =
\begin{cases}
\langle del(J, I_P) : t,\; del(J, I_T) \rangle & \text{if } opb(t, I_P) \neq \emptyset \text{ and } J = prefixedBy(t, I_P :: I_T) \\
\langle del(K, I_P),\; del(K, I_T) : t \rangle & \text{if } opb(t, I_P) = \emptyset \text{ and } prefixedBy(t, I_P :: I_T) = K \text{ and } K \neq \emptyset \\
\langle I_P : t,\; I_T \rangle & \text{otherwise}
\end{cases}
\]
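The prefix-driven policies of this section can be sketched as follows, as an illustration rather than the toolbox implementation. The sketch assumes that traces and sequences of traces are tuples, and the Python names (`delete`, `prefixed_by`, `opb`, `add2`, `add3`) are chosen for the example; `delete` stands for $del$, which is written here as a filter instead of the recursive definition.

```python
def delete(drop, inhibited):
    """del(D, I): remove every trace of D from I, preserving the order."""
    return tuple(u for u in inhibited if u not in drop)


def prefixed_by(t, inhibited):
    """Traces of I that have t as a (possibly improper) prefix."""
    return {u for u in inhibited if u[:len(t)] == t}


def opb(t, inhibited):
    """Traces of I prefixed by t and exactly one action longer than t."""
    return {u for u in inhibited
            if len(u) == len(t) + 1 and u[:len(t)] == t}


def add2(t, inhibited):
    """Prefix-driven absorption: t replaces every trace it is a prefix of."""
    return delete(prefixed_by(t, inhibited), inhibited) + (t,)


def add3(t, permanent, temporary):
    """Keep supposedly permanent and temporary failures in separate lists."""
    absorbed = prefixed_by(t, permanent + temporary)
    if opb(t, permanent):
        # t is a simple prefix of a permanent trace: regular learning step.
        return delete(absorbed, permanent) + (t,), delete(absorbed, temporary)
    if absorbed:
        # t is only a non-simple prefix: treat it as a temporary failure.
        return delete(absorbed, permanent), delete(absorbed, temporary) + (t,)
    return permanent + (t,), temporary
```

For example, learning `("a",)` when `("a", "b")` is stored as permanent keeps the new trace in the permanent list, whereas learning it when only `("a", "b", "c")` is stored moves the knowledge to the temporary list.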

Combined policies whose bounded learning and/or time-to-forget components forget traces corresponding to temporary failures first can be easily defined. For instance, let $\langle I_P', I_T' \rangle = add_3(t, \langle I_P, I_T \rangle)$, then:

\[
add_{3+1}(t, \langle I_P, I_T \rangle) =
\begin{cases}
\langle I_P', J_T \rangle & \text{if } outOfBound(I_P' :: I_T', \beta) \text{ and } I_T' = u.J_T \\
\langle J_P, \emptyset \rangle & \text{if } outOfBound(I_P' :: I_T', \beta) \text{ and } I_T' = \emptyset \text{ and } I_P' = u.J_P \\
\langle I_P', I_T' \rangle & \text{otherwise}
\end{cases}
\]

² Rules EXT and INT trivially extend to the case in which $I$ is modelled as a pair $\langle I_P, I_T \rangle$, viz., by turning $I \notin t$ into $(I_P :: I_T) \notin t$.

4.3.3

Reset on empty adaptors

The aforementioned learning policies aim to reduce the memory requirements ($add_1$ and $add_2$) and to mitigate sporadic errors ($add_3$). In particular, the main problem of the basic learning policy ($add_0$) with sporadic errors (unforeseen failures in the synchronisations due to instabilities in the communication channels) is that it tends to converge on the empty adaptor. This happens because $add_0$ does not forget the traces inhibited due to these sporadic errors and, as a result, the adaptor behaviour is reduced each time one of these errors occurs. A straightforward solution to this issue is to recognise when the process has converged on the empty adaptor and then reset the inhibited traces so that the adaptor can converge on better solutions. This is formalised with the following function.

\[
add_4(t, I) =
\begin{cases}
\lambda & \text{if } t = \lambda \\
I \cdot t & \text{otherwise}
\end{cases}
\]

Intuitively, $add_4$ behaves as $add_0$ when the adaptor is not empty. If it becomes empty, and this is not considered valid by the given contract, then the only rule that can be triggered is rule LEARN inhibiting the empty trace ($t = \lambda$), since no synchronisation is possible with the empty adaptor. When this happens, function $add_4$ clears the inhibited traces so that the adaptor can synchronise again. As usual, $add_4$ can be combined with other learning policies, e.g.:

\[
add_{4+2+1}(t, I) =
\begin{cases}
\lambda & \text{if } t = \lambda \\
add_{2+1}(t, I) & \text{otherwise}
\end{cases}
\]
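To make the combinations concrete, the bounded, absorbing and resetting policies can be sketched together. This is a hedged illustration: traces and sequences are assumed to be tuples, and `out_of_bound` is assumed to count the number of stored traces, which is only one of the boundedness conditions the text allows. The Python names are chosen for the example.

```python
def out_of_bound(inhibited, beta):
    """Assumed bound: the number of stored traces must not exceed beta."""
    return len(inhibited) > beta


def add1(t, inhibited, beta):
    """Bounded learning: forget the oldest trace when the bound is hit."""
    extended = inhibited + (t,)
    if out_of_bound(extended, beta) and inhibited:
        return inhibited[1:] + (t,)
    return extended


def add2(t, inhibited):
    """Prefix-driven absorption: drop the traces that t is a prefix of."""
    kept = tuple(u for u in inhibited if u[:len(t)] != t)
    return kept + (t,)


def add2_1(t, inhibited, beta):
    """Absorb first, then forget the oldest trace if still out of bound."""
    absorbed = add2(t, inhibited)
    return absorbed[1:] if out_of_bound(absorbed, beta) else absorbed


def add4(t, inhibited):
    """Reset: learning the empty trace clears everything learned so far."""
    return () if t == () else inhibited + (t,)


def add4_2_1(t, inhibited, beta):
    """Reset on the empty trace, otherwise behave as add2_1."""
    return () if t == () else add2_1(t, inhibited, beta)
```

With a bound of 2, `add1(("c",), (("a",), ("b",)), 2)` forgets `("a",)`, and learning the empty trace with `add4` or `add4_2_1` always returns the empty sequence, which is what lets the adaptor synchronise again.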

It is easy to prove that $add_i$, $i \in \{0, 2, 3\}$ (and their combinations) are monotonic by Definition 4.2.1, whereas $add_1$ and $add_4$ are (deliberately) not. We will see in Section 4.4 that, although non-monotonic learning policies do not necessarily converge, they have the advantage of overcoming sporadic errors with high success rates.


4.4 Evaluation and tool support: ITACA

Learning adapters have been implemented in a prototype and included in the ITACA toolbox. This toolbox is implemented in Python 2.7³. Additionally, the implementation of learning adapters has been published as a standalone application at: https://github.com/jamartinb/adaptor

We have evaluated the validity of our approach with two real-world data-diffusion protocols for sensor networks: TinyDiffusion [MGO+ 03] (based on [HSI+ 01]) and SPIN [HKB99] (see Section 4.1). In this experiment, a TinyDiffusion node was adapted to participate and intermediate in the communication between two SPIN nodes. We gathered various statistics by simulating this real-world example. In these simulations, random traces are simulated one by one, the traces which are stuck in non-final adapter states (those not in $OK_c$) trigger rule LEARN, and the gathered data is analysed after a certain interval of traces. The number of simulated traces is shown on the horizontal axis. These simulations are repeated 10 times in order to plot their arithmetic mean and the sample standard deviation.

Different learning mechanisms are compared. Line reg corresponds to a regular adaptor using $add_2$. Line dthr represents an adaptor using $add_{4+2+1}$ with a dynamic threshold $\beta \in \mathbb{N}$, initially set to 0, which is incremented each time rule OK is used and decremented each time rule LEARN is used. We denote by athr the adaptive adapter, which also uses $add_{4+2+1}$ with a dynamic threshold $\beta' \in \mathbb{N}$ but, in this case, $\beta'$ is always set to be equal to the number of transitions in the adapter. Finally, noi represents an adapter which does not learn, i.e., an adapter where $I$ is always empty. The latter is used as the comparative baseline for the other approaches. The vertical axis of Fig. 4.5(a) represents the cardinality of the list of inhibited traces.
The running example can be solved with a minimum of 55 inhibited traces (corresponding to 7123 adapter transitions which do not need to be stored in memory) and it allows a maximum of 5466 successful traces, where the services and the adapter end in a final state. Other solutions are possible with lower and higher numbers of inhibited traces; however, these imply a lower variety of successful traces. We can see in Fig. 4.5(a) that both reg and athr approximate that amount of inhibited traces (55) before 4000 simulated traces. The dthr adapter, owing to the high success rate (55%) achieved in spite of behavioural incompatibilities, always keeps the list of inhibited traces close to empty. The ordinates of Fig. 4.5(b) show the success rate, i.e., the percentage of simulated traces which were successful in the current interval.

³ http://www.python.org/

[Figure 4.5, four panels]
(a) Size of I in adapters reg, dthr and athr when learning without sporadic errors. Axes: cardinality of I against iteration (0 to 4000).
(b) Comparison between adapters noi, reg, dthr and athr using various TER values. Axes: % of successful traces against iteration (thousands, 0 to 20).
(c) Details of the adaptive adapter athr (series I, E and F). Axes: cardinality against iteration (thousands, 0 to 20).
(d) TER values during different ranges of thousands of iterations:

Iterations (thousands)   TER
(0, 4]                   0
(4, 6]                   10^-4
(6, 10]                  10^-3
(10, 12]                 0.01
(12, 14]                 0.1
(14, 16]                 10^-3
(16, 18]                 10^-4
(18, 20]                 0

Figure 4.5: Statistics gathered from the simulation with different adaptors and TER values. In Fig. 4.5(a), TER = 0 and, in Fig. 4.5(b) and Fig. 4.5(c), TER varies according to the ranges shown in Fig. 4.5(d)

To illustrate the behaviour of each of the adapters under sporadic communication errors, we include a new parameter: transition error rate (TER ∈ [0, 1]). The value of TER represents the probability that a given synchronisation fails due to sporadic errors.
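A simulator could inject sporadic errors along the following lines. This is only a sketch of the idea, not the code used for the experiments: the function names are hypothetical and each synchronisation is assumed to fail independently of the others with probability TER.

```python
import random


def synchronisation_fails(rng, ter):
    """Return True with probability `ter` (rng.random() lies in [0, 1))."""
    return rng.random() < ter


def trace_has_sporadic_error(rng, ter, length):
    """True if at least one of `length` synchronisations fails."""
    return any(synchronisation_fails(rng, ter) for _ in range(length))
```

Under this independence assumption, a trace with l synchronisations suffers at least one sporadic error with probability 1 - (1 - TER)^l, so even small TER values affect long traces noticeably.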

It can be seen that noi remains close to the success rate mentioned before (55%), which is reduced proportionally to the TER. Adapter dthr performs slightly better than noi, but not significantly, due to its low threshold of inhibited traces (β). It only takes advantage of its learning capabilities on those rare occasions where the same failure occurs in consecutive simulated traces. The other adaptors take advantage of the learning process and achieve success rates close to 100%. However, when sporadic errors start to occur (starting from simulation 4000), adaptor reg, which is not able to forget inhibited traces, quickly converges on the empty adaptor and remains so for the rest of the simulation. Finally, athr is also affected by high values of TER but it is able to recover when sporadic errors cease to occur. Indeed, besides sporadic errors, athr is able to solve the remaining traces which were not successful due to behavioural incompatibilities, achieving success rates close to 100%.

A detail of the previous simulation is shown in Fig. 4.5(c), focusing on the athr adapter. This last diagram shows the current amount of inhibited traces (I), the amount of sporadic errors (E) and the total number of failed traces (F ≥ E). Although the sample standard deviation of the latter seems high at first sight, it is worth noting that the range of ordinates shown in this diagram is [0, 75), in contrast to the maximum value of 1000. As in Fig. 4.5(a), the list of inhibited traces initially approximates the desired value of 55. However, when sporadic errors appear (4000), new inhibited traces reduce the size of the adapter (i.e., its number of transitions), which reduces the threshold and, in turn, the number of inhibited traces. Intuitively, this means that the adapter reduces its knowledge because it cannot trust it. This phenomenon reappears when TER is increased in subsequent iterations (6000, 8000, 10000 and 12000).
The final range (14000, 20000] is more interesting. We can see that, although athr succeeds in recovering from temporary failures and achieves a success rate close to 100%, it does so at the cost of obtaining a suboptimal, but correct, solution. In other words, depending on where the sporadic errors occurred, adapter athr might prune bigger parts of the behaviour than is needed to avoid failures. These parts usually represent different interleaving sequences of the rest of the adapter, or they correspond to external choices which can be controlled by the adapter. Interestingly, adapter reg enhanced with reset capabilities similar to those of dthr (i.e., reg+reset using $add_{4+2}$) was able to match athr⁴. This fact leads to the conclusion that what matters is not the dynamic threshold but the ability to notice the convergence on the empty adapter and thus reset the inhibited traces. To sum up, the most promising adapter is reg+reset ($add_{4+2}$) because of its simplicity and because it matches athr in converging on correct but incomplete adapters which overcome behavioural incompatibilities and sporadic errors.

⁴ The statistics characterising reg+reset are indistinguishable from those of athr.

Algorithm 4.4.1 synchronise
It returns the actions enabled by the adaptor and the destination states after synchronising on these actions.

inputs: The current state of the adaptor $\langle A, I, t \rangle_c$, with $A = \{\langle s_1, \Delta_1 \rangle, \ldots, \langle s_n, \Delta_n \rangle\}$
output: A map between enabled actions and their destination adaptor states. For instance, if synchronise($\langle A, I, t \rangle_c$)[|a] = A' is not empty, then there exists an adaptor transition $\langle A, I, t \rangle_c [P] \xrightarrow{a} \langle A', I, t \cdot |a \rangle_c [P]$

enabled := dict()  {a dictionary/hash map of empty sets}
for all ⟨s_i, Δ_i⟩ ∈ A do  {for every sub-state, see rule (L)}
    for all α ∈ Δ_i . I ∉ t·α do  {for every non-inhibited queued action}
        enabled[α].add(⟨s_i, Δ_i − {α}⟩)  {offer and dequeue that action}
    end for
    for all (s', a ♦ b) ∈ outgoing_rules(s_i) do  {for all two-sided outgoing rules}
        if I ∉ t·|a then  {if left-side is not inhibited}
            enabled[|a].add(⟨s', Δ_i ∪ {b|}⟩)  {offer it and enqueue right-side}
        end if
        if I ∉ t·b| then  {if right-side is not inhibited}
            enabled[b|].add(⟨s', Δ_i ∪ {|a}⟩)  {offer it and enqueue left-side}
        end if
    end for
    for all (s', a ♦ ) ∈ outgoing_rules(s_i) . I ∉ t·|a do
        enabled[|a].add(⟨s', Δ_i⟩)  {offer every non-inhibited left-sided rule}
    end for
    for all (s', ♦ a) ∈ outgoing_rules(s_i) . I ∉ t·a| do
        enabled[a|].add(⟨s', Δ_i⟩)  {offer every non-inhibited right-sided rule}
    end for
end for
return enabled

Algorithm 4.4.1 shows the pseudocode executed before each synchronisation with the adaptor. It returns the set of actions enabled by the adaptor and the next adaptor state after a synchronisation on any of those actions.
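A direct transcription of Algorithm 4.4.1 might look as follows. This is a sketch under stated assumptions rather than the ITACA source: sub-states are pairs of a contract state and a sorted tuple standing for the multiset of pending actions, inhibited traces are a set of tuples, labels are strings where `"|a"` is a left-side action and `"b|"` a right-side one, and `outgoing_rules` is a caller-supplied function returning triples `(target, left, right)` with `None` for a missing side.

```python
from collections import defaultdict


def _enqueue(pending, action):
    """Add an action to the pending multiset (kept as a sorted tuple)."""
    return tuple(sorted(pending + (action,)))


def _dequeue(pending, action):
    """Remove one occurrence of an action from the pending multiset."""
    items = list(pending)
    items.remove(action)
    return tuple(items)


def synchronise(substates, inhibited, trace, outgoing_rules):
    """Map every enabled action to the set of destination sub-states."""
    enabled = defaultdict(set)

    def allowed(action):
        # Side condition of rules EXT and INT: t·action is not inhibited.
        return trace + (action,) not in inhibited

    for state, pending in substates:
        # Offer (and dequeue) every non-inhibited pending action.
        for action in set(pending):
            if allowed(action):
                enabled[action].add((state, _dequeue(pending, action)))
        for target, left, right in outgoing_rules(state):
            left_label = None if left is None else "|" + left
            right_label = None if right is None else right + "|"
            if left_label is not None and right_label is not None:
                # Two-sided vector: offer one side, enqueue the other.
                if allowed(left_label):
                    enabled[left_label].add(
                        (target, _enqueue(pending, right_label)))
                if allowed(right_label):
                    enabled[right_label].add(
                        (target, _enqueue(pending, left_label)))
            elif left_label is not None and allowed(left_label):
                enabled[left_label].add((target, pending))
            elif right_label is not None and allowed(right_label):
                enabled[right_label].add((target, pending))
    return dict(enabled)
```

For a contract state 0 with rules `[(1, "a", "b"), (0, "c", None)]`, the call `synchronise({(0, ())}, set(), (), rules_of)` offers `"|a"`, `"b|"` and `"|c"`; adding `("|a",)` to the inhibited set removes `"|a"` from the result.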

Function $outgoing\_rules : S_c \mapsto \{S_c \times \Sigma_c\}$ returns the set of transitions (destination state and transition label) outgoing from the given state of the contract automaton.

Regarding the computational complexity of learning adapters, each synchronisation with the adapter requires a transition in the adapter behaviour and the possible inclusion of a new inhibited trace. Assuming hash sets and hash maps with constant complexity for membership queries and insertions, the time complexity is $O(|S_c||\Sigma_c|^l)$, where $|S_c|$ is the number of states in the contract automaton, $|\Sigma_c|$ is the number of vectors in the contract and $l$ is the maximum length of a trace. The spatial complexity of our approach with $add_i$, $i \in \{0, 2, 3, 4\}$, is given by the combined size of the inhibited traces, the adapter state and the adaptation contract. The space required by inhibited traces can be reduced either by storing them as a tree or by using any learning policy based on $add_1$ (where the size of the inhibited traces is bounded by $\beta$). Both approaches result in a spatial complexity of $O(|S_c||\Sigma_c|^l)$. Let us remember that $l$ is bounded because we deal with bounded services, but it is further restricted by acyclic adaptation contracts. In the latter case, the time and spatial complexity are exponential with regard to the size of the contract but they do not depend on the number or size of the services to be adapted.

In addition, both complexities are greatly reduced if the adaptation contract is deterministic, in the sense that it does not require the lazy choice represented by rule L (see Section 2.2.1). In this case, at any given adapter state $\langle A, I, t \rangle_c$, $A$ contains a single element $\langle s, \Delta \rangle$. This simplification results in a time complexity of $O(\max(|\Sigma_c|, l))$ and a spatial complexity of $O(|S_c||\Sigma_c| + |\Sigma_c|^l)$.

4.5 Zero-knowledge adaptation

We have presented a new lightweight approach to behavioural run-time adaptation. Our approach requires an adaptation contract based on the signatures of the services (the collection of operations they require and offer), but no previous knowledge of the behaviour of the services is needed, since it will be dynamically learned. We have shown how adaptors can incrementally learn from interaction failures at run time so as to eventually converge on the same behaviour that could be synthesised a priori by means of (computationally expensive) design-time analyses of the behaviour of the services.

The learning adapters presented in this thesis can be applied to perform zero-knowledge adaptation, i.e., adapters without adaptation contracts. In this case, although no contracts are specified, there is an implicit contract between each source and destination service. This implicit contract assumes that every source and destination service shares the same alphabet of actions, therefore presenting a trivial set of one-to-one vectors. Having such a zero-knowledge contract, which is dynamically inferred, the adapter does not perform any adaptation at the signature level (it simply forwards messages), but it does learn from possible incompatibilities between the behaviour of the services (such as messages expected in a different order). Therefore, zero-knowledge contracts avoid deadlocks that would be present without adaptation.
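The trivial one-to-one vectors of such an implicit contract could be generated as sketched below. The function name, the pair representation and the service identifiers `"s"` and `"d"` are assumptions made for illustration; the sketch simply pairs every observed operation of the source with the same operation of the destination.

```python
def zero_knowledge_vectors(operations, source="s", destination="d"):
    """Build one vector per distinct operation, pairing it with itself."""
    return [
        (f"{source}:{op}", f"{destination}:{op}")
        for op in sorted(set(operations))
    ]
```

For example, observing the operations `["req", "data", "req"]` yields the two vectors `("s:data", "d:data")` and `("s:req", "d:req")`; all learning then concerns the order of the messages only.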


5

You cannot teach a man anything; you can only help him discover it in himself. G. Galilei

Web service discovery

Service discovery aims to find services which match a query, with the purpose of directly accessing the service or including it in an orchestration. However, WSs are not always compatible and, even though they might present the functionality and QoS expected by the query, they cannot be directly used because of these incompatibilities. These incompatibilities can be present in their signatures but also in their behaviours. The latter applies to stateful services (such as those described as BPEL processes or WF) where any mismatch in the sequence of the messages exchanged may lead the composition to a deadlock situation. For instance, a missing operation in a service, a mismatch in the operation name or arguments, or an unexpected sequence of messages makes the correct termination of the services involved impossible. Service discovery has to face this issue.

We assume that, similarly to behavioural services, discovery queries have signature and behaviour. We call these behavioural queries. Then, the services to be discovered must comply with the signature and behaviour required by the discovery query, otherwise incompatibilities might lead the whole system to deadlock situations where none of the services can reach a stable state (Fig. 5.1).

Figure 5.1: Discovery of stateful services which i) match the query, ii) can be adapted to the query, iii) are composed to fulfil the query

As an example of these behavioural queries and the possible incompatibilities that can arise, Spanoudakis et al. [ZSD08, KSZ+07, SZK05] introduced service discovery techniques which take into account the possibly complex behaviour of the services and rank them according to their discrepancies with the query. They presented a comprehensive architecture for service discovery where services are filtered and ranked depending on their signature and QoS and, in a second stage, the behaviour of the filtered services is compared to detect incompatibilities or discrepancies in their protocol. We consider that this two-step approach (first compare the signature and then the behaviour) increases the complexity and can result in suboptimal solutions. For instance, the behaviour of the service might indicate that the first operation is incompatible with the query, so a comparison with the rest of the operations is unnecessary or, alternatively, even if there are signature incompatibilities, it is possible that those incompatible operations do not interfere with the part of the behaviour required by the query, and therefore they are not an issue for the discovery. Unlike theirs, our approach directly compares the behaviour of the query and the services in the registry in a single, integrated step. Moreover, their work succeeds in discovering services where some transitions are missing in the query. We go beyond such discrepancies since our approach supports adaptation: we analyse whether such behavioural discrepancies can be automatically adapted. They return perfectly matching and similar services whereas we return perfectly matching and adaptable services, in the sense that we guarantee that the returned services can be adapted whereas they do not.
In addition, their performance evaluation [ZSD08] states that the behavioural comparison between services and the query accounts for most of the computation time even after the previous filtering stages. Our approach, instead, keeps a linearithmic time complexity with regard to the number of services in the registry. Our approach lacks any ranking system since it aims at a yes/no decision: adaptable services pass, services that are incompatible beyond adaptation do not. If such a ranking is desired, our work could be used as an initial filter in their architecture. In this case, their functions to evaluate the signature and behavioural distances should be customised to consider the adaptation effort of the service instead of just counting missing edges.

A similar approach is presented by Benigni et al. [BBC07] where the discovery process is divided into two steps: functional analysis (where they check I/O dependencies among services using OWL-S ontologies) and behavioural analysis, where they verify deadlock freedom through Petri nets obtained from their OWL-S descriptions. However, they go one step beyond since their work is able to calculate minimal service compositions which fulfil the query. In addition, they overcome certain incompatibilities thanks to the ontology relationships expressed in OWL-S. Being a two-step approach, their work presents the same drawbacks as the one previously mentioned. Our approach to service discovery lacks their semantic information but, instead, we do consider the presence of adaptors in the composition, and therefore we are able to overcome more incompatibilities than this related approach.

Both previous approaches focused on comparing the query and the services (or compositions of services) one by one. Our work, however, centres on the creation of search structures to enable scalable service discovery where the returned services can be adapted automatically to fulfil the query. Our search tree for service discovery was inspired by graph indexes [WHW07, YYH05] commonly used in the area of chemistry to find molecules in a database of compounds. This area places a lot of emphasis on the scalability of the algorithms, and some of the proposals are based on graph sub-isomorphism queries and different notions of distance between the graph of the query and the sub-isomorphism found in the compound. A lot of this knowledge could be reused for the discovery of behavioural services. However, chemistry-related indexes are usually too optimised for non-directed and non-labelled graphs to be easily customised for feature-enabled discovery of adaptable services.

We will approach service discovery by means of software adaptation. Software adaptation [CMS+09a, CPS08, YS97] is a sound solution which enables Web services to interoperate despite their initial incompatibilities. This adaptation is achieved by deploying an adaptor, either as a set of wrappers or as a centralised orchestrator, which is in charge of receiving, translating and rearranging the messages in the way expected by the destination service.
Adaptors can be either generated beforehand and stored for later use, or created automatically [MP09a, CNP09] at run time. Adaptation is based on common features between the expected and the provided functionality, i.e., the adaptor can change the signature and behaviour of the service but it cannot change the underlying functionality and semantics of the service. Therefore, adaptation is not possible in the cases where there is no such mapping between expected and provided features (e.g., no adaptor can make a booking out of a weather service).

Main contributions: In this chapter, we propose to apply software adaptation techniques to service discovery. The goal is to discover those services in the registry whose behaviours can be adapted to the query (see Fig. 5.2). Once discovered, these services can be adapted and composed using behavioural adaptors such as those described in Chapter 4 and Chapter 8. Therefore, we focus on defining the conditions for adaptability and designing a scalable search tree over services depending on those conditions.

Figure 5.2: Overview of the discovery of adaptable services

The benefits of using software adaptation in the discovery of behavioural (i.e., stateful) services are threefold: i) the behaviour of the services can be abstracted to a reduced set of features required for the adaptation; ii) the feature-oriented representation of the services decreases the computational complexity of the discovery process; and iii) the discovery of adaptable services obtains more results as it gathers services that perfectly match the query and those which can be adapted to fulfil the signature and behaviour required by the query.

To our knowledge, there is no other service discovery approach enhanced with behavioural adaptation and supported by a search tree. In most other work, incompatibilities between services are restricted to missing transitions or ontology matching and analysis, and they do not take advantage of all the efforts made in software adaptation [CPS08, YS97].

We will exemplify the model and algorithms involved with a query (Fig. 5.3) and two services inspired by some operations of the Picasa Web feeds¹ (Fig. 5.4) and the Flickr SOAP API² (Fig. 5.6). We use these services to illustrate feature-based adaptation in Section 5.1. The adaptation requirements presented there are used in Section 5.2 to reduce the services to the dependencies among their provided and required features. Section 5.3 introduces the search tree we developed to support the discovery process based on feature-based service abstraction. We conclude with some final remarks in Section 5.4.

¹ http://code.google.com/apis/picasaweb/docs/2.0/reference.html
² http://www.flickr.com/services/api/

5.1 Feature-based behavioural adaptation

Behavioural adaptation [CPS08] is based on the special role of the adaptor as a man-in-the-middle of the communication which intercepts the requests of the services, recomposes those requests as expected by the destination service(s), and replies with the appropriate message at the appropriate time. Message names and structures are altered as needed by the adaptor (the adaptor merges, splits and transforms messages to overcome incompatibilities), hence the restrictions on the adaptation are service controllability, the data received and sent by the adaptor, non-functional properties and the side-effects of the operations. Controllability will be covered in Section 5.2 and we handle data, non-functional properties and side-effects as features of the operations in the service behaviour. These features can be, for example, semantic annotations on the behaviour of the service or obtained from the arguments in the service signature.

Features can be either provided or required by parts of the service behaviour. The adaptor can only provide features which have been previously offered to it. For example, the adaptor cannot create new data, so every argument sent by the adaptor must have been previously received. Following this reasoning, the only feature-wise restriction on the adaptability of the composition of several services is that some service is able to provide a feature before another service requires it. In the case of data, features might be reused in several iterations, therefore the adaptor can provide an already received feature several times, e.g., the correlation set parameter in BPEL, which must be resent in each message within the same session.

We now generalise the content expression of the labels in service behaviours (Definition 2.1.2) so that, instead of representing a list of arguments, they now represent general features. Therefore, the labels in Σ are either internal transitions (τ) or communication transitions, which start with the operation name followed by an ‘!’ or ‘?’ if they are output or input actions, respectively, followed by a list of features, which are provided features in output actions and required features in input actions. Labels are expressed with the following notation: operation(!|?){, feature}∗. This service description for WS discovery has been chosen because it is simple, graphical, and it can be easily derived from existing implementation languages (see for instance [CSC+07, FUK06, FBS04, SBS06] where such abstractions for Web services were used for verification, composition or adaptation purposes; and [DOS09] for a description of the operational semantics behind the composition of these kinds of services). Discovery queries represent the complementary behaviour of what is expected from the service to be discovered. Therefore, queries are modelled as service behaviours with complementary communication actions.

Figure 5.3: Behaviour and AD tree of the query. (a) Query LTS; (b) AD tree.

Example 5.1.1 Figure 5.3(a) represents the behaviour and signature required by the discovery query. The query contains an initialisation message with a feature K (an authentication token), then it requests to login with Usr and Pass, it sends a message asking for pictures and finally expects to receive one or several pictures with their Url and information (I). The Picasa Web service in Fig. 5.4(a) provides similar functionality to the one requested by the query. However, the Picasa service can provide pictures to anonymous and authenticated users. In the latter case, it additionally provides the comments of the pictures (Com) and it requires a Code to end the session.

Figure 5.4: Behaviour and AD trees of Picasa Web. (a) Picasa Web LTS; (b) Picasa Web AD tree.

Figure 5.5: An adaptor for Query (q in Fig. 5.3) and Picasa Web (p in Fig. 5.4)

The adaptor represented in Fig. 5.5 would make the behaviour of the query (Fig. 5.3(a)) and the behaviour of Picasa Web (Fig. 5.4(a)) cooperate properly in spite of the incompatibilities in the number and names of the messages, and their arguments. For instance, the login and getPics requests in the query should correspond to the getFeed operation in Picasa Web, therefore two messages are transformed into a single request and feature Pass is not used. In addition, none of the operation names are compatible, i.e., offered and requested operation names are different, for instance: login!Usr, Pass vs auth?Usr, Pass. Communication actions in the behaviour of the adaptor have been prefixed with ‘q:’ (the query) and ‘p:’ (Picasa Web) depending on the service that must synchronise with the adaptor. In this example, features represent the arguments sent and received by the service and the adaptor. When the adaptor needs to send a message with certain features to a service (e.g., pushPic?Url, I), the adaptor should have received such features from previously processed messages (e.g., getFeed!Usr, Url, I).
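The label notation and the restriction that an adaptor can only send features it has already received can be illustrated with a short Python sketch. The names parse_label and feature_consistent are hypothetical, internal τ transitions are not handled, and the ordering of the adaptor actions in FIG_5_5_TRACE is an assumed reading of Fig. 5.5 rather than a transcription of it.

```python
import re

# operation(!|?){, feature}*, optionally prefixed by the peer ('q:' or 'p:')
LABEL = re.compile(r"^(?:(\w+):)?(\w+)([!?])\s*(.*)$")


def parse_label(text):
    """Split a communication label into (peer, operation, direction, features)."""
    m = LABEL.match(text.strip())
    if m is None:
        raise ValueError(f"not a communication label: {text!r}")
    peer, operation, direction, rest = m.groups()
    features = frozenset(f.strip() for f in rest.split(",") if f.strip())
    return peer, operation, direction, features


def feature_consistent(adaptor_trace):
    """True if every feature the adaptor sends ('!') was received ('?') earlier."""
    received = set()
    for text in adaptor_trace:
        _, _, direction, features = parse_label(text)
        if direction == "?":
            received |= features
        elif not features <= received:
            return False
    return True


# Assumed ordering of the adaptor actions of Fig. 5.5.
FIG_5_5_TRACE = [
    "q:initialise? K", "q:login? Usr,Pass", "q:getPics?",
    "p:getFeed! Usr", "p:getFeed? Usr,Url,I", "q:pushPic! Url,I",
]
```

With this reading, feature_consistent(FIG_5_5_TRACE) holds, whereas a trace that emits pushPic! Url,I before receiving the getFeed reply is rejected because Url and I have not been received yet.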

There are several approaches to generate adaptors either automatically [MP09a] or based on adaptation contracts that must be previously designed [CMS+ 09a]. Henceforth, discovery queries are considered to be equivalent to behavioural services, therefore the discovery process will consist in checking whether the query and each of the services in the registry are adaptable or not.

5.2 Abstracting adaptable behaviour

We need an efficient algorithm to check whether two stateful services are adaptable or not. The behaviour of the service is needed to find out possible deadlocks. However, since an adaptor is going to detach the execution of the services, we can restrict our analysis only to the necessary conditions over the behaviour of the services. These conditions are defined by a set of dependencies modelled as Adaptation-Dependency trees (AD trees) which are further reduced into Feature-Dependency rules (FD rules).

Figure 5.6: Service inspired by some methods of the Flickr API. (a) Flickr LTS; (b) Flickr AD tree.

5.2.1 Adaptation-dependency trees

Message names are irrelevant in terms of adaptability, therefore we can abstract the behaviour of the services and reduce them to the possible interactions (i.e., sequences of actions) in which the features are exchanged. We will refer to this control-flow abstraction as an AD tree. An AD tree is an abstraction of all the possible traversals of the behaviour of the service. AD trees allow us to analyse the controllability of the query and the services based on their features.

Definition 5.2.1 (Adaptation-dependency tree) An Adaptation-Dependency tree (AD tree) is a tree with OR nodes, AND nodes and END nodes where edges are labelled with a (possibly empty) set of features. Sets of features are prefixed by ‘?’ or ‘!’ depending on whether these features are required or provided, respectively.

Example 5.2.1 Figure 5.3, Fig. 5.4 and Fig. 5.6 show the behaviour and AD tree³ abstractions of the query, Picasa Web and Flickr services, respectively. Dashed transitions in the AD tree represent different interleaving cases of the operations covered by solid transitions, therefore these parts have been omitted for the sake of clarity. There are several differences between Flickr and Picasa Web. Flickr uses a token K (the API key) to associate each received message with a session, and it uses another token called frob (F) during the authentication. Additionally, instead of returning the Url of the pictures directly, it is required to request the collection of the Usr (C), then receive the information of that collection (Ti, Desc and P), and finally use P to obtain the Url of the picture.

In AD trees, there are OR nodes and AND nodes which can have several children. On the one hand, OR nodes represent that the choice of which branch to follow depends on external communications, i.e., any external service (or the adaptor) can send a particular message to decide which branch the current service is going to follow (e.g., a pick activity in BPEL). On the other hand, AND nodes are internal choices of the service, therefore the composition with this service must deal with every branch outgoing from these nodes (e.g., an if activity in BPEL). Edges are labelled according to whether the service provides/sends (prefixed with exclamation marks) or requires/receives (prefixed with question marks) certain features. Whether these features are given in a single message or in several messages is irrelevant for the AD tree, therefore sequences of edges of the same type are collapsed into a single edge. For instance, the features provided by initialise and login in the query (Fig. 5.3(a)) are collapsed into a single edge in its AD tree (!K, Usr, Pass in Fig. 5.3(b)):

Service interface: --initialise!K--> --login!Usr,Pass-->   ⇒   Service AD tree: --!K,Usr,Pass-->

Adaptors are able to reuse previously received features. Because of this, once a given feature occurs in an edge, that same feature does not appear in any descendant edge:

Service interface: --getFeed?Usr--> --getFeed!Usr,Url,I--> --getFeed?Usr-->   ⇒   Service AD tree: --?Usr--> --!Url,I-->
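The two abstraction steps illustrated above (merging consecutive edges of the same type and dropping features that already occurred) can be sketched in Python for a single linear run of a service. The function name collapse and the (direction, features) encoding are assumptions made for illustration. Branching (AND and OR nodes) is not modelled, and edges left without features are dropped, which is how the getFeed example reads.

```python
def collapse(actions):
    """Collapse a linear run of (direction, features) actions into AD-tree edges.

    direction is '!' (provided) or '?' (required); features is an iterable
    of feature names in the order they appear in the message.
    """
    edges = []    # list of [direction, [features]]
    seen = set()  # features that already occurred in an ancestor edge
    for direction, features in actions:
        fresh = []
        for f in features:
            if f not in seen:  # a feature never reappears in descendant edges
                seen.add(f)
                fresh.append(f)
        if edges and edges[-1][0] == direction:
            edges[-1][1].extend(fresh)        # same type: merge into last edge
        elif fresh:
            edges.append([direction, fresh])  # new edge, only if non-empty
    return [(d, tuple(fs)) for d, fs in edges]


# The two examples from the text.
query_run = [("!", ["K"]), ("!", ["Usr", "Pass"])]
get_feed_run = [("?", ["Usr"]), ("!", ["Usr", "Url", "I"]), ("?", ["Usr"])]
```

Under these assumptions, collapse(query_run) yields the single edge !K,Usr,Pass and collapse(get_feed_run) yields ?Usr followed by !Url,I, as in the two diagrams.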

In order to check the adaptability between two given AD trees we must start at the root of both AD trees and traverse them while matching the branches as follows:

1. If it is an AND node, there must be a matching branch in the other AD tree for each branch outgoing from this AND node.
2. If it is an OR node, we can continue with any (or several) of the outgoing branches.
3. If it is a ‘?’ edge, each feature in that edge must be present in ancestor ‘!’ edges of the other AD tree.
4. Edges with provided features (i.e., ‘!’ edges) can be ignored in the matching. They are only required by ‘?’ edges in the other AD tree in step 3.
5. Un-labelled nodes are skipped and END nodes must match with END nodes in the other AD tree.

Example 5.2.2 The AD tree representing the query (Fig. 5.3(b)) matches with both Picasa Web (Fig. 5.4(b)) and Flickr (Fig. 5.6(b)). Bold edges with empty arrowheads in the behaviour and AD trees of the services represent the branches that fulfilled the query. Nodes have been labelled to illustrate the correlation between the query, its AD tree, and the AD trees of Picasa Web and Flickr. Let us note that our approach automatically recognised and overcame the dependencies in Flickr (F, C and P) to match the query. In the particular case of Flickr (blue-bold transitions with empty arrowheads in Fig. 5.6), the adaptor must first get the collections of pictures, then get the information about those collections and finally obtain the URLs of the photos in each collection.

Using AD trees for comparing queries and services is less demanding in computational terms than comparing behaviours. The reason is that we compare trees (limited in depth by the number of features) instead of graphs, which can entail exponential complexity. However, it is not enough just to compare the discovery query with a single service; instead, we want to compare the query with all the candidate services in the registry and, therefore, we are going to further reduce AD trees into FD rules.

³ Trees are drawn as directed acyclic graphs with starting and ending nodes for the sake of simplicity.

5.2.2 Feature-dependency rules

Comparing AD trees still requires some complexity due to the nature of AND nodes (where the comparison must be scattered between several branches, step 1) and OR nodes (where the comparison is based on a trial-and-error approach in every branch, step 2). Additionally, AD trees avoid loops but they suffer from the interleaving of operations during the execution of the service. For example, Fig. 5.6(b) has several omitted parts (dashed edges) which contain other interleaving cases between the getCollections, getColInfo and getPhoto branches. This is the reason we abstract AD trees and reduce them further into FD rules.

Definition 5.2.2 (Feature-dependency rule) FD rules belong to F × 2^F × 2^F and are denoted by:

f ← {f_1^r, ..., f_n^r} | {f_1^e, ..., f_m^e}

where f_i^r, f_j^e ∈ F, i ∈ [1, n], j ∈ [1, m] and F is the set of features. An FD rule f ← F^r | F^e represents that a given service provides feature f if it is given the required features F^r. In addition, features F^e are needed for that service to reach a final/stable state after f has been provided. Features in F^e are ending features. Every FD rule satisfies F^r ∩ F^e = ∅.

The intuition behind FD rules is to express which features are required in a single session with a particular service to obtain a given feature and complete the session. Behavioural adaptors support reordering [MPS08], therefore the adaptation is possible if the adapted service corresponding to the rule is provided with the required features in any order (required and ending features are sets), then we obtain the requested feature f and, finally, the ending features must be given in any order as well for the adapted service to end properly.

Example 5.2.3 For instance, Picasa Web (Fig. 5.4) respects, among others, the following FD rule:

Com ← {Usr, Pass} | {Code}

This rule means that, if we want the comments of a photo (feature Com), then we have to be ready to provide in advance the required features Usr and Pass. At that moment, Picasa Web delivers feature Com. Eventually, we want the Picasa Web service to reach a final/stable state so we have to call unload?Code, and hence Code is an ending feature.

By abstracting service behaviour into sets of FD rules we lose the correlation between the behaviour of the service and the different rules, that is, it might happen that even though a service complies with two rules, it cannot offer both during the same session. In those cases, we assume that we can establish several concurrent sessions with the same service (e.g., it is possible to have several concurrent instances of the same BPEL process), therefore we can use each of these sessions to obtain different features.

FD rules are obtained from the AD tree of a given service. While traversing an edge in the AD tree, every provided feature in that edge generates a new FD rule, all the features which were needed to reach that edge are the required features of the rule and, from that node on, every feature needed to reach a final state is included as an ending feature. If the same feature can be obtained in different paths with OR nodes, several FD rules are generated to provide the same feature. In addition, AND nodes must be considered in a special way:

• If a feature is not provided in every branch of an AND node, no FD rule is generated for that feature after the AND node. This means that, if the service can internally decide whether to provide a feature or not, that service does not qualify to provide that feature.
• If the feature is provided in every branch of an AND node, all the required and ending features gathered in those branches are joined in the same rule. That means that the rule requires everything that might be needed regardless of the internal choices of the service.

If there are two FD rules which provide the same feature with similar requirements we can discard the rules which demand more features. This is formalised in Definition 5.2.3.

Definition 5.2.3 (Partial order between FD rules) Two FD rules a = f ← A^r | A^e ∪ C and b = f ← B^r ∪ C | B^e are related by a partial order (a ≤ b) if, and only if, A^r ⊆ B^r and A^e ⊆ B^e, where A^r, A^e, B^r, B^e and C are sets of features.

The intuition behind this partial order is that a rule requires less than another if it has fewer required features, fewer ending features or some required features are placed as ending features. The first two conditions are straightforward and the third amounts to saying that the rule requires the same set of features but it requires them later, therefore the rule is less restrictive. From a generated set of rules R, we keep only the minimal rules according to this partial order (i.e., {r ∈ R | ∄ r′ ∈ R . r′ ≤ r}). Therefore, minimal rules subsume all the other more demanding rules.

A query represented by an AD tree can be transformed into a set of FD rules. This is done in the following steps: i) we replace the features in the AD tree of the query with their complements; ii) we generate all the possible combinations of AD trees that would be created by replacing every OR node in the AD tree of the query with one of its branches; iii) we convert each of these AD trees into FD rules following the algorithm described previously. In this way, we obtain a set of sets of FD rules. The upper bound for the number of generated FD rules is exponential with regard to the number of different features in the query but it is independent of the behaviour of the query and it is reduced by the partial order between FD rules. A service fulfils a query when the FD rules of the service subsume (with regard to ‘≤’) all the rules of any of the sets of FD rules generated from the query.

Example 5.2.4 The FD rules of the query (Fig. 5.3(b)) are:

I ← {K, Usr, Pass} | ∅    (5.1)
Url ← {K, Usr, Pass} | ∅    (5.2)

By abuse of notation we can omit empty sets and we can collapse several rules into a single expression. For instance, rules 5.1 and 5.2 can be represented as:

I, Url ← {K, Usr, Pass} |

The FD rules of Picasa Web (Fig. 5.4(b)) are:

I ← {Usr} |    (5.3)
Url ← {Usr} |    (5.4)
Com ← {Usr, Pass} | {Code}    (5.5)

Picasa Web also complies with rules I, Url ← {Usr, Pass} | {Code}, which correspond to the authenticated part of Picasa Web where feature Code was required to finish the session, but these rules were subsumed by rules 5.3 and 5.4. The FD rules of Flickr (Fig. 5.6(b)) are:

F, Tok, Perm, C, Ti, Desc, P ← {K, Usr, Pass} |    (5.6)
I ← {K, Usr, Pass} |    (5.7)
Url ← {K, Usr, Pass} |    (5.8)

Rule 5.1 is subsumed by rules 5.3 and 5.7. Similarly, rule 5.2 is subsumed by rules 5.4 and 5.8. As expected, both services satisfy the query.

FD rules are the key to processing discovery queries, therefore we focus on these to develop a search tree, called Feature-Dependency tree (FD tree), which performs well in finding those services which comply with a certain FD rule.
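The rule generation, the partial order and the fulfilment check can be tried out with a small Python sketch. All names below (FDRule, rule, rules_from_path, leq, minimal, fulfils) are hypothetical. Rule generation is restricted to a single linear AD-tree path, so AND and OR nodes are not covered, and the query is assumed to produce a single set of rules. The test in leq is an equivalent reading of Definition 5.2.3 under F^r ∩ F^e = ∅: a ≤ b exactly when both rules provide the same feature, the required features of a are among those of b, and every ending feature of a is a required or ending feature of b.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FDRule:
    feature: str
    required: frozenset = frozenset()
    ending: frozenset = frozenset()


def rule(feature, required=(), ending=()):
    """Convenience constructor accepting any iterables of features."""
    return FDRule(feature, frozenset(required), frozenset(ending))


def rules_from_path(edges):
    """FD rules of one linear AD-tree path of (direction, features) edges."""
    rules = set()
    for i, (direction, features) in enumerate(edges):
        if direction != "!":
            continue
        # '?' features before the edge are required, those after it are ending
        required = {f for d, fs in edges[:i] if d == "?" for f in fs}
        ending = {f for d, fs in edges[i + 1:] if d == "?" for f in fs}
        rules |= {rule(f, required, ending) for f in features}
    return rules


def leq(a, b):
    """a <= b: a asks for no more than b, possibly later (Definition 5.2.3)."""
    return (a.feature == b.feature
            and a.required <= b.required
            and a.ending <= (b.required | b.ending))


def minimal(rules):
    """Keep the rules that have no strictly smaller rule in the set."""
    return {r for r in rules if not any(o != r and leq(o, r) for o in rules)}


def fulfils(service_rules, query_rules):
    """True if every query rule is subsumed by some rule of the service."""
    return all(any(leq(s, q) for s in service_rules) for q in query_rules)


# Assumed linear readings of the two Picasa Web branches of Fig. 5.4(b).
picasa = minimal(
    rules_from_path([("?", ["Usr"]), ("!", ["I", "Url"])])
    | rules_from_path([("?", ["Usr", "Pass"]), ("!", ["I", "Url", "Com"]),
                       ("?", ["Code"])]))
# Complemented query path of Fig. 5.3(b): rules 5.1 and 5.2.
query = rules_from_path([("?", ["K", "Usr", "Pass"]), ("!", ["I", "Url"])])
```

With these inputs, picasa contains the three rules 5.3 to 5.5 and fulfils(picasa, query) holds, whereas a query asking for Com without being able to supply Code is not fulfilled.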

5.3 Search tree of adaptable Web services

FD trees represent the registry of services. FD trees return those services which comply with a given FD rule. A discovery in the registry corresponds to several traversals of the FD tree, one per FD rule in the query. We will describe FD trees using a generic FD rule (f ← F^r | F^e) as the query. An FD tree consists of a first level of edges which are labelled with the feature that is requested (f). From that point on, each node in the tree has two sets of services: one set contains the services that are able to provide the feature at that point, and the other set contains the services that are able to finish at that point. Every edge is labelled with a different feature, first the required features (F^r) and then the ending features (F^e). In this way, in a single traversal of the tree, the search returns all those services which comply with an FD rule.

The search algorithm (Algorithm 5.3.1) processes each FD rule in the query (lines 2–23). It traverses the FD tree using the required features first (lines 7–11) and the ending features later (lines 12–15) while collecting the desired services. Considering set structures that perform membership queries in logarithmic time, an FD rule query has a time complexity of O(S · log(S) · F), where S is the number of services in the registry and F is the number of different features. Therefore, the discovery of all the services which can be adapted to a given behavioural query has a time complexity of O(Q · S · log(S) · F), where Q is the number of FD rules in the query.

Example 5.3.1 The FD tree in Fig. 5.7 contains the FD rules that represent Picasa Web and Flickr. Transitions outgoing from dashed nodes have been omitted for the sake of clarity as they represent different interleavings of sequences expressed in other paths.
As an example, the Picasa rule Com ← {Usr, Pass} | {Code} is included as a first transition labelled with Com, two transitions labelled with Usr and Pass, then the service can provide Com (so it is placed in the left-hand side set of services) and finally Picasa Web can end after receiving Code. It is worth mentioning that the interleaving in FD trees, which is the reason for the FD tree having several superfluous nodes, can be avoided by imposing an arbitrary order among features. In order to favour discovery requests, features should be sorted according to their frequency as required features and ending features in service behaviours. Example 5.3.2 Figure 5.8 displays the FD tree corresponding to the one in Fig. 5.7 but with an additional service (a) and enforcing an order among 87


!Tok

!Perm

!C

!Ti

!P !Desc

? Usr ? K

!F

? Pass

?K

!Url

!Com

? Pass ? Usr p

?K

!I

? Pass

? Usr ? Code

? Pass

p

? K ? Pass

? Pass p

? Pass f

?K f

? Code p

Figure 5.7: FD tree which contains the FD rules that represent Picasa Web (p) and Flickr (f)

!I

!Url !P

!Desc !F

? Usr p

!Tok !Perm !C !Ti

!Com

? Usr

!K

? Usr

? Usr

? Pass

? Pass

p ? Pass

? Pass p

?K f

f

a

a

? Code p

Figure 5.8: FD tree with sorted features (Usr < Pass < K < Code < . . . ) and a new service (a) which provides K ← {Usr, Pass} | 0/ 88

Algorithm 5.3.1 discovery
Processes a discovery query considering feature-based behavioural adaptation
inputs: An FD tree and a query consisting in a set of FD rules
output: Set of services which can be adapted to fulfil the query

 1: init = ⊤; sol = ∅
 2: for all f ← {f^r_1, ..., f^r_n} | {f^e_1, ..., f^e_m} in query do
 3:   match = ∅; part = ∅; end = ∅
 4:   node = tree.root.getChild(f)
 5:   match.addAll(node.endingServices) {Ending services are feasible solutions}
 6:   part.addAll(node.providedServices) {Services that must be checked for their ending features}
 7:   for all f_i in f^r_1, ..., f^r_n do {Follow required features}
 8:     node = node.getChild(f_i)
 9:     match.addAll(node.endingServices)
10:     part.addAll(node.providedServices)
11:   end for
12:   for all f_i in f^e_1, ..., f^e_m do {Check ending features}
13:     node = node.getChild(f_i)
14:     end.addAll(node.endingServices)
15:   end for
16:   match.addAll(part ∩ end)
17:   if init then
18:     sol = match
19:     init = ⊥
20:   else
21:     sol = sol ∩ match {Return services which satisfy every rule in the query}
22:   end if
23: end for
24: return sol

features (Usr < Pass < K < Code < . . . ). Let us note that this enforced order must be respected by the for conditions in lines 7 and 12 of the discovery algorithm (Algorithm 5.3.1).
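The traversal performed by Algorithm 5.3.1 can be illustrated with a minimal Python sketch. It is not the implementation used in this thesis: the `Node` class, the `insert` registration routine, the `ORDER` table and the decision to stop a traversal when a child is missing are all assumptions made for the sake of the example.

```python
from dataclasses import dataclass, field

# Assumed global feature order, as in Fig. 5.8 (Usr < Pass < K < Code).
ORDER = {"Usr": 0, "Pass": 1, "K": 2, "Code": 3}


def ordered(features):
    """Sort features by the enforced order; unknown features go last."""
    return sorted(features, key=lambda f: (ORDER.get(f, len(ORDER)), f))


@dataclass
class Node:
    children: dict = field(default_factory=dict)
    provided: set = field(default_factory=set)  # services able to provide here
    ending: set = field(default_factory=set)    # services able to finish here


def insert(root, service, feature, required, ending):
    """Hypothetical registration of the rule feature <- required | ending."""
    node = root.children.setdefault(feature, Node())
    for f in ordered(required):
        node = node.children.setdefault(f, Node())
    node.provided.add(service)
    for f in ordered(ending):
        node = node.children.setdefault(f, Node())
    node.ending.add(service)


def discovery(root, query):
    """Follow Algorithm 5.3.1; query is a list of (f, required, ending) rules."""
    sol = None
    for feature, required, ending in query:
        match, part, end = set(), set(), set()
        node = root.children.get(feature)
        if node is not None:
            match |= node.ending
            part |= node.provided
        for f in ordered(required):          # follow required features
            node = node.children.get(f) if node is not None else None
            if node is None:                 # assumption: stop on a missing edge
                break
            match |= node.ending
            part |= node.provided
        for f in ordered(ending):            # check ending features
            node = node.children.get(f) if node is not None else None
            if node is None:
                break
            end |= node.ending
        match |= part & end
        sol = match if sol is None else sol & match
    return sol if sol is not None else set()


root = Node()
insert(root, "p", "Com", {"Usr", "Pass"}, {"Code"})  # Picasa Web rule
insert(root, "a", "K", {"Usr", "Pass"}, set())        # auxiliary service a
print(discovery(root, [("Com", {"Usr", "Pass"}, {"Code"})]))
```

Sorting both the registered rules and the query with the same `ordered` helper is what keeps each rule on a single path of the tree, which mirrors the requirement that the loops in lines 7 and 12 respect the enforced order.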


FD trees can be used for compositional discovery thanks to their efficient search algorithm and their support of behavioural adaptation. Compositional discovery looks for possible compositions of services that, together, fulfil the query. There are two complementary ways to approach compositional discovery with FD trees. First, we can allow the needed features to be obtained from different services, i.e., we can use FD rules from several services to subsume the rules of the query. This initial approach is straightforward and it has the same complexity. A second approach is to extend the results with services that are further down in the tree than what is allowed by the features (required and ending) of the original query. This means that if a feature f not provided by the query is required to reach valid services, then we can automatically generate a new discovery query looking for that very feature f whose required and ending features are the same as those of the original query. Once the services are discovered, behavioural adaptation [CMS+ 09a, MP09a, MPS08] will take care of orchestrating the services accordingly while avoiding deadlocks and livelocks.

Example 5.3.3 Let us consider the same Picasa Web and Flickr services (Fig. 5.4 and Fig. 5.6) and a third auxiliary service (a) which, given a user and a password (Usr and Pass), provides the Flickr API key (K) for that user. In other words, service a complies with K ← {Usr, Pass} | ∅. With these three services in the registry we want to discover a composition for the following discovery query:

Desc ← {Usr, Pass} | ∅    (5.9)

In this case, the query does not provide feature K and the discovery search (Algorithm 5.3.1) ends at the greyed node in the FD tree (see Fig. 5.8) without finding any suitable service. However, there exists an edge outgoing from the greyed node which would deliver service f, but it requires the missing feature K. Therefore, we can still succeed in our original query if we create an additional auxiliary query K ← {Usr, Pass} | ∅. This latter query returns service a and, with it, we obtain the final solution for the discovery: the composition of Flickr and service a.
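The generation of auxiliary queries in this example can be sketched as follows. This is only an illustration under several assumptions: the registry is a flat list of rules rather than an FD tree, the Flickr rule `Desc ← {Usr, Pass, K} | ∅` is inferred from the example and not taken from the actual service description, and the function name `compose` and its `depth` bound are hypothetical.

```python
def compose(rules, feature, required, ending, depth=1):
    """Return compositions (frozensets of service names) able to provide `feature`.

    `rules` is a flat list of (service, feature, required, ending) tuples; this
    flat list stands in for the FD tree and is an assumption of the sketch.
    """
    results = []
    for service, f, s_required, s_ending in rules:
        if f != feature or not s_ending <= ending:
            continue
        missing = s_required - required
        if not missing:
            results.append(frozenset([service]))      # direct match
        elif depth > 0:
            helpers = []
            for m in sorted(missing):
                # Auxiliary query: new target m, same required and ending features.
                found = compose(rules, m, required, ending, depth - 1)
                if not found:
                    break
                helpers.append(found[0])
            else:
                results.append(frozenset([service]).union(*helpers))
    return results


RULES = [
    ("f", "Desc", {"Usr", "Pass", "K"}, set()),  # assumed Flickr rule
    ("a", "K", {"Usr", "Pass"}, set()),          # auxiliary service a
    ("p", "Com", {"Usr", "Pass"}, {"Code"}),     # Picasa Web rule
]
print(compose(RULES, "Desc", {"Usr", "Pass"}, set()))
```

With `depth=0` no auxiliary query is generated, so the query for Desc returns no result, which corresponds to the search stopping at the greyed node.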

5.4 Conclusion and future work

We have presented a search tree over an abstraction of behavioural services which is able to discover services that can be adapted to fulfil the query. Software adaptation enhances service discovery by allowing us to find both services which perfectly match the query and services that could be automatically adapted to the query. Therefore, this approach returns more results than traditional service discovery techniques without adaptation. The discovery of adaptable services is based on features provided and required by the services and the query. The feature-based abstraction of stateful services presented in this thesis results in smaller and more efficient representations of services. These abstract representations of adaptable services are gathered into a search tree structure to support efficient discovery queries. Queries executed on this structure have a computational complexity of O(Q · S · log(S) · F) where S is the number of services in the registry, F is the number of different features and Q is the number of FD rules in the query. The services discovered through this search tree are guaranteed to be adaptable using behavioural adaptors such as those presented in Chapter 4 and Chapter 8. We consider that this low complexity, the additional results obtained because of adaptation, and the inclusion of behavioural concerns in the discovery make our approach well suited for the scalable discovery of stateful services.

Regarding future work, we plan to handle differently those features that should not be reused by the adaptor or that should be explicitly avoided by it. These features could be side effects, such as payments or removals, which require further considerations than those proposed in this chapter. This could be done by enforcing Linear Temporal Logic (LTL) properties over the features of the services as a second stage of the discovery. Finally, we plan to work further on compositional discovery and we have to evaluate the memory requirements of the search trees presented in this work and the complexity of creation, updates and removals of services from the search tree.


57. It is easier to change the specification to fit the program than vice versa. A.J. Perlis - Epigrams in programming

6

Automatic generation of adaptation contracts

Application design using black-box software such as WSs and Software Components has several advantages such as greater productivity and software reusability. Nevertheless, this design based on black-box software has to face an important issue: the adaptation of services with mismatches at signature and behaviour levels [BBG+ 06]. IDLs (like CORBA and WSDL for components and services, respectively) allow the composition of software written in different languages but, even though IDLs help solve the language barrier, they do not address behavioural incompatibilities. Most of the time, services cannot be reused as they are because interactions among them would lead to an erroneous execution, namely a mismatch. Formally, cases of mismatch lead the whole system into deadlock states. In practice, mismatch situations may be caused by message names which do not correspond (regular use of the services requires them to interact on the same message names), by the order of messages not being respected, or by a message required by one service not being provided by its partner.

In previous chapters we have described one possible solution to overcome such mismatches using adapters as services “in-the-middle” capable of mediating between the involved parties. Adapters can be seen as WS orchestrators which intercept client requests and forward them to the services while preserving a deadlock-free composition. In this way, adapters serve as a service replacement which properly supports the interface expected by the clients. In Chapter 4 we presented adaptors that performed better if they were given an adaptation contract. Other approaches [BBC05, BCP06] require the presence of such a contract. However, no insight was given about how this contract is constructed and it is assumed to be done by hand. This is an error-prone task which obliges the designer to have a full understanding of all the service details. Therefore, this chapter is devoted to introducing a tool we developed, called Dinapter, which addresses this issue and is able to generate valid adaptation contracts automatically.

Software adaptation is a very promising topic and it has been successfully applied to different implementation platforms such as BPEL [BP06] or WF [CSC+ 07]. Several proposals [BCP06, CPS08, YS97] already focused on signature and behavioural adaptation. However, all these approaches require a manual design of the adaptation contract, which may be tricky when the service protocols are complicated. Our solution complements these approaches by generating adaptation contracts from behavioural descriptions of services, which makes the adaptation process completely automated.

Moser et al. [MRD08] developed a platform (VieDAME) based on ActiveBPEL for the monitoring and service adaptation of BPEL processes. They dynamically replace services based on QoS in a non-intrusive manner using aspect oriented programming. They adapt services using Transformers but these transformers must be designed manually. Their work can be complemented by our approach by automatically generating these transformers.

As regards the automatic generation of adaptation contracts, Schmidt and Reussner [SR02] focused on the synchronization of two components accessing, or being accessed by, a third one. They introduced an algorithm based on synchronous product computation to semi-automatically solve missing message incompatibilities, but their approach fails to overcome signature mismatches and behavioural incompatibilities such as message reordering or message splitting/merging.

Autili et al. [AINT07] proposed a methodology for the automatic synthesis of adapters considering as input behavioural descriptions of components and a specification of the interactions that must be enforced in the system.
Then, their tool (Synthesis) generates composition code that exhibits only the specified interactions, and prunes those which lead to deadlocks. Similarly to [SR02], the same names of messages are assumed and some behavioural mismatches cannot be solved (such as message splitting/merging). In addition, this approach relies on a high-level description of the composition goal, and therefore does not work without said specification.

Let us now mention two related articles [BP06, MNBM+ 07] that tackled WS adaptation. In the first one, Brogi and Popescu [BP06] outlined a methodology for the automated generation of adapters capable of solving behavioural mismatches between BPEL processes. In their adaptation methodology they use the YAWL workflow as an intermediate language. Once the adaptor workflow is generated, they use lock analysis techniques to check if a full adaptor has been generated or only a partial one (some interaction scenarios cannot be resolved). They solve message reordering incompatibilities but their approach fails with signature mismatches. In addition, even though we apply our approach to BPEL services as well, we want our approach to be more general by working on abstract descriptions of services that can be extracted not only from BPEL but also from other programming languages and platforms such as WF [CSC+ 07].

Motahari Nezhad et al. [MNBM+ 07] presented an approach for assisting the developer to adapt new versions of existing WSs. In their approach, they use a schema matching tool called COMA++ [ADMR05] over the service WSDL signatures. Our approach has some similarities with their work (our heuristic plays a role similar to that of their evidences), and they introduce some interesting ideas about deadlock handling. However, although they are able to generate a mismatch tree that gathers all protocol mismatches, its resolution is not automatic.

Main contributions: We propose to generate contracts incrementally. Step by step, we explore the behaviour of the services adding the messages found to the contract in all possible ways. An exhaustive exploration would lead to an explosion of partial contracts so we guide the search with a heuristic to restrict the number of contracts to explore. The exploration is made by an informed-search algorithm (A* [RN95]) whereas the contract validation is made by an expert system [FH03] (Fig. 6.1).

Figure 6.1: Combination of expert system and A*

The rest of the chapter is organised as follows. In Section 6.1 we introduce the subset of abstract BPEL used to describe service interfaces, which is later illustrated with an example based on a simplified file exchange system. In Section 6.2, we explore the different parts of our approach and the details of the process. We conclude with possible future extensions and final comments in Section 6.3.


6.1 Motivating example in abstract BPEL

Throughout this document we assume that the behaviour of the services is modelled with FSMs according to Definition 2.1.2. This is not true in the real world, where the behaviour of the services is documented either in natural language or in an executable language such as BPEL. Therefore, as part of the ITACA toolbox, we developed an automatic conversion tool from a subset of abstract BPEL into our model for service interfaces.

6.1.1 From abstract BPEL to behavioural interfaces

In our model, we focus on the following BPEL activities:

<invoke> and <reply> activities are represented by output actions op!args where the arguments (args) are either the single inputVariable attribute (or variable in <reply>), or several arguments contained inside a nested element.

<receive> corresponds to an input action op?args where args are handled in a similar way as in the <invoke> activity.

<sequence> is modelled using a sequence of transitions in the FSM.

<pick> corresponds to more than one transition outgoing from the same state, all of them labelled with different input actions. Each of those represents a different <onMessage> element.

<if> activities are expressed by a furcation of τ-labelled transitions. Conditional expressions are abstracted by silent actions (τ) so the adaptation contract must assume that every execution branch is possible. Dinapter is restricted to work only with services which state which branch has been selected in order to continue with the adaptation; therefore, every branch has to start with a different output activity, which results in output actions right after the τ-labelled transitions.

<while> and <repeatUntil>. Because of the critical role played by the condition of these activities we model them as <if> or <pick> activities depending on whether the decision is made locally or on reception of a particular message. The branches of these activities are allowed to be loops, therefore we distinguish between pick-loops and if-loops.

6.1.2 Case study: a file exchange system

We now introduce a case study to illustrate our approach. It consists of a file exchange system composed of a client and a server which were built in different contexts, so they have mismatches in their signature and behaviour. We provide the abstract BPEL code of the server¹ in Listing 6.1. Although the rest of this document supports multi-party adaptation, Dinapter is focussed on the adaptation between two parties. Therefore, our BPEL code assumes that every message is received/sent from/to the other party instead of using partnerLinks and portTypes to define the particular source or destination.

Listing 6.1 Abstract BPEL code of the server process.

¹ The abstract BPEL code of the client is a single sequence of activities.

Figure 6.2: Example processes. (a) Client process behaviour (actions: user!name, password!pass, download!file, data?filedata). (b) Server process behaviour (actions: login?name,pass, connected!, getFile?file, result!file, noSuchFile!, quit?).

Example 6.1.1 Fig. 6.2 contains the FSMs of the client and server processes. The server (Fig. 6.2(b)) accepts a connection given a user name and a password (login?) and it confirms that the user has logged in (connected!). The user may perform several requests in a single session (getFile?) and every request has its single response. This response can be either the requested file (result!) or a notification that the requested file does not exist (noSuchFile!). Finally, when the client does not need more files, it leaves the session (quit?) and ends.

On the left-hand side, the client (Fig. 6.2(a)) was designed with fewer transitions than the server. Name mismatches occur in every message and, although the client has similar request and data retrieval methods (download! and data?, respectively), it fails to receive the log-in confirmation (connected!) and the notification of noSuchFile! from the server. The client has the log-in request split into two actions (user! and password!) where the parameters name and pass satisfy their counterparts in the login? action of the server. The client also fails to call the quit operation of the server.

It is obvious that, even though these services cannot interact properly, they could be adapted to cooperate in most cases. In order to achieve this goal we must obtain vectors between the operations with signature incompatibilities but we must also merge messages (the log-in requests), and include the missing operations (connected and quit) in such a way that both services end up in a final state.

6.1.3 An alternative notation for adaptation contracts

Dinapter promotes the generation of balanced contracts, which intuitively means contracts where roughly the same number of actions are mapped from one service to another. In order to better visualise this balance we are going to abuse the notation and allow vectors to contain more than one action on their sides, e.g., la_1, ..., la_L ♦ ra_1, ..., ra_R. This alternative notation follows the one described by Brogi et al. [BCP06] and it can be directly transformed into our adaptation contracts (Definition 2.1.1) by transforming these lists of actions into all the possible interleavings among them.

Σ^c = {
  user!N, password!P ♦ login?N, P    (v1)
  ♦ connected!                       (v2)
  download!F ♦ getFile?F             (v3)
  data?D ♦ result!D                  (v4)
  data?D ♦ noSuchFile!               (v5)
  ♦ quit?                            (v6)
}

(a) Vectors

$T^c = \{ s^c_0 \xrightarrow{v_i}_c s^c_0 \mid v_i \in \Sigma^c \}$;  c0 = $\langle \Sigma^c, \{s^c_0\}, s^c_0, \{s^c_0\}, T^c \rangle$

(b) Contract FSM

Figure 6.3: Adaptation contract for the client (Fig. 6.2(a)) and the server (Fig. 6.2(b)).

Example 6.1.2 For instance, v1 in Fig. 6.3 corresponds to the following vectors in the original contract notation:

  user!name ♦                           (v7)
  password!pass ♦ login?name, pass      (v8)

where the contract transitions must include $s^c_0 \xrightarrow{v_7}_c \xrightarrow{v_8}_c s^c_0$.

Example 6.1.3 Now we will explain what a programmer would do to design an abstract contract such as c0 in the example of Section 6.1.2. The programmer knows how a download session must proceed (its behaviour) and the correlation among the arguments. With all that knowledge, it is common sense that login must match the combination of user and password. This is so because they are at the beginning of their respective services so they will be called or received at the same stage of the communication. Also, since login requires the arguments of the other two actions, they will all be merged in the same vector (v1). Vectors v3 and v4 are simpler versions of the same case. These two are perfect vectors because they directly adapt a single call with its reception and all the arguments are satisfied. They only overcome signature mismatches.

The vectors v2 and v5 allow transitions unsupported by one of the services to be ignored so that the communication can proceed. The vector v5 requires additional consideration because the argument D is not provided and it is required to reach a final state. Finally, we must call the quit method when the transaction ends (accomplished by v6). Note that the execution of v6 is not triggered by any other interaction so the adaptor which complies with this contract must trigger v6 based on its knowledge of the process behaviours.

Our approach takes advantage of the following information in order to achieve reasoning abilities similar to those stated above:

Behaviour: We traverse the execution of the services using their behaviour. We can analyse the sequence of actions taking place and evaluate possible vectors for those sequences. Vectors are evaluated according to the compatibility in their communications (invoke and receive pairs), how well balanced they are (a similar number of actions on both sides), and the satisfaction of their arguments. These concepts will be explained in detail in Section 6.2.1.

Arguments: We try to the best of our ability to satisfy the arguments required by one side of the vector with those provided on the other side of the vector. It is still possible to generate contracts where the reception of the argument is in one vector and it is used in a different vector, but this would require the adaptor to keep track of the arguments received. Therefore, we promote adaptation contracts where no argument memory is needed for the sake of scalability. In any case, if there were no other alternatives, the actions with these arguments would be split into different vectors.

6.2 Generating adaptation contracts

We tackle the generation of adaptation contracts by adding, step by step, new vectors to an empty contract. During this process, we may only modify the last vector or append a new one at the end (see Figure 6.4). The behaviour of the services is traversed and the actions found are introduced into the contracts (underlined actions) in all possible combinations (i.e., on the left-hand side of the last vector, on the right-hand side of the last vector, or on either side of a newly created vector). In this way, we iteratively create more complete contracts.
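The incremental step described above can be sketched in Python as follows. The representation of contracts as tuples of vectors, the function name `children` and the assumption that the side of an action is fixed by the service it belongs to (client on the left, server on the right) are choices made only for this illustration.

```python
def children(contract, action, side):
    """Return the contracts obtained by adding `action` to `contract`.

    A contract is a tuple of vectors; a vector is a pair (left, right) of
    tuples of actions. `side` is "left" or "right" (the service the action
    comes from). Only the last vector may be modified.
    """
    def place(vector):
        left, right = vector
        if side == "left":
            return (left + (action,), right)
        return (left, right + (action,))

    result = [contract + (place(((), ())),)]          # append a new vector
    if contract:                                       # or extend the last one
        result.append(contract[:-1] + (place(contract[-1]),))
    return result


partial = ((("user!N",), ()),)
for child in children(partial, "password!P", "left"):
    print(child)
```

Each call produces at most two successors per action and side, so the explosion of partial contracts comes from the number of ways in which the actions of both services can be interleaved rather than from a single step.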


Figure 6.4: Part of a graph of incremental contracts².

In the example (Fig. 6.2), the arguments are already matched between the services. This is a requirement of our approach, i.e., both services must be defined with the same set of argument names. One way to achieve this match is to represent the arguments by their data types. In this case, our approach will promote vectors which adapt messages with the same data types.

6.2.1 Graph search with A*

The evaluation of all possible combinations of the service behaviours would lead us to an explosion of states (partial contracts). Therefore, the search through those states must be guided to reduce the number of explored states. The concepts stated at the end of Section 6.1.3 can be translated into a heuristic to guide the search using an informed-search algorithm, specifically A*. Informed-search algorithms require a cost and a heuristic function. The former is the cost to reach a particular point of the search while the latter is a guess of how much further the solution might be from that point. During the incremental construction of the contract, the cost of a contract is how many mismatches have been assumed in conjunction with how many partial contracts are in the path to that contract. The solution to this search will be a complete adaptation contract with the lowest cost and heuristic. We will design the heuristic and cost functions based on the vectors which constitute the contract.
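As a reminder of how an informed search uses these two functions, the following generic sketch orders the frontier by f = g + h. It is not Dinapter's modified A*: the function `a_star`, its parameters and the toy problem below are assumptions, and, as in the contract setting, g is taken to be computable from the state itself.

```python
import heapq
import itertools


def a_star(start, successors, g, h, is_goal):
    """Best-first search ordered by f = g + h; returns the first goal found."""
    counter = itertools.count()      # tie-breaker so states are never compared
    frontier = [(g(start) + h(start), next(counter), start)]
    seen = {start}
    while frontier:
        _, _, state = heapq.heappop(frontier)
        if is_goal(state):
            return state
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (g(nxt) + h(nxt), next(counter), nxt))
    return None


# Toy problem: build a string of length 3 where every "b" is penalised.
best = a_star(
    start="",
    successors=lambda s: [s + "a", s + "b"] if len(s) < 3 else [],
    g=lambda s: len(s) + s.count("b"),   # cost grows with every added symbol
    h=lambda s: 3 - len(s),              # remaining symbols to add
    is_goal=lambda s: len(s) == 3,
)
print(best)
```

In the toy problem the penalised symbol plays the role of a mismatch and the remaining length plays the role of the actions that are still to be included in the contract.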

² The function f, which represents the penalization associated with the given contract, will be explained in detail in Section 6.2.1.


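Definition 6.2.1 The valuation v of a vector m = (la_1, ..., la_L ♦ ra_1, ..., ra_R) is defined as follows:

$$
\begin{aligned}
v(m) ={} & k_3 \left| \sum_{i=1}^{L} r(la_i) - \sum_{i=1}^{R} r(ra_i) \right| + k_3 \left| \sum_{i=1}^{L} s(la_i) - \sum_{i=1}^{R} s(ra_i) \right| && (6.1) \\
& + \mathit{mindet}(m) && (6.2) \\
& + k_6 \cdot \mathit{ins}(m) && (6.3)
\end{aligned}
$$

where

$$
\mathit{mindet}(m) =
\begin{cases}
k_4 \cdot r(la_1) & \text{if } R = 0 \wedge L > 0 \\
k_4 \cdot r(ra_1) & \text{if } L = 0 \wedge R > 0 \\
k_5 \cdot r(la_1) \cdot r(ra_1) & \text{if } L > 0 \wedge R > 0 \\
0 & \text{otherwise}
\end{cases}
\qquad (\mathit{mindet})
$$

$$
r(x) =
\begin{cases}
1 & \text{if } x = op?args \\
0 & \text{otherwise}
\end{cases}
\qquad (\mathit{rec})
$$

$$
s(x) = 1 - r(x) \qquad (\mathit{sen})
$$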

The function ins : M → N is defined in such a way that ins(m) is the number of unsatisfied arguments in m, that is, the number of provided/required arguments on one side which are not required/provided by the other side. Positive constants k3, k4, k5 and k6 weigh the different valuation terms.

The purpose of the valuation of a vector (v) is to represent how bad a single vector is. The higher the vector valuation, the worse it is for the adaptation. A perfect vector should have a value of 0. The function v is informally explained as:

Balance: The first line (6.1) of the equation defining v includes a penalization because of an unbalanced vector. If two services are directly adaptable, an ideal adaptation contract would contain a vector for each pair of actions. Each of these vectors would contain a single action on each side (one per service) representing that these actions must be directly adapted. Therefore, the actions of the services are adapted one to one. For instance, it is more specific to have a vector such as

download!F ♦ getFile?F

where it is guaranteed that for every download request there will be an eventual getFile, than the more general vectors

download!F ♦ ;    ♦ getFile?F

where there is no direct correlation between the two actions.

Vector indeterminism: Line 6.2 stands for the penalization of vectors which start with receive actions on both sides. This is so because the adaptor should trigger those vectors under its own responsibility, without receiving any message from the services indicating such a thing. Nonetheless, in some cases, it is possible to know without ambiguity when such vectors should be triggered depending on the behaviour of the services. As an example, a vector such as

data?D ♦ noSuchFile!

is preferred over a single data?D ♦ . The former unambiguously specifies that the vector is triggered only when noSuchFile is received by the adaptor whereas the latter is ambiguous in that regard.

Satisfiability: Every argument sent should be used and every argument needed must be satisfied. We can achieve these objectives by promoting those vectors where all the arguments are used and satisfied. If all the arguments are satisfied in the same vector (and not in subsequently fired vectors) no argument memory will be required in the adaptor. The penalization for unsatisfied arguments in line 6.3 serves to correlate actions based on their arguments and it enhances the adaptor efficiency.

There are constants to weigh balance (k3), vector indeterminism (k4 and k5) and satisfiability (k6) according to our adaptation policy.

An adaptation contract can be indeterministic in two ways: when a vector is not triggered by any message received (vector indeterminism), and when the same sequence of messages triggers several vectors (contract indeterminism). In order to define the latter, we need to know when two sequences of messages are distinguishable by the adaptor.

Definition 6.2.2 Two sequences of communicative actions $\vec{a} = a_1, \ldots, a_n$ and $\vec{a}' = a'_1, \ldots, a'_m$ are distinguishable if, and only if:

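$$
\mathit{dist}(\vec{a}, \vec{a}') \;\dot{\Leftrightarrow}\; \exists j > 0 \mid \forall i,\ 0 < i < j,\ (a_i = a'_i),\ s(a_i) = s(a'_i) = 1 \text{ and:}
\qquad (\mathit{dist})
$$

i) if $m < j \leq n$ then $s(a_j) = 1$

ii) if $n < j \leq m$ then $s(a'_j) = 1$

iii) if $j < n, m$ then $a_j \neq a'_j$ and $s(a_j) + s(a'_j) \geq 1$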

Informally, two sequences are distinguishable if they differ in at least one invoke action and if all the previous pairs of actions are the same and are invocations. This particular definition of dist requires a timeout in the adaptor to distinguish between sequences where only one of the first different actions is an invoke operation. For instance, the sequences a1!, a2!, ... and a1!, a3?, ... require a timeout because, after receiving a1!, we cannot know if we need to call a3? or if we must wait to receive a2!. For this reason, we must delay (with a timeout) the invocation of a3? to wait for the possible reception of a2!.

Definition 6.2.3 Two vectors (m and m') such that $m = (la_1, \ldots, la_L \;♦\; ra_1, \ldots, ra_R)$ and $m' = (la'_1, \ldots, la'_{L'} \;♦\; ra'_1, \ldots, ra'_{R'})$ are ambiguous if, and only if:

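$$
\mathit{amb}(m, m') \;\dot{\Leftrightarrow}\; \neg\mathit{dist}(\vec{la}, \vec{la}'),\ \neg\mathit{dist}(\vec{ra}, \vec{ra}') \text{ and either:}
\qquad (\mathit{amb})
$$

i) $L, L' > 0$ and $la_1 = la'_1$, $s(la_1) = s(la'_1) = 1$

ii) $R, R' > 0$ and $ra_1 = ra'_1$, $s(ra_1) = s(ra'_1) = 1$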

Two vectors are considered ambiguous if they are triggered by the same sequence of invocations and their sides are not distinguishable. Definition 6.2.4 Contract indeterminism is penalized by the function cindet : C → N, defined as:

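$$
\mathit{cindet}(c) =
\begin{cases}
k_7 & \text{if } \exists i, j,\ i \neq j \mid m_i, m_j \in c \text{ and } \mathit{amb}(m_i, m_j) \\
0 & \text{otherwise}
\end{cases}
\qquad (\mathit{cindet})
$$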

We can define the heuristic and the cost of a contract depending on how bad its vectors are (v). As we saw in Figure 6.4, any child contract (right-hand side) has one action more than its parent. This action will either be joined with the last vector of its parent or it will create a new vector. Therefore, only the last vector can be modified, so all the other vectors are immutable in the descendants of the contract. The value (v) of the last vector belongs to the heuristic because of its dynamic nature while the values of the other vectors belong to the cost because they will not be changed. We promote contracts with the lowest number of actions, so every action included in a contract will increase the cost of that contract. Therefore, the number of remaining actions is a good estimation of the future cost of the final solution and it belongs to the heuristic.

Definition 6.2.5 The heuristic (h) and cost (g) functions (h, g : C → N) establish the decision criteria of the A* algorithm. Given a contract c = [m_1, ..., m_n], h and g are defined as follows:

$$
h(c) = k_2 \left( \mathit{cindet}(c) + v(m_n) \right) + k_1 \cdot \max(0, N - n(c)) \qquad (h)
$$

$$
g(c) = k_1 \, n(c) + k_2 \sum_{i=1}^{n-1} v(m_i) \qquad (g)
$$

where N is the number of communicative actions in the services, and n : C → N the number of communicative actions in the given contract. Constants k1 > 0 and k2 ≥ 0 adjust the importance given to the number of actions in the contract, and the weight of v and cindet , respectively.
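The two functions can be evaluated numerically with a short sketch. The vector valuations are taken as given (those listed in Table 6.1 for the contract of Fig. 6.3), whereas the function names, the numeric weights and the value N = 10 (four client actions plus six server actions counted from Fig. 6.2) are assumptions made only for illustration.

```python
def cost(values, n_actions, k1, k2):
    """g(c): k1 * n(c) plus k2 times the valuations of all vectors but the last."""
    return k1 * n_actions + k2 * sum(values[:-1])


def heuristic(values, n_actions, total_actions, cindet, k1, k2):
    """h(c): last-vector valuation and contract indeterminism, plus remaining actions."""
    return k2 * (cindet + values[-1]) + k1 * max(0, total_actions - n_actions)


k1, k2, k3, k6 = 1, 2, 3, 5        # arbitrary illustrative weights
values = [k3, k3, 0, 0, k6, k3]    # v(v1)..v(v6) as listed in Table 6.1
g = cost(values, n_actions=11, k1=k1, k2=k2)
h = heuristic(values, n_actions=11, total_actions=10, cindet=0, k1=k1, k2=k2)
print(g, h)
```

With these weights the results coincide with the symbolic expressions of Table 6.1, i.e., g(c) = 11k1 + k2(2k3 + k6) and h(c) = k2 k3.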

Table 6.1: Heuristics, costs and values of contract c in Fig. 6.3

v(v1) = k3;  v(v2) = k3;  v(v3) = 0
v(v4) = 0;   v(v5) = k6;  v(v6) = k3
g(c) = 11k1 + k2(2k3 + k6)
h(c) = k2 k3

Example 6.2.1 In Figure 6.4, we can see that the most promising contract is the one on the top. This contract will be selected for further exploration because it has the lowest value of f, provided that the satisfiability of the arguments has a positive weight, i.e., k6 > 0.

f(c0) = g(c0) + h(c0) = (3k1) + (7k1 + k2 k3)

This exploration process will finally generate, among others, the proposed solution contract c in Fig. 6.3 with the cost and heuristic shown in Table 6.1.

This methodology for the generation of partial contracts, along with the costs and the heuristics, fits any informed-search algorithm. We use an A* variation in order to perform and guide the search. We have modified the A* algorithm because we do not need just one A* search but several of them in parallel. The reason for this is the branching in the execution flow caused by conditional activities. When the service is able to receive several messages and it continues its execution based on the message received (i.e., <pick>), we can model those branches directly by different branches of the A* tree. This is so because of the crucial role played by the communication in the decision. It is completely different with local choices where the decision is made by the service without any communication (i.e., <if>). Hence, we have to create several new search trees, one per different choice. Finally, these different trees will eventually meet (when the conditional behaviour ends) and, therefore, we have to merge those partial contracts into a new complete one. We will not go into details but the creation and merging of these trees has a significant impact on the performance.

Example 6.2.2 Due to the bifurcation in the behaviour of the server (Fig. 6.2(b)) we have, among others, the following contract when the <if> node is explored:

c1 = {user!N, password!P ♦ login?N, P; ♦ connected!; download!F ♦ getFile?F}

At that point, two different A* trees are created to explore each branch of the condition. These branches result in the following partial contracts:

c2 = {user!N, password!P ♦ login?N, P; ♦ connected!; download!F ♦ getFile?F; data?D ♦ result!D}

c3 = {user!N, password!P ♦ login?N, P; ♦ connected!; download!F ♦ getFile?F; data?D ♦ noSuchFile!}

which are finally merged into

c4 = {user!N, password!P ♦ login?N, P; ♦ connected!; download!F ♦ getFile?F; data?D ♦ result!D; data?D ♦ noSuchFile!}

Figure 6.5: Output of Dinapter showing several alternative solutions for the same example

The search graph generated by the A* algorithm can be reused once a solution is found in order to discover alternative solutions. Figure 6.5 shows how Dinapter can offer several solutions, some of them with the same valuation. If the weights (ki) are set to obtain an optimistic heuristic function h, then these alternative solutions are guaranteed to have a higher cost than previous solutions. An optimistic heuristic function (sometimes called admissible heuristic function) is one such that, for every generated contract ca and every contract cb generated from ca, it holds that

h(ca) ≤ g(cb) − g(ca) + h(cb)

In addition, since A* is an exhaustive search algorithm, it will explore every possible contract (ordered by cost and heuristic) until it finds every possible solution. Therefore, as long as (i) it is fed with every possible combination which follows a given partial contract, (ii) the behaviours of the services have one or more reachable final states, and (iii) the cost and heuristic functions avoid infinite paths because the cost is strictly increasing, the search algorithm will eventually find at least a deadlock-free solution. In the worst case, the A* algorithm has exponential time and memory complexity. If no better solution is found, our approach will generate a trivial contract. A trivial contract is one with single-sided vectors of just one action each, which describes an adaptor that accepts (and ignores) every message and calls every needed operation with made-up arguments. Because it is always possible to find a trivial contract no matter how incompatible the services are, Dinapter does not make any decision about whether the given services should be adapted or not. To check the adaptability of the services, either the mechanisms described in Chapter 5 or similar work [DOS09] should be used. The heuristic and cost values (f) of the contracts are good measures of the incompatibilities between the services, but the contract should be further verified before synthesising and deploying it.
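The search described above can be sketched as a small best-first loop ordered by f = g + h. This is only an illustrative sketch, not Dinapter's implementation: the function name `a_star_contracts`, the toy vector names and the cost and heuristic used here are assumptions chosen to keep the example short. It stops as soon as the remaining partial contracts are more expensive than the solutions already found, which is only safe when h is optimistic.

```python
import heapq
from itertools import count


def a_star_contracts(start, expand, is_complete, g, h):
    """Explore contracts in increasing f = g + h; return all cheapest complete ones."""
    tie = count()                      # tie-breaker: the heap never compares contracts
    frontier = [(g(start) + h(start), next(tie), start)]
    seen = {start}
    solutions, best_f = [], None
    while frontier:
        f, _, contract = heapq.heappop(frontier)
        if best_f is not None and f > best_f:
            break                      # remaining contracts are worse than the solutions
        if is_complete(contract):
            solutions.append(contract)
            best_f = f
            continue
        for child in expand(contract):
            if child not in seen:
                seen.add(child)
                heapq.heappush(frontier, (g(child) + h(child), next(tie), child))
    return solutions


# Toy problem (hypothetical): a contract is a set of vector names.
NEEDED = frozenset({"login", "connected", "getFile"})
CANDIDATES = NEEDED | {"skip"}         # "skip" is a useless extra vector

solutions = a_star_contracts(
    start=frozenset(),
    expand=lambda c: [c | {v} for v in CANDIDATES - c],
    is_complete=lambda c: NEEDED <= c,
    g=len,                             # cost: one unit per vector
    h=lambda c: len(NEEDED - c),       # heuristic: vectors still missing
)
print(solutions)
```

In this toy setting the only solution returned is the contract containing exactly the three needed vectors; contracts that also include the useless vector have a larger f and are never expanded to completion.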

6.2.2 The expert system

It was a requirement that all the valuation criteria stated in Section 6.2.1 could be easily replaced or modified. In this way we could test new valuation techniques, include semantic information, or customize our contract generation process to our particular needs or environment restrictions. This is achieved using an expert system [FH03]. The expert system is in charge of traversing the behaviours of the services to generate the different alternatives available from a given partial contract. It calculates their costs and heuristics based on the newly included action and it feeds the A* algorithm with the generated graph. The search algorithm makes its choice and marks it to be further explored, so the expert system starts again from the chosen partial contract (Fig. 6.1). The expert system also recognizes when a generated contract is a possible solution by examining the traces which led to that partial contract. If both traces reach a final state and the contract contains all the required vectors, the contract is complete. Once the A* explores any of these complete contracts, the process ends, returning that contract and all the other solutions with the same value of f. All of this functionality is achieved by 62 rules (Fig. 6.6) which are executed every time the A* algorithm explores a new partial contract.

Listing 6.2 Rule which starts the search split.

(defrule split-graph
  (BehaviorNode (nodeType "IF") (OBJECT ?ifActivity))
  ?fact
  (retract ?fact)
  (foreach ?child ((?behaviour getChildren ?ifActivity) toArray)
    (splitGraph ?ifActivity ?child ?side ?contract)))

Figure 6.6: Activity diagram of the expert system rules. Each activity corresponds to one or more rules of the expert system. Activities: Expert System Unit Tests (18 rules), Split Graph (1 rule), Mark solutions (3 rules), Generate Children Contracts (8 rules), Valuate mappings (4 rules), Merge Graphs (5 rules), Prune equivalent contracts (2 rules), Set contract cost (3 rules), Set contract heuristic (4 rules), Auxiliary rules (14 rules). Data objects: Behaviors, Contract to explore, Contract Graphs, Contract Graphs [updated].

Listing 6.2 contains one of the rules stated above, in particular the rule in charge of splitting the search graph when it finds conditional branches. The client and server processes are differentiated by their ?side. This rule is triggered when a contract is marked for further exploration by the A* algorithm (childrenNeeded TRUE) and the node to process is an ?ifActivity. Then it retracts the fact stating that the node must be processed and splits the search into as many graphs as conditional branches.
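To make the effect of the split rule easier to follow, the same idea can be sketched outside the expert system. The sketch below is a hypothetical Python analogue, not a translation of the actual rule base: the classes `Node` and `SearchGraph`, the function `split_graph` and the `pending` list (standing in for the working memory of facts) are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_type: str                               # e.g. "IF", "PICK", "SEND"
    children: list = field(default_factory=list)


@dataclass
class SearchGraph:
    contract: tuple                              # partial contract built so far
    side: str                                    # "client" or "server"
    node: Node                                   # behaviour node to process next


def split_graph(graph, pending):
    """Return one search graph per branch if the next node is an IF."""
    if graph.node.node_type != "IF" or graph not in pending:
        return [graph]
    pending.remove(graph)                        # plays the role of (retract ?fact)
    return [SearchGraph(graph.contract, graph.side, child)
            for child in graph.node.children]


branch_ok, branch_err = Node("SEND"), Node("SEND")
start = SearchGraph(("download!F ♦ getFile?F",), "server",
                    Node("IF", [branch_ok, branch_err]))
pending = [start]
graphs = split_graph(start, pending)
print(len(graphs), pending)
```

Each new graph starts from the same partial contract, which is what later allows the contracts of the different branches to be merged again.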

6.2.3 Prototype tool: Dinapter

Dinapter belongs to ITACA and it is available at https://github.com/jamartinb/dinapter/. It takes as inputs the behaviour of the services encoded in abstract BPEL. Those behaviours are internally modelled into our behavioural interfaces that will be explored during the automatic generation of the contracts. The output is a set of adaptation contracts expressed in the notation introduced in Section 6.1.3. Table 6.2 shows statistics gathered from Dinapter run in several examples with the following values for ki:

k1 = k2 = k3 = 1;  k4 = 0;  k5 = 50;  k6 = 3;  k7 = 100

Table 6.2: Some statistics obtained from Dinapter

             e01  s02  v03  e06  e10  e07  e04   e12  e02c  e16
Picks          1    1    2    2    2    0    2     1     3    3
Ifs            0    0    0    1    0    1    2     1     1    3
Loops          ×    ×    ×    ×    ×    ×    √     √     ×    ×
Vectors       21   41   52   34   76   45   95    31    57   87
Trees          1    1    1   19    1   50   43    74   155  310
Exp. trees     1    1    1    9    1   18   31    48    53  116
Contracts     31   79  120   82  191  180  341   206   440  681
Exp. con.     10   25   32   38   43  100  142   142   258  365
Alt. con.      0    0    0    0    0    0    2     2     2    0
Solutions      1    1    2    2    1    9    4  0(1)     4    1

Client / Server (pairs in the order recovered from the source; their column assignment could not be recovered): 2/7, 5/12, 6/6, 4/6, 3/4, 6/5, 5/6, 5/4, 9/7, 7/7

The rows are as follows. Picks and Ifs are the number of event-driven conditions (picks) and regular conditional behaviours (ifs) in the services. Loops indicates the presence of loops in the example. Client and Server are the number of communicative actions in the client and the server, respectively. The next row is the total number of generated Vectors, followed by the number of search Trees. Contracts is the number of partial contracts generated. Exp. trees and Exp. con. are the number of search trees and contracts explored before reaching a solution. Alt. con. is the number of solutions that, in spite of being deadlock-free, adapt a branch of an event-driven condition where no useful results are obtained from the adaptation (e.g., a client which only connects and disconnects without doing any computation). This happens because the heuristic and cost functions consider that it is better to connect and disconnect than to deal with the incompatibilities that would be found otherwise. The last row is the number of valid Solutions found. Let us comment on two of these examples. The two processes in e12 are able to accept or reject the communication before performing their core functionality so, as their behaviours are quite incompatible, the first contract returned by Dinapter makes them refuse to communicate and end up in a final state. Nonetheless, if we execute another iteration of the process in the example e12, it returns a valid solution. The example e02c is the one described in Section 6.1.2. Something worth remarking about this table is that the most relevant factor for the complexity of our approach is the number of nodes which alter the execution flow (picks, ifs and loops), which is much more important than the number of transitions. Another important point


is that, if the ki parameters are not adjusted in accordance with our adaptation policy or the services have unsolvable incompatibilities, the tool might yield useless results (Alt. con.), as in the examples e02c, e04 and e12. Another interesting point is the relevant role played by the A* algorithm and the underlying heuristic function: even though there is a state explosion if the problem is difficult enough, they reduce the number of explored nodes to approximately half of the nodes generated. As we stated at the end of Section 6.2.1, the number of generated search trees (Trees) is proportional to the number of explored contracts (Exp. con.). Any enhancement in the heuristic and in the procedure for generating and merging those trees will greatly improve the efficiency of our tool.

Figure 6.7: Integration between ACIDE and Dinapter.

The integration within ITACA has enabled Dinapter to accept more input formats (WFs and STSs). Furthermore, the tool ACIDE (Fig. 6.7, also included in ITACA) provides a graphical interface where developers are assisted in designing their own contracts, or in reviewing, modifying and accepting the contracts generated by Dinapter directly from ACIDE. Another tool in ITACA, called Compositor [MPS08], accepts as input the contracts returned by Dinapter and applies the vectors in the proper

order to generate the protocol of the final adaptor. A comparison of the semantics behind the operations to be adapted is provided by another tool in ITACA, called Sim, which uses WordNet::Similarity [PPM04]. The inclusion of this semantic information in Dinapter enhances the matching of name-mismatch situations, improves the adaptation of event-driven conditions (picks) by taking into account the underlying semantics, and reduces the number of search steps needed to find a correct contract.
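The observation above that A* explores roughly half of the generated contracts can be checked directly against the Contracts and Exp. con. rows of Table 6.2. The short computation below only restates those two rows; the variable names are ours.

```python
# Rows "Contracts" and "Exp. con." of Table 6.2, one value per example.
contracts = [31, 79, 120, 82, 191, 180, 341, 206, 440, 681]
explored = [10, 25, 32, 38, 43, 100, 142, 142, 258, 365]

ratios = [e / c for e, c in zip(explored, contracts)]
overall = sum(explored) / sum(contracts)

print(f"overall explored/generated = {overall:.2f}")   # 0.49
print(f"per-example range = {min(ratios):.2f} .. {max(ratios):.2f}")
```

Over all ten examples, 1155 of the 2351 generated contracts are explored, although the ratio varies noticeably from one example to another.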

Figure 6.8: Adaptation with semantic information using WordNet::Similarity. (a) The client behaviour. (b) The medical service behaviour.

For instance, Fig. 6.8 shows part of the example e021. This example consists of a medical service (Fig. 6.8(b)) where patients request appointments with either a physician (P) or a specialist (S). On the other side (Fig. 6.8(a)), the client component presents name mismatches and it requests doctors (D) or pediatrists (E). The arguments match and there is a reception for every emission, so the four combinations of branches are possible (D ♦ P and E ♦ S; D ♦ P and E ♦ P; D ♦ S and E ♦ S; D ♦ S and E ♦ P). However, through queries to WordNet, Dinapter recognizes that “physician” is a synonym of “doctor” and that “pediatrist” is a hyponym of “specialist”, and it returns the right contract (D ♦ P and E ♦ S).
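The selection among the four candidate combinations can be illustrated with a small scoring sketch. The similarity values below are hypothetical placeholders (in the tool they would come from WordNet queries), and `best_mapping` is a name introduced here for illustration; the sketch simply picks the combination with the highest total similarity.

```python
from itertools import product

# Hypothetical similarity scores in [0, 1]; real values would come from WordNet.
SIMILARITY = {
    ("doctor", "physician"): 1.0,      # synonyms
    ("doctor", "specialist"): 0.6,
    ("pediatrist", "physician"): 0.5,
    ("pediatrist", "specialist"): 0.8, # hyponym of "specialist"
}

client_ops = ["doctor", "pediatrist"]
server_ops = ["physician", "specialist"]


def best_mapping(client_ops, server_ops, similarity):
    """Try every assignment of a server operation to each client operation."""
    candidates = product(server_ops, repeat=len(client_ops))
    return max(
        candidates,
        key=lambda targets: sum(similarity[(c, s)] for c, s in zip(client_ops, targets)),
    )


mapping = dict(zip(client_ops, best_mapping(client_ops, server_ops, SIMILARITY)))
print(mapping)
```

With these placeholder scores the chosen combination is D ♦ P and E ♦ S, matching the contract reported in the text.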

6.3 Conclusion

We have shown an approach for the automatic generation of adaptation contracts which overcomes signature and behavioural mismatches. The generated contracts successfully solve missing messages and they are able to merge and split messages depending on their arguments. There are several papers in the literature [BCP06, CPS08, YS97] which use these contracts to automatically build behavioural adaptors. Traditionally, these contracts were manually written and they required the designer to fully understand the details of the services involved. Our proposal complements these approaches with the automatic generation of adaptation contracts and it is supported by the Dinapter tool. Similarly, the proposed techniques for dynamic adaptation, service discovery and secure adaptation (Chapter 4, Chapter 5 and Chapter 8, respectively) can take advantage of the adaptation contracts generated using Dinapter. In order to generate optimal solutions, the A* algorithm requires the heuristic function to be admissible and monotone. The proposed heuristic function (h) may decrease drastically in subsequent steps, which causes the heuristic not to be admissible. This inconvenience can be controlled by the constants k1 and k2. With values of k2 ≥ k1 we can promote a faster and narrower search at the risk of missing the best solution or, otherwise, with k2 ≈ 0 we can force the tool to find the best solution at the expense of generating more partial contracts. Other informed search algorithms can be used instead of A*. Our work has focused on contract generation between two services. Future work includes extending our approach to more expressive languages and semantics such as SCC [BBDN+06], which focuses on service orchestration and supports explicit session and exception handling. In any case, this work shows the feasibility of the proposed approach.
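The remark about the heuristic decreasing drastically can be made concrete by checking the condition h(ca) ≤ g(cb) − g(ca) + h(cb) from Section 6.2.1 on pairs of contracts. The following sketch uses made-up cost and heuristic values and a hypothetical helper named `violations`; it only illustrates how a sharp drop in h between a contract and its child breaks the condition.

```python
def violations(edges, g, h):
    """Pairs (ca, cb), with cb generated from ca, where h(ca) > g(cb) - g(ca) + h(cb)."""
    return [(ca, cb) for ca, cb in edges if h[ca] > g[cb] - g[ca] + h[cb]]


# Made-up values: h drops from 8 to 2 while g only grows by 1.
g = {"ca": 3, "cb": 4, "cc": 5}
h = {"ca": 8, "cb": 2, "cc": 1}
edges = [("ca", "cb"), ("cb", "cc")]

print(violations(edges, g, h))   # [('ca', 'cb')]
```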
Regarding validation, another of our future lines of research is to apply the results of our tool to other methodologies for adaptor generation and to check their bisimilarity with adaptors generated from hand-written contracts. The combination of heuristic, cost and A* quickly solves simple mismatches but, the bigger the incompatibilities are, the more time it consumes exploring other ways to overcome them. This allows our approach to tackle different degrees of incompatibility, but it wastes too much time when the incompatibilities are irremediable. One course of action would be to complement our proposal with an algorithm to automatically recognise these irremediable incompatibilities or an algorithm to cut the behaviour of the services into smaller adaptable pieces.

Part III

QoS adaptation: security

God does not play dice.
A. Einstein

God not only plays dice, He also sometimes throws the dice where they cannot be seen.
S. Hawking

God does not play dice with the universe; He plays an ineffable game of his own devising, which might be compared, from the perspective of any of the other players, to being involved in an obscure and complex version of poker in a pitch dark room, with blank cards, for infinite stakes, with a dealer who won’t tell you the rules, and who smiles all the time.
T. Pratchett

7 Security adaptation

Security concerns over WSs are regarded as a major research challenge for service oriented computing [PTDL07]. The loose coupling of WSs, which enables their higher reusability and interoperability, is often constrained by new security requirements over those services. If we assume that the services we want to orchestrate are incompatible at signature or behavioural levels, and therefore the messages sent and expected are incompatible, we will have to deal with the mismatches among the security elements in those messages. For instance, a password that must be encrypted but is sent as clear text, some data which must be sent with their hash values in order to check their integrity but whose digests are missing, a nonce (which stands for number used once) that is sent to correlate a request with its reply but is not returned in the reply, or something that must be signed but is not. In this chapter, we formalise stateful services with security requirements over their messages. This will allow us not only to adapt security-enabled services but also to express in a concise manner the security requirements for the whole orchestration and to analyse security properties over the expected behaviour of the system. The adaptation of security-enabled messages, such as SOAP messages enhanced with WS-Security, poses a new set of problems since the different parts of a message might be encrypted, signed or digested. In this case, security adaptors must be able to i) decrypt and verify some of these data on reception, and ii) encrypt, sign and digest some other parts of the message as expected by the destination service.

Example 7.0.1 Compare the following two interface actions:

login!user, pass                    (7.1)
login!User, Enc(Key, Pass)          (7.2)

On the one hand, Equation 7.1 is an example of an interface action (see Definition 2.1.2) like the ones we have used throughout the previous chapters. It can mean, for instance, that a service wants to log in using its username and password. On the other hand, Equation 7.2 is also an interface action, so both expressions belong to Σi. Its meaning is the same but, in this case, the type of the second argument is structured with cryptographic operations, and hence it implies that the password is encrypted using a key. It is worth noting that, to better differentiate interface actions with and without security, we are going to denote the types of the former with capitalised words.

Example 7.0.2 Given the following security adaptation vector

encrypted!Enc(Key, Data), Hash(Data) ♦ cleartext?Data, Timestamp

the adaptor should be able to receive action encrypted, decrypt its first argument to obtain Data, and verify its integrity using the hash value given in the second argument. Then, Data and a Timestamp with the current time must be sent as clear text to the other side using action cleartext.

In addition to single message security, security protocols such as those expressed by WS-SecureConversation or WS-Trust impose further restrictions on the sequences of exchanged messages. These protocols might involve session keys (i.e., keys which are created, shared and used throughout the session), timestamps and nonces to avoid replay attacks, and proof of possession mechanisms (where a service authenticates its identity by proving that it has certain private information, e.g., a credit card number or a PIN code), among others. This manipulation of security-enabled messages is particularly difficult due to the sensitive nature of their contents so, depending on the particular system, security adaptors must be able to access some sensitive information (such as private keys), to generate new security tokens such as new keys and timestamps, and to establish and assess trust relationships. All these new capabilities complicate even more the design of security adaptors. For instance, an adaptor compliant with Example 7.0.2 needs to possess the key Key in order to decrypt the Data and it must be capable of generating Timestamp. In this chapter we present security adaptation contracts (SACs) that not only address incompatibilities between services but also cover several security-related WS-* specifications in a high-level and integrated manner, hence reducing the effort required from the system architect.
SACs enable us to specify how to adapt signature, behaviour and security incompatibilities among services; they describe the security checks that must be performed over the received messages; and, due to their central role in the conversation, they provide a formal framework to analyse the behaviour of the system.
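The steps required from an adaptor that follows the vector of Example 7.0.2 can be sketched as follows. This is a minimal illustration under explicit assumptions: the XOR-based `toy_cipher` stands in for Enc and is not secure encryption, SHA-256 is assumed for Hash, and the function name `adapt` and its `clock` parameter are ours.

```python
import hashlib
import hmac
import time


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR stream used only for illustration; it is NOT secure encryption."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def adapt(key: bytes, enc_data: bytes, digest: bytes, clock=time.time):
    """encrypted!Enc(Key, Data), Hash(Data)  ->  cleartext?Data, Timestamp"""
    data = toy_cipher(key, enc_data)                       # decrypt with the known Key
    if not hmac.compare_digest(hashlib.sha256(data).digest(), digest):
        raise ValueError("integrity check failed")         # Hash(Data) does not match
    return ("cleartext", data, clock())                    # attach a fresh Timestamp


key, data = b"secret", b"report.pdf"
message = (toy_cipher(key, data), hashlib.sha256(data).digest())
print(adapt(key, *message))
```

The sketch makes the two extra capabilities explicit: the adaptor must already hold Key, and it must be able to generate the Timestamp itself.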

Several proposals [MPS08, DSW06, ITY07] focus on solving behavioural incompatibilities among services using adaptation contracts. In [DSW06], the authors presented an approach to behavioural adaptation based on a set of adaptation operations which defined basic relation patterns between message names. They also presented a visual notation for describing the mappings between services. The authors of [ITY07] focused on dynamic adaptation of BPEL processes using semantic rules. These rules could be considered as semantic adaptation contracts which generate an appropriate orchestration of services when they receive new requests. Mateescu et al. [MPS08] established the foundations of this work, as they propose a generative approach to adaptors based on regular expressions of vectors, with the advantage of using a process algebra encoding and on-the-fly generation techniques. However, all the papers mentioned above lack support for security concerns, which is the main contribution of this chapter. The problems faced by behavioural WS adaptation are similar to those present in the automatic generation of controllers for composite systems. The authors of [PMBT05] aimed at the automatic composition of distributed business processes. Given a set of BPEL processes and an abstract description of the composition, expressed in their own goal-oriented language, called EaGLe, they are able to automatically generate a controller implemented in BPEL. This controller plays an identical role to our adaptor. However, there are several differences due to the fact that they specify the composition by its goal whereas we pay particular attention to the detailed mapping between services, expressed in SACs. This additional information has allowed us to express and adapt security requirements, something which was not addressed in [PMBT05].
In addition, in order to track and match the information exchanged, they use information about the internal operations of the services and they use knowledge-level planning techniques [PMBT05] to obtain better scalability by abstracting the actual values. Likewise, our approach handles symbolic values, but no internal information of the services is required apart from their public behaviour. The SACs described in this chapter allow us to verify at run time the security requirements over the messages exchanged and to adapt services with different security policies. As stated in WS-Security, messages with security requirements are not necessarily secure in the face of certain attacks, so further mechanisms to prove their security are required. Security adaptation does not make the orchestration secure against attacks which were not foreseen by the policies of the services orchestrated and, in fact, it multiplies the points of attack by the number of services orchestrated. Therefore, extensive security analysis is required to prove that the orchestration is safe against some attacks. This security analysis and the synthesis of secure security adaptors are covered in Chapter 8. The authors of [AB05] proposed two approaches to express and prove secrecy properties of security protocols, one based on type systems and another based on logic programming. The security constructors used by SACs are taken from theirs, so both approaches perform a generic treatment of cryptographic operations and the orchestrations obtained by our approach could be easily verified by their work. The deployment of security adaptors is a particularly important issue because, depending on the location where the adaptor is deployed, one or several of the service security policies will prevail in the medium. In Example 7.0.2 for instance, if the adaptor is deployed around the left-hand side service, then the data go unprotected in the medium. This would not happen if the adaptor were, instead, deployed around the right-hand side service. One application for SACs is to generate security adaptors as wrappers [Sal08] for services without security capabilities in order to orchestrate them within a security-enabled system. In this scenario, wrappers are imposed by the orchestrator to make the communication secure but, on the service provider side, it must be verified that the wrappers to be deployed do not interfere with the provider's system or pose any security risk. In this situation, Proof Carrying Code [NL98] is a promising solution where an untrusted third party can generate the adaptation code corresponding to a given SAC. This SAC serves as the safety policy that must be proved by the adaptor provider and, finally, service providers will be able to verify the proof against the SAC and the received code before deployment. SACs are intended to be used at design time to increase the interoperability with legacy services or services under different security policies. This is done by including security capabilities into the behavioural adaptors seen in Section 2.3.
Sánchez-Cid et al. [SCMS+09] propose a radically different approach where security is not tightly coupled with services but defined as other compositional entities of the system, therefore allowing the recomposition of the system to achieve different security requirements. These security requirements, called S&D Properties, must be addressed throughout every software engineering stage and are expressed as i) S&D Classes, the highest level of abstraction, which deals with concepts such as “use a confidential channel” or “this information must be authenticated”; ii) S&D Patterns, which specify classes using a precise semantic description of the mechanisms that enable a particular S&D Class; and iii) S&D Implementations, which are specific algorithms that implement those mechanisms. There are two main advantages of this work that are worth highlighting: they achieve interoperability through the use of S&D Classes at design time, hence allowing developers to dynamically choose different S&D Patterns and S&D Implementations at run time; and S&D Patterns allow both static and dynamic analysis over the security mechanisms they represent. Security adaptation could also be integrated within their work, since adaptors are still needed to cover the gap between the interfaces of S&D Classes and their corresponding set of S&D Patterns. In addition, services must be S&D-aware whereas our approach does not require such intrusion in the services. Compared to other kinds of contracts, SACs are specifications on how the orchestration must proceed. As contracts, security adaptation contracts are subject to negotiation [BCP04], but this aspect is not covered in this document. In terms of monitoring, security adaptation contracts represent the security policy that must be enforced on messages intercepted by the adaptor at run time. In the presence of security violations, the adaptor is in charge of taking the appropriate measures, such as interrupting every communication with the compromised service and notifying the other services in the orchestration. Main contributions: The main contribution of this chapter is the inclusion of security concerns into WS adaptation contracts. These new contracts allow us to express the different adaptations required over the behaviour of the services for them to interoperate properly, as well as the security requirements that must be met during the communication. The purpose of the security requirements in adaptation contracts is to specify: i) the security checks that must be satisfied by every received message and their sequence in secure conversations, and ii) the transformations which must be effected on the security policy of a service so that it adapts to the security constraints of the system. We will focus on behavioural adaptation extended with security QoS.
The goal is to obtain an adaptor which orchestrates the services avoiding deadlocks and livelocks, and which complies with the given contract and secrecy property (e.g., “the PIN code must remain secret”). The rest of the chapter is structured as follows. In Section 7.1, we present an example to motivate and illustrate how SACs are used to orchestrate services which are incompatible in behaviour and security QoS. Section 7.2 formalises security messages and how they are used in the behaviour of the services. Once security-enabled services are defined, we proceed to formalise security adaptation in Section 7.3. Adaptation is enabled by SACs, which are composed of security contract terms (or contract terms, for short) (Definition 7.3.1). These contract terms are fundamental for security adaptors and will serve to match the messages sent and expected by the services (Definition 7.3.3). This matching will trigger the communication between the adaptor and any of the services. Finally, we conclude with Section 7.4.

Table 7.1: Mapping between the operations of services a and b

Service a                                     Service b
                                              proceed!
request!name, Enc(Key, Pass), Req, Nonce      login?Name, Pass
                                              request?Req
refused?Nonce                                 denied!
reply?Data, Hash(Data), Nonce                 result!Data
                                              upload?File

7.1 Motivational example Web services with security requirements present a tight coupling with the format, security algorithms and protocols used by the messages in their communications. In this section we illustrate a scenario where such restrictions hinder the proper communication between services and, in this way, we motivate the need for SACs. Example 7.1.1 In Fig. 7.1, we present the behaviour of four services for performing Secure SHell (SSH) operations. Services a and a0 try to get access to the functionality provided either by service b or b0 , but incompatibilities prevent their communication. The behaviour of these services are represented by the FSMs described in Definition 2.1.2 where operation names are followed with ‘!’ and ‘?’ in output and input actions, respectively. Operations also have a list of arguments with security expressions. Internal operations (i.e., transitions without external communication) are represented with τ . In this example we shall focus on services a and b. On the one side, service a (Fig. 7.1(a)) can perform several requests with its credential (typed Key and Pass) and request (Req) as arguments. These requests can be refused or followed by replies. It is important to highlight that the values for name and pass must remain constant throughout the session, so the same values must be sent in every iteration of the loop. Additionally, parameter Nonce is used to correlate requests and replies (like correlation sets in BPEL) and to avoid replay attacks. On the other side, service b (Fig. 7.1(b)) begins by notifying its availability with a proceed message. Then it must receive a login message with the credentials, which can be either accepted (another proceed) or 122

7.1. MOTIVATIONAL EXAMPLE

request!name,Enc(key,pass),req,nonce refused?nonce reply?data,Hash(data),nonce

(a) Client a

denied!

proceed! login?name,pass

upload?file proceed!

request?req result!data (b) Server b

request!name,Enc(key,pass),req,nonce refused?nonce reply?data,Hash(data),nonce (c) Client a0

request?req login?name,pass result!data (d) Server

b0

Figure 7.1: Behaviour of services simulating different ways to perform Secure SHell (SSH) operations

123


denied. If the login is accepted, several requests can be made (with their results). Additionally, the service allows users to upload files. This example might give the impression that we adapt secured services with services without security. This is true for this particular example but we will also cover cases where all the services are secured but using incompatible security policies. Concretelly, the services of this example are incompatible at signature (e.g., refused vs denied in Table 7.1), behaviour (e.g., unexpected proceed ) and security levels (e.g., service a uses an encrypted password Enc(Key, Pass), requires the digest of the data Hash(Data), and correlates requests and replies with the argument Nonce). Services a and b present complementary functionality and semantics but, due to these incompatibilities, security adaptation is required to make them to cooperate successfully. Behavioural adaptation is achieved by deploying an adaptor in the middle of the communication with such behaviour that it receives, recomposes and forwards every messages it receives in a way that all the services can interact properly and end up in a stable state. The behaviour of such adaptors can increase exponentially with the complexity of the services involved and their design requires taking into account all the possible interleaving between the messages exchanged. Therefore, we propose to describe adaptors with SACs, which abstract away from concurrency issues and focus on the mapping between the operations, arguments and security of the services. This mapping is expressed as a set of vectors which correlate the operations of the services. These vectors use symbolic parameters in place of arguments and they contain security expressions to process, analyse and recompose the messages. 
In addition, we might want to enforce further requirements over the adaptation, such as “a particular message must not be sent more than x times” or “an operation A will be (un)available until the operation B is called”. These requirements constrain the order in which the interactions expressed by vectors are applied. In order to represent such high-level requirements, adaptation contracts are FSMs with vectors labelling the transitions. If such restrictions are not required, the contract needs only a single state, both initial and final, with all the transitions looping on it.

Example 7.1.2 There are several incompatibilities between services a and b (Fig. 7.1) which are solved by c5 in Fig. 7.2. First, the proceed messages sent from b are received by the adaptor due to vector v_p. Operations have been prefixed with the service identifier. Vector v_l maps the login request at service b with the appropriate arguments coming from the first request of a. Symbolic parameters will be bound to the received values (e.g., I, P, R and N on the left-hand side of v_l). Parameters with a superscript ‘∧’ will be replaced by values already known to the adaptor. For instance, Enc(K^∧, P) indicates that the value that will be bound to parameter P is encrypted with a known key K^∧. This key, which must be known at the beginning of the session, is given in an initial adaptor environment E5. This environment states that parameter K is a key, and x is its run-time value. Parameters I and P are used to compose the login message to be sent to service b. Vector v_q processes the Req argument (in R) of the initial request in v_l. Once the first request has been fully received, subsequent requests can be mapped directly with vectors v_r and v_d. Arguments Name and Pass are checked to be always the same due to the superscript ‘∧’. Nonces are updated (vectors v_l and v_r) and reused (v_d and v_f) accordingly. Vector v_r can conflict with v_q; therefore, we use T_c (Fig. 7.2(b)) to enforce that v_q is triggered only the first time. Vector v_q is guaranteed to be triggered first because of the need to send b:login? at the beginning of b. Figure 7.3 shows a security adaptor which complies with c5 and services a and b. The transition labels have been reduced to the characters underlined in c5 and prefixed with the identification of the corresponding service.

\[
\begin{array}{rll}
\Sigma_{c_5} = \{ & a{:}\mathit{request}!\,I, \mathit{Enc}(K^{\wedge}, P), R, N \ \blacklozenge\ b{:}\mathit{login}?\,I^{\wedge}, P^{\wedge}, & (v_l) \\
 & \blacklozenge\ b{:}\mathit{proceed}!, & (v_p) \\
 & \blacklozenge\ b{:}\mathit{request}?\,R^{\wedge}, & (v_q) \\
 & a{:}\mathit{request}!\,I^{\wedge}, \mathit{Enc}(K^{\wedge}, P^{\wedge}), R, N \ \blacklozenge\ b{:}\mathit{request}?\,R^{\wedge}, & (v_r) \\
 & a{:}\mathit{reply}?\,D^{\wedge}, \mathit{Hash}(D^{\wedge}), N^{\wedge} \ \blacklozenge\ b{:}\mathit{result}!\,D, & (v_d) \\
 & a{:}\mathit{refused}?\,N^{\wedge} \ \blacklozenge\ b{:}\mathit{denied}! \quad \} & (v_f)
\end{array}
\]
(a) Set of security vectors for the contract

(b) T_c for the contract

\[
c_5 = \langle \Sigma_{c_5}, S, s_0, S, T_c, E_5 \rangle
\quad\text{where}\quad
S = \{s_0, s_1\};\quad
E_5 = \langle \theta, \kappa \rangle,\quad
\kappa = \{x/K\},\quad
\theta = \{\mathit{key}/K, \mathit{pass}/P, \mathit{req}/R, \mathit{nonce}/N, \mathit{name}/I, \mathit{data}/D\}
\]
(c) Adaptation contract c5

Figure 7.2: Adaptation contract for services a and b

Figure 7.3: Adaptation protocol for contract c5 and services a and b

Security adaptation contracts concisely represent the mapping of the operations, arguments and security requirements among services. SACs abstract away from concurrency issues, leaving them to the synthesis phase. This is especially important because adaptation protocols grow exponentially with the complexity, incompatibilities and interleaving among services. In fact, even for finite service behaviours, the adaptor is potentially infinite in states and transitions, whereas SACs are not. For instance, Fig. 7.3 has several dashed transitions which correspond to additional requests received before processing the previous one. These transitions represent a potentially infinite stack of pending requests, in contrast to the small size of the SAC (Fig. 7.2) and services (a and b in Fig. 7.1) which resulted in this adaptor. Additionally, SACs support high-level restrictions expressed in their FSMs. The level of abstraction of the SACs makes them versatile enough to cope with small changes in the behaviour of the services. We also note that the same SAC with different services can generate different adaptors. For instance, any combination between services {a, a′} × {b, b′} is supported by the same SAC c5 but not by the same adaptor. For example, the adaptor in Fig. 7.3 does not work for {a, a′} × {b′}.
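The role of the contract FSM can be illustrated with a small Python sketch. It is not the synthesis algorithm of this chapter: it only checks whether a given order of vector applications is allowed by a contract. The class Contract, its method accepts and, in particular, the transition table used for the restricted contract are assumptions made for illustration, since the actual T_c is only given graphically in Fig. 7.2(b).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """Contract FSM: transitions map (state, vector name) to the next state."""
    initial: str
    final: frozenset
    transitions: dict

    def accepts(self, vectors):
        """Check whether applying the vectors in this order is allowed."""
        state = self.initial
        for name in vectors:
            if (state, name) not in self.transitions:
                return False
            state = self.transitions[(state, name)]
        return state in self.final


VECTORS = ["vl", "vp", "vq", "vr", "vd", "vf"]

# No ordering restrictions: one initial and final state, every vector loops on it.
unrestricted = Contract("s0", frozenset({"s0"}), {("s0", v): "s0" for v in VECTORS})

# Assumed shape of T_c (hypothetical): vq fires once and moves to s1,
# where vr and vd handle the subsequent requests and replies.
restricted = Contract(
    "s0",
    frozenset({"s0", "s1"}),
    {
        ("s0", "vl"): "s0", ("s0", "vp"): "s0", ("s0", "vf"): "s0",
        ("s0", "vq"): "s1",
        ("s1", "vp"): "s1", ("s1", "vr"): "s1",
        ("s1", "vd"): "s1", ("s1", "vf"): "s1",
    },
)
```

With these definitions, the unrestricted contract allows v_q to be applied twice, whereas the restricted one rejects the second application, which is the kind of constraint that the text attributes to T_c.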

7.1.1 Methodology for security adaptation

Our methodology for behavioural WS adaptation involves the following steps:

1. First, we abstract the behaviour of the services to our formal model. This model allows us to represent the sequence of messages sent and expected by a service and the order in which they must occur. This is done automatically from the public description of the behaviour of the services written in abstract BPEL or WF. Security can be extracted from WS-Security messages or WS-Policy specifications.

2. Then, taking the behaviour into account, we must design, either assisted with a CASE tool [CSCO09] or automatically (see Chapter 6), an adaptation contract able to solve all of the incompatibilities among the services. At this stage, we can statically validate the contract (i.e., check that the contract is well defined) and, although neither data nor the concrete adaptor is available yet to test the adaptation, we can still perform symbolic simulation over the contract by hand-picking which adaptation vectors to apply at every moment. In addition, using this simulation, we can do symbolic model-checking based on the contract and the models of the services.

3. Using the contract and the behaviour of the services, the protocol of the adaptor is synthesised while taking into account all possible interleavings among messages. The resulting adaptor conforms to the given contract, it is encoded into a particular implementation language (currently BPEL), and it can finally be deployed as a single orchestrator [MPS08] or distributed [Sal08] in a set of wrappers over the services. This point will be discussed in detail in Chapter 8.

The goal of this methodology is to generate correct adaptors which are: i) secure, i.e., they enforce the security policies expressed in the SAC and security-enabled services; ii) non-intrusive, since adaptors do not alter the internal behaviour of the services and, in fact, services can be oblivious to adaptation; and iii) transparent, because adaptors should support every possible interaction which complies with the security policies of the contract and services and does not cause deadlock or livelock situations.
It is worth observing that the whole methodology is supported by our toolbox called ITACA, which is enhanced with hierarchical adaptation, simulation and verification capabilities over the entire composition.

7.2 Web services with security QoS

7.2.1 WS-Security

Due to the system-independent nature of WSs, several specifications have emerged to express security requirements and properties over WSs and their orchestration. The specifications which are relevant for this document are briefly described below:

WS-Security describes enhancements to SOAP messaging to provide quality of protection through message integrity, message confidentiality, and single message authentication. These mechanisms can be used to accommodate a wide variety of security models and encryption technologies. Security tokens are one of the main concepts in WS-Security. Security tokens are sets of claims (such as user names, keys or certificates) used for encryption, signature or authentication. WS-Security enables SOAP messages to describe the security transformations that the intended recipient requires to decrypt, authenticate and verify the different parts of the message, therefore allowing end-to-end security. Listing 7.1 presents an extract of a SOAP message with WS-Security elements.

WS-Trust defines how to issue, renew and validate security tokens that will be used to establish, assess and broker trust relationships among entities. It defines an entity called Security Token Service that covers all the operations required for handling security tokens. The Security Token Service provides a trusted third party which asserts trust relationships among entities. Actual implementations are encouraged to make use of existing solutions that are compatible with this specification, such as Kerberos or X.509 public-key certificates.

WS-SecureConversation is yet another extension of WS-Security which defines Security Context Tokens. These tokens, which are handled by the mechanisms described in WS-Trust, are used as shared secrets for establishing secure sessions (i.e., secure conversations). Additionally, new keys can be derived by both parties from the shared Security Context Token. With WS-SecureConversation, the security constraints described with WS-Security go beyond individual messages and it is possible to relate several messages under the same conversation.
WS-Policy is an XML-based language to express the different security policies that are allowed and offered by a given entity. It is a flexible way to describe what kind of WS-Security structures are supported by the WS in the exchanged SOAP messages. For instance, it serves to specify all the different encryption algorithms that are supported by the service provider over the different parts of the SOAP message.

Example 7.2.1 Listing 7.1 presents a SOAP message composed of: a timestamp (line 3); an X.509 certificate (lines 4-6); an encrypted key (#enc1 at lines 7-11); the signature of the timestamp and the body of the message (lines 12-26), which is signed with the key given in the previous certificate (referenced in lines 21-25); and the body encrypted with the #enc1 key (lines 29-31).

Listing 7.1: WS-Security enabled SOAP message

All these WS-* specifications must be known and considered by the architect in order to design a proper orchestration that is able to comply with the security requirements of all the services involved. Moreover, the mechanisms presented in WS-* security specifications do not, by themselves, make the communication secure. For instance, a SOAP message with WS-Security elements can specify that a certain part is encrypted while the encryption key is also sent as clear text. WS-* security specifications do enable security concerns but they can be misused, so careful design, analysis and verification are required to ensure certain security properties over the whole orchestration. This issue will be dealt with in Chapter 8.

Fig. 7.4 shows several ways of deploying WSs which use WS-Security SOAP messages. The most common way (WS 1) involves the definition of the service without security: only its signature and behaviour are described. If the service provider offering that WS has a WS-Policy defined over the SOAP messages exchanged by that service, the provider must have WS-Security descriptions to comply with that WS-Policy. These WS-Security descriptions of the SOAP message are part of the configuration of security interceptors that capture all SOAP messages and recompose them according to those WS-Security descriptions. These security interceptors require security token services (either local to the service provider or as a third-party service) to provide WS-Trust mechanisms. The first services with WS-Security SOAP messages were deployed in engines without security interceptors; therefore, WS-Security was hardwired within the service logic (WS 2). Finally, there are services without any security (WS 3) that might want to cooperate with security-enabled services.

Figure 7.4: Different deployments of service systems with WS-Security

In this scenario, security adaptors could be used 1) as security interceptors, 2) to adapt Web services with incompatible WS-Security descriptions, and 3) to support services without security capabilities. Security adaptation can be achieved using a security adaptor either in a transparent way or as a visible third party in between the communication, although the former implies that the adaptor must know some sensitive information to perform its task. On the one hand, if the adaptor is transparent to the WSs, it must know some of their private security tokens in order to recompose messages reusing those same tokens. On the other hand, if the adaptor is perceived as a third party, it needs its own security tokens, and trust must be handled (through WS-Trust, for instance) between the adaptor and the WSs. Either way, the adaptor must be enabled with security token generation capabilities, as it might need to generate session keys, timestamps and signed data to be able to adapt the conversation among the services. The adaptor works as a proxy or central orchestrator that acts as a gateway among different WS-Policies. The security adaptors presented in this work support both transparent and opaque approaches.

7.2.2 Secure service behaviour

Secure service behaviour is still modelled using Definition 2.1.2. However, secure services use actions whose arguments might be structured with cryptographic primitives. Therefore, we concretise Σ to be a subset of actions with cryptographic structures, Σi. We repeat below the original definition of service behaviour with this new concrete set of labels.

Definition 7.2.1 (Secure service interface) A secure service interface is a state machine ⟨Σi, S, s0, F, ↦⟩ where Σi contains the interface actions of the service, S is the set of states, s0 ∈ S is the initial state, F ⊆ S are final states without outgoing transitions and ↦ ⊆ (S × Σi × S) is the labelled transition relation. Interface actions (∈ Σi) are denoted by their channel (∈ Chan), an exclamation mark or a question mark for output and input actions, respectively, followed by a (possibly structured) type of the message (Σi = (Chan × {!, ?} × Types) ∪ {τ}). Internal actions (τ) are used to encode internal decisions such as if activities in BPEL. Consider a set of basic types (BT ∈ BT) denoted by capitalised words and a set of cryptographic constructors (F ∈ F). Then, the set of Types is defined by the following grammar

\[
T\ (\in \mathit{Types}) ::= BT \mid F(T_1, \ldots, T_{ar(F)})
\]

where ar(F) is the number of arguments of constructor F. Typed messages are defined analogously

\[
m\ (\in \mathit{Msgs}) ::= x \mid BM \mid F(m_1, \ldots, m_{ar(F)})
\]

where x belongs to a countable set V of message variables, BM ∈ BM are messages of basic type and m1, ..., m_ar(F) are messages of type T1, ..., T_ar(F), respectively. Typed actions (∈ Act) are elements of (Chan × {!, ?} × Msgs) ∪ {τ}. Note that we use the set of constructors F both for types and messages. Table 7.2 defines a system IS able to infer typed messages based on constructors Hash, Enc, AEnc and concatenation. Symbols m, m_i, k represent messages and T, T_i, K correspond to their respective types. The inference system IS models hash operations (1), symmetric encryption and decryption (5 and 6), public-key encryption and decryption (2, 7 and 8) and concatenation of messages (3 and 4_i).

Table 7.2: Inference system (IS) with cryptographic primitives

\[
\begin{array}{ll}
\dfrac{m : T}{\mathit{Hash}(m) : \mathit{Hash}(T)}\ (1)
&
\dfrac{k : K \quad m : T}{\mathit{AEnc}(k, m) : \mathit{AEnc}(K, T)}\ (2)
\\[3ex]
\dfrac{m_i : T_i,\ i \in [1, n]}{(m_1, \ldots, m_n) : (T_1, \ldots, T_n)}\ (3)
&
\dfrac{(m_1, \ldots, m_n) : (T_1, \ldots, T_n)}{m_i : T_i}\ i \in [1, n]\ (4_i)
\\[3ex]
\dfrac{k : K \quad m : T}{\mathit{Enc}(k, m) : \mathit{Enc}(K, T)}\ (5)
&
\dfrac{k : K \quad \mathit{Enc}(k, m) : \mathit{Enc}(K, T)}{m : T}\ (6)
\\[3ex]
\dfrac{\mathit{Pk}(k) : \mathit{Pk}(K) \quad \mathit{AEnc}(k, m) : \mathit{AEnc}(K, T)}{m : T}\ (7)
&
\dfrac{k : K \quad \mathit{AEnc}(\mathit{Pk}(k), m) : \mathit{AEnc}(\mathit{Pk}(K), T)}{m : T}\ (8)
\end{array}
\]
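The inference rules of Table 7.2 can be read as constructors and destructors over symbolic messages. The following Python sketch is one possible encoding, given only as an illustration: messages are nested tuples tagged with the constructor name, atomic messages are strings, and types are left out for brevity. All function names (hash_, aenc, tup, proj, enc, dec, pk, check_sig, adec) are hypothetical and are not part of Crypto-CCS or its tools.

```python
def _is(msg, tag, size):
    """True if msg is an application of constructor `tag` with `size` fields."""
    return isinstance(msg, tuple) and len(msg) == size and msg[0] == tag


def hash_(m):            # rule (1): from m infer Hash(m)
    return ("Hash", m)


def aenc(k, m):          # rule (2): from k and m infer AEnc(k, m)
    return ("AEnc", k, m)


def tup(*ms):            # rule (3): concatenation of messages
    return ("Tuple",) + ms


def proj(t, i):          # rule (4_i): i-th component (1-based) of a concatenation
    if isinstance(t, tuple) and t[:1] == ("Tuple",) and 1 <= i < len(t):
        return t[i]
    return None


def enc(k, m):           # rule (5): symmetric encryption
    return ("Enc", k, m)


def dec(k, c):           # rule (6): from k and Enc(k, m) infer m
    return c[2] if _is(c, "Enc", 3) and c[1] == k else None


def pk(k):               # public key associated with the private key k
    return ("Pk", k)


def check_sig(pub, c):   # rule (7): from Pk(k) and AEnc(k, m) infer m
    return c[2] if _is(c, "AEnc", 3) and pub == ("Pk", c[1]) else None


def adec(k, c):          # rule (8): from k and AEnc(Pk(k), m) infer m
    return c[2] if _is(c, "AEnc", 3) and c[1] == ("Pk", k) else None
```

A destructor returns None when its rule is not applicable, for instance when the key does not match; this mirrors an inference that cannot be made from the given premises.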

Example 7.2.2 The WS-Security message in Listing 7.1 complies with the following security specification.

\[
\begin{array}{rll}
T_0 = & \mathit{Tstamp}, & \\
 & \mathit{Info}, \mathit{Pk}(\mathit{Akey}), \mathit{AEnc}(\mathit{Akey}, \mathit{Hash}(\mathit{Info}, \mathit{Pk}(\mathit{Akey}))), & (7.3) \\
 & \mathit{Enc}(\mathit{Key}, \mathit{Key}), & \\
 & \mathit{Hash}(\mathit{Tstamp}), \mathit{Hash}(\mathit{Body}), \mathit{AEnc}(\mathit{Akey}, \mathit{Hash}(\mathit{Hash}(\mathit{Tstamp}), \mathit{Hash}(\mathit{Body}))), & (7.4) \\
 & \mathit{Enc}(\mathit{Key}, \mathit{Body}) &
\end{array}
\]

In this example, no operation name is present as it was omitted in the WS-Security code. The different elements of the list have been indented to show their corresponding parts in Listing 7.1. It is worth noticing that the X.509 certificate (7.3) has been expanded to be able to reference its components: information about the certificate (Info), the certified public key (Pk(Akey)) and the signature of the previous elements by a certification authority (AEnc(Akey, Hash(Info, Pk(Akey)))). This signature consists of the encryption of the hash value of the signed elements (the information and the public key) with the private key of the certification authority. A similar structure is found in the signature of the hash values of the body and the timestamp (7.4). This notation abstracts away implementation details (e.g., the specific algorithms applied) but it manages to represent the different parts of security tokens such as certificates and signatures.

Example 7.2.3 Our running example for this chapter is based on the interfaces of three services depicted in Fig. 7.5. These services present incompatible behaviour and security. Successful termination states are filled. Service 1 (Fig. 7.5(a)) supports two different authentication schemas: one based on hashes and a shared secret and another based on a modification of the Needham-Schroeder public-key protocol.

On the one hand, the hash authentication starts with a hash output action with the following arguments: the identification of the originating service (typed Id), the request (Req), a nonce (which stands for number used once, typed Nonce) and the message authentication code (MAC) value of all the previous arguments with regard to a previously shared Secret. From that point on, service 1 expects all the subsequent messages to be encrypted with the concatenation of the previous nonce and secret. It can receive either a denied action (with an encrypted new nonce) or a reply with the encrypted data.
On the other hand, the public-key authentication part starts with a pk_auth output action with the Id of service 1, the request and a nonce, all of them encrypted using the public key of service 2 (typed Pk(Key)). This branch can also be rejected by the destination service by transmitting a denied action, symmetrically encrypted with the previous nonce. Otherwise, it expects a reply with the previous nonce and a new nonce, both encrypted with the public key of service 1, followed by the requested data signed by service 3 and encrypted with the concatenation of both nonces. Finally, it sends an acknowledgement (via ack) with the received nonce encrypted with the public key of service 2.

Service 2 is represented in Fig. 7.5(b). This service is able to provide the data corresponding to the received request. However, it only accepts the public-key protocol; therefore, every request must be received encrypted with its public key, and then it replies in clear text. Service 2 has an internal choice, represented with τ-labelled transitions, which allows it to decide whether to reply or deny the access (via no_access).

The last service (service 3 in Fig. 7.5(c)) simply encrypts all received data with its private key. Alternatively, it ends its behaviour if it receives action exit.

[Figure 7.5, transition labels per panel.
(a) Service 1: pk_auth!AEnc(Pk(Key),Id,Req,Nonce); denied?Enc(Nonce,Nonce); reply?AEnc(Pk(Key),Nonce,Nonce),Enc((Nonce,Nonce),AEnc(Key,Data)); ack!AEnc(Pk(Key),Nonce); hash!Id,Req,Nonce,Hash(Id,Req,Nonce,Secret); reply?Enc((Nonce,Secret),Data); denied?Enc((Nonce,Secret),Nonce).
(b) Service 2: request?AEnc(Pk(Key),Id,Req,Nonce); τ; τ; reply!Nonce,Data; no_access!Nonce.
(c) Service 3: sign?Data; signed!AEnc(Key,Data); exit?.]

Figure 7.5: Interfaces of three services

We use this model to represent secure service interfaces, i.e., the type of their secured operations. However, we will use Crypto-CCS to describe, simulate and verify the actual execution of the services, including specific data messages.
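As an illustration of Definition 7.2.1, the interface of service 2 can be written down as an explicit state machine. The Python sketch below is only one possible encoding: the class Interface, its methods and the state names s0 to s4 are hypothetical (the states of Fig. 7.5(b) are unnamed), and the assumption that both branches end in the same final state is ours.

```python
from dataclasses import dataclass

TAU = "tau"  # internal action


@dataclass(frozen=True)
class Interface:
    """Secure service interface: states, initial state, final states, transitions."""
    states: frozenset
    initial: str
    final: frozenset
    transitions: frozenset  # triples (source, label, target)

    def enabled(self, state):
        """Labels of the transitions leaving the given state."""
        return sorted(l for (s, l, _) in self.transitions if s == state)

    def well_formed(self):
        """Check the side conditions of the definition."""
        return (
            self.initial in self.states
            and self.final <= self.states
            and all(s in self.states and t in self.states
                    for (s, _, t) in self.transitions)
            # final states have no outgoing transitions
            and all(s not in self.final for (s, _, _) in self.transitions)
        )


service2_interface = Interface(
    states=frozenset({"s0", "s1", "s2", "s3", "s4"}),
    initial="s0",
    final=frozenset({"s4"}),
    transitions=frozenset({
        ("s0", "request?AEnc(Pk(Key),Id,Req,Nonce)", "s1"),
        ("s1", TAU, "s2"),                 # internal choice: grant access
        ("s1", TAU, "s3"),                 # internal choice: deny access
        ("s2", "reply!Nonce,Data", "s4"),
        ("s3", "no_access!Nonce", "s4"),
    }),
)
```

Labels are kept as plain strings here because the interface only records the types of the messages, not their values.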

7.2.3 Crypto-CCS

Crypto-CCS [Mar03] is a tool-supported [MPV02] process calculus inspired by CCS but extended with guards and parameterised with an inference system to perform cryptographic operations. Crypto-CCS processes are described by the following grammar:

\[
\begin{array}{rcl}
S & ::= & S \backslash L \mid S \backslash\backslash L \mid S_1 \parallel S_2 \mid A_{\phi} \\
A & ::= & 0 \mid pc.A \mid A_1 + A_2 \mid [m = m']A_1; A_2 \mid [\langle\langle m_i \rangle\rangle_{i \in I} \vdash_{IS} x : T]A_1; A_2 \\
pc & ::= & \tau \mid c!m \mid c?x : T \mid \chi^{x}_{c,T}
\end{array}
\]

where m, m′, m1, ..., mn are messages or variables, x is a message variable, T is a (possibly structured) type, c ∈ Chan is a channel, φ is a finite set of typed messages, L is a subset of Chan and i ∈ I ⊆ N (the set of natural numbers). We briefly give the informal semantics of the terms of the calculus.

• 0 is the process that does nothing.

• pc.A is the process that can perform an action according to the particular prefix construct pc and then behaves as A:
  – c!m allows the message m to be sent on channel c.
  – c?x : T allows messages m : T to be received on channel c. The message received substitutes variable x.
  – χ^x_{c,T} is used to eavesdrop a communication on channel c which occurs in another service of the system. A message is eavesdropped when it is received by the attacker but not consumed. The eavesdropped message substitutes variable x.

• A1 + A2 is the process that non-deterministically decides to behave as A1 or A2. It is worth noting that this operator can be used to model both internal choices (e.g., τ.A1 + τ.A2) and external choices (e.g., c1?x : T1.A1 + c2?x : T2.A2 with c1 ≠ c2).

• [m = m′]A1; A2 is the matching construct. If the two messages are equal to each other, then the process behaves as A1, otherwise as A2.

• [⟨⟨m_i⟩⟩_{i∈I} ⊢_IS x : T]A1; A2 is the inference construct. If, applying a case of inference schema IS with the premises ⟨⟨m_i : T_i⟩⟩_{i∈I}, a message m : T can be inferred, then the process behaves as A1 (where x is replaced with m); otherwise the process behaves as A2. This is the message-manipulating construct of the calculus: we can build a new message by using the messages in ⟨⟨m_i⟩⟩_{i∈I} and the inference rule IS.

• The system S\L is prevented from performing actions whose channel belongs to the set L, except for internal actions.

• The system S\\L can perform actions whose channel is not in L; in addition, synchronisations whose channels are in L are renamed into τ.

• A compound system S∥S′ performs an action a if either of its sub-components performs a, and a synchronisation action (τ_{c,m}) if the sub-components perform complementary actions, i.e., send-receive actions. It is worth noticing that, unlike CCS, our synchronisation actions carry information about the message exchanged and the channel used. In this way, we can model eavesdropping. Indeed, the agents of one component, e.g. S, might know the message exchanged during the synchronisation of the other component, i.e. S′, by simultaneously performing an eavesdropping action χ.

Guarded actions (i.e., [m = m′]A1; A2 and [⟨⟨m_i⟩⟩_{i∈I} ⊢_IS x : T]A1; A2) have a second process A2 which is executed when the guard does not hold. From now on, we assume that guards which do not hold are security failures in which the process must perform exceptional actions.
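These actions could trigger alarms, perform counter-attacks or tighten the security, among others. In this work we will simply halt the process on security failures; therefore, A2 is always the empty process (0). Let us note that these security failures are not violations of the secrecy property: they only mean that some messages were not as expected (e.g., a message which has been tampered with or a key which is not correct). For the sake of clarity, we will omit message types in Crypto-CCS processes when they can be easily inferred from the context. The operational semantics of Crypto-CCS is described in Table 7.3. Function Tmsgs(T) represents the set of all possible messages of type T. The auxiliary function channel returns the channel of the given action:

\[
\mathit{channel}(\tau_{c,m}) = \mathit{channel}(\tau) = \bot
\qquad
\mathit{channel}(c?x : T) = \mathit{channel}(c!m) = \mathit{channel}(\chi^{x}_{c,T}) = c
\]

Table 7.3: Operational semantics of Crypto-CCS, where the symmetric rules for ∥1, ∥2 and ∥χ are omitted.

\[
\begin{array}{ll}
(!)\ \dfrac{}{(c!m.A)_{\phi} \xrightarrow{c!m} (A)_{\phi}}
&
(?)\ \dfrac{m : T \in \mathit{Tmsgs}(T)}{(c?x : T.A)_{\phi} \xrightarrow{c?m} (A[m/x])_{\phi \cup \{m : T\}}}
\\[3ex]
(\tau)\ \dfrac{}{(\tau.A)_{\phi} \xrightarrow{\tau} (A)_{\phi}}
&
(\chi)\ \dfrac{m : T \in \mathit{Tmsgs}(T)}{(\chi^{x}_{c,T}.A)_{\phi} \xrightarrow{\chi_{c,m}} (A[m/x])_{\phi \cup \{m : T\}}}
\\[3ex]
([\,]_1)\ \dfrac{m = m' \quad (A_1)_{\phi} \xrightarrow{\alpha} (A_1')_{\phi'}}{([m = m']A_1; A_2)_{\phi} \xrightarrow{\alpha} (A_1')_{\phi'}}
&
([\,]_2)\ \dfrac{m \neq m' \quad (A_2)_{\phi} \xrightarrow{\alpha} (A_2')_{\phi'}}{([m = m']A_1; A_2)_{\phi} \xrightarrow{\alpha} (A_2')_{\phi'}}
\\[3ex]
(+_1)\ \dfrac{(A_1)_{\phi} \xrightarrow{\alpha} (A_1')_{\phi'}}{(A_1 + A_2)_{\phi} \xrightarrow{\alpha} (A_1')_{\phi'}}
&
(+_2)\ \dfrac{(A_2)_{\phi} \xrightarrow{\alpha} (A_2')_{\phi'}}{(A_1 + A_2)_{\phi} \xrightarrow{\alpha} (A_2')_{\phi'}}
\\[3ex]
(\parallel_1)\ \dfrac{S \xrightarrow{\alpha} S'}{S \parallel S_1 \xrightarrow{\alpha} S' \parallel S_1}
&
(\parallel_2)\ \dfrac{S \xrightarrow{c!m} S' \quad S_1 \xrightarrow{c?m} S_1'}{S \parallel S_1 \xrightarrow{\tau_{c,m}} S' \parallel S_1'}
\\[3ex]
(\parallel_{\chi})\ \dfrac{S \xrightarrow{\tau_{c,m}} S' \quad S_1 \xrightarrow{\chi_{c,m}} S_1'}{S \parallel S_1 \xrightarrow{\tau_{c,m}} S' \parallel S_1'}
&
(\backslash L)\ \dfrac{S \xrightarrow{\alpha} S' \quad \mathit{channel}(\alpha) \notin L}{S \backslash L \xrightarrow{\alpha} S' \backslash L}
\\[3ex]
(\backslash\backslash L_1)\ \dfrac{S \xrightarrow{\tau_{c,m}} S' \quad c \notin L}{S \backslash\backslash L \xrightarrow{\tau_{c,m}} S' \backslash\backslash L}
&
(\backslash\backslash L_2)\ \dfrac{S \xrightarrow{\alpha} S' \quad \alpha \neq \tau_{c,m} \quad \mathit{channel}(\alpha) \notin L}{S \backslash\backslash L \xrightarrow{\alpha} S' \backslash\backslash L}
\\[3ex]
(\backslash\backslash L_3)\ \dfrac{S \xrightarrow{\tau_{c,m}} S' \quad c \in L}{S \backslash\backslash L \xrightarrow{\tau} S' \backslash\backslash L}
&
\\[3ex]
\multicolumn{2}{l}{(D_1)\ \dfrac{\langle\langle m_i : T_i \rangle\rangle_{i \in I} \vdash_{IS} m : T \quad (A_1[m/x])_{\phi \cup \{m : T\}} \xrightarrow{\alpha} (A_1')_{\phi'}}{([\langle\langle m_i \rangle\rangle_{i \in I} \vdash_{IS} x : T]A_1; A_2)_{\phi} \xrightarrow{\alpha} (A_1')_{\phi'}}}
\\[3ex]
\multicolumn{2}{l}{(D_2)\ \dfrac{\nexists (m : T).\ \langle\langle m_i : T_i \rangle\rangle_{i \in I} \vdash_{IS} m : T \quad (A_2)_{\phi} \xrightarrow{\alpha} (A_2')_{\phi'}}{([\langle\langle m_i \rangle\rangle_{i \in I} \vdash_{IS} x : T]A_1; A_2)_{\phi} \xrightarrow{\alpha} (A_2')_{\phi'}}}
\end{array}
\]

The complete description of Crypto-CCS [Mar03] supports actions to generate random values. This random value generation is used for instantiated parameters in adaptation contracts. However, for the sake of simplicity, we will assume without loss of generality that every random value needed is initially known by the agent. In this way, we can safely replace instantiated parameters by known parameters in our adaptation contracts.

Example 7.2.4 Fig. 7.6 shows possible Crypto-CCS processes for services 1, 2 and 3.

The eavesdropping action χ^x_{c,T} is not meant to be used by well-behaved services but only by the attacker. The service interface of well-behaved Crypto-CCS processes is obtained through the following definition.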

Definition 7.2.2 The interface of a Crypto-CCS process P which does not eavesdrop (i.e., it does not use χ actions) is given by ip(P) ≜ (Σi, S, s0, F, ↦) where the states S are the possible configurations of P, s0 = P, F = {0} and the alphabet Σi is obtained from the transition system ↦, which is inferred as follows.

\[
\dfrac{P \xrightarrow{c!m} P' \quad m : T}{P \overset{c!T}{\longmapsto} P'}
\qquad
\dfrac{P \xrightarrow{c?m} P' \quad m : T}{P \overset{c?T}{\longmapsto} P'}
\qquad
\dfrac{}{\tau.P \overset{\tau}{\longmapsto} P}
\qquad
\dfrac{P \xrightarrow{\tau_{c,m}} P'}{P \overset{\tau}{\longmapsto} P'}
\]

In previous chapters, we used the composition operator among interfaces (⊗) as opposed to the parallel composition between Crypto-CCS processes. However, for deadlock analysis purposes, we can extrapolate the results from interfaces to Crypto-CCS thanks to the following lemma.

Lemma 7.2.1 If two Crypto-CCS processes which do not eavesdrop synchronise, then their corresponding interfaces also synchronise. More formally,

\[
ip(P) \otimes ip(Q) \xrightarrow{\tau} ip(P') \otimes ip(Q')
\quad\text{if}\quad
P \parallel Q \xrightarrow{\alpha} P' \parallel Q',\ \alpha \in \{\tau_{c,m}, \tau\}
\]

We will use this lemma extensively in Section 8.4, where we present deadlock analysis and the synthesis of functionally-correct adaptors based on service interfaces.
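The step from typed actions to interface actions in Definition 7.2.2 replaces every message by its type. The Python sketch below illustrates this abstraction under simplifying assumptions: messages are nested tuples tagged with their constructor, atomic messages are strings, and the basic type of each atomic message is given by a dictionary. The names type_of and abstract are hypothetical helpers introduced only for this example.

```python
def type_of(msg, basic_types):
    """Return the type of msg, given the basic type of every atomic message."""
    if isinstance(msg, tuple):
        # Constructor application: keep the constructor, abstract the arguments.
        return (msg[0],) + tuple(type_of(m, basic_types) for m in msg[1:])
    return basic_types[msg]


def abstract(action, basic_types):
    """Map a typed action (channel, direction, message) to an interface action."""
    channel, direction, msg = action
    return (channel, direction, type_of(msg, basic_types))


# The pk_auth message of service 1, with concrete values from its knowledge.
basic = {"kb": "Key", "i1": "Id", "r1": "Req", "n1": "Nonce"}
message = ("AEnc", ("Pk", "kb"), ("Tuple", "i1", "r1", "n1"))
label = abstract(("pk_auth", "!", message), basic)
```

Here label corresponds to the interface action pk_auth!AEnc(Pk(Key),Id,Req,Nonce) of Fig. 7.5(a), written as a nested tuple.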


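\[
\begin{array}{rcl}
S_1 & = & P + H \\
P & = & [\langle\langle i_1, r_1, n_1 \rangle\rangle \vdash_{3} c : (\mathit{Id}, \mathit{Req}, \mathit{Nonce})] \\
 & & [\langle\langle \mathit{Pk}(kb), c \rangle\rangle \vdash_{2} p : \mathit{AEnc}(\mathit{Pk}(\mathit{Key}), \mathit{Id}, \mathit{Req}, \mathit{Nonce})]\ \mathit{pk\_auth}!p.(R + D) \\
D & = & \mathit{denied}?x.0 \\
R & = & \mathit{reply}?x.[\langle\langle x \rangle\rangle \vdash_{4_1} y : \mathit{AEnc}(\mathit{Pk}(\mathit{Key}), \mathit{Nonce}, \mathit{Nonce})] \\
 & & [\langle\langle ka, y \rangle\rangle \vdash_{8} nn : (\mathit{Nonce}, \mathit{Nonce})]\ [\langle\langle nn \rangle\rangle \vdash_{4_1} n' : \mathit{Nonce}]\ [n_1 = n'] \\
 & & [\langle\langle nn \rangle\rangle \vdash_{4_2} n_2 : \mathit{Nonce}]\ [\langle\langle \mathit{Pk}(kb), n_2 \rangle\rangle \vdash_{2} a : \mathit{AEnc}(\mathit{Pk}(\mathit{Key}), \mathit{Nonce})]\ \mathit{ack}!a.0 \\
H & = & \ldots \\
\phi_1 & = & \{i_1, r_1, n_1, \mathit{Pk}(kb), ka, s_{1,2}\}
\end{array}
\]
(a) Crypto-CCS process for service 1

\[
\begin{array}{rcl}
S_2 & = & \mathit{request}?x : \mathit{AEnc}(\mathit{Pk}(\mathit{Key}), \mathit{Id}, \mathit{Req}, \mathit{Nonce}). \\
 & & [\langle\langle kb, x \rangle\rangle \vdash_{8} c : (\mathit{Id}, \mathit{Req}, \mathit{Nonce})]\ [\langle\langle c \rangle\rangle \vdash_{4_1} i' : \mathit{Id}]\ [i' = i_1] \\
 & & [\langle\langle c \rangle\rangle \vdash_{4_3} n : \mathit{Nonce}]\ (\tau.R + \tau.N) \\
N & = & \mathit{no\_access}!n.0 \\
R & = & [\langle\langle n, d_1 \rangle\rangle \vdash_{3} y : (\mathit{Nonce}, \mathit{Data})]\ \mathit{reply}!y.0 \\
\phi_2 & = & \{i_1, d_1, kb\}
\end{array}
\]
(b) Crypto-CCS process for service 2

\[
\begin{array}{rcl}
S_3 & = & \mathit{sign}?d : \mathit{Data}.[\langle\langle kc, d \rangle\rangle \vdash_{2} s : \mathit{AEnc}(\mathit{Key}, \mathit{Data})]\ \mathit{signed}!s.0 + \mathit{exit}?.0 \\
\phi_3 & = & \{kc\}
\end{array}
\]
(c) Crypto-CCS process for service 3

Figure 7.6: Crypto-CCS processes for the services of the running example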
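To make the reading of Fig. 7.6(b) more concrete, the following Python sketch walks through one run of service 2. It is an informal illustration and not a Crypto-CCS interpreter: messages are nested tuples, the functions rule4 and rule8 are hypothetical encodings of rules (4_i) and (8) of Table 7.2, the internal choice is modelled by the boolean argument grant, and a failed guard returns None, which stands for halting with the empty process 0.

```python
def rule8(k, c):
    """Rule (8): from k and AEnc(Pk(k), m) infer m; None if not applicable."""
    if isinstance(c, tuple) and len(c) == 3 and c[0] == "AEnc" and c[1] == ("Pk", k):
        return c[2]
    return None


def rule4(t, i):
    """Rule (4_i): i-th component (1-based) of a concatenation; None otherwise."""
    if isinstance(t, tuple) and t[:1] == ("Tuple",) and 1 <= i < len(t):
        return t[i]
    return None


def service2(x, i1, d1, kb, grant):
    """One run of S2 on the received request x, with knowledge {i1, d1, kb}."""
    c = rule8(kb, x)                 # decrypt the request with the private key kb
    if c is None:
        return None                  # inference guard fails: halt
    if rule4(c, 1) != i1:
        return None                  # matching guard [i' = i1] fails: halt
    n = rule4(c, 3)                  # extract the nonce
    if n is None:
        return None
    if grant:                        # internal choice (tau.R + tau.N)
        return ("reply", ("Tuple", n, d1))   # rule (3), then reply!y
    return ("no_access", n)


request = ("AEnc", ("Pk", "kb"), ("Tuple", "i1", "r1", "n1"))
```

For this request, service2(request, "i1", "d1", "kb", True) yields a reply carrying the nonce and the data, whereas a request encrypted for a different key or carrying a different identifier makes the run halt.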



7.3 Security adaptation contracts

Security adaptation contracts allow us to: i) express the security checks that must be performed by the adaptor; ii) describe how to decompose and recompose the messages that must be transformed by the adaptor while preserving the security restrictions of the services; iii) define security constraints among sequences of messages, such as secure sessions or security protocols; iv) perform analysis over the security of the resulting orchestration against several attacks; and v) retain the ability to adapt signature and behavioural incompatibilities.

Behavioural adaptation involves receiving the messages and sending them at the moment and with the structure expected by the intended recipient. This recomposition of messages is achieved by symbolic parameters, which specify how the data is received and how the data must be restructured before being sent. However, in security-enabled WSs, reception and emission of the WS-Security messages is more complex, as these messages might need integrity, confidentiality and authentication over parts of the message or the whole message. Therefore, the adaptor must be capable of verifying that the messages received comply with the security policy of the sender; it must decrypt those parts of the messages that must be recomposed to match the structure accepted by the receiver; and it must finally send the messages encrypted and authenticated as expected by the partner. We will use security contract terms (contract terms, for short) to describe such restrictions over WS-Security messages.

7.3.1 Security contract terms

By encoding WS-Security descriptions into Types, we know the types of the message (e.g., which part of the message is a key, is encrypted or is a hash value) but we lose all the references between the elements of the message (such as which part of the message contains the key which encrypts another message, or where the digest of a given argument is). However, we retain this capability using symbolic parameters in security adaptation contracts. Symbolic parameters enable us not only to relate arguments among different services but also to relate arguments of the same service, even within the same message (which is useful for relating arguments with their signatures or hash values). Types where all the basic types, and possibly some structured types (BT and Types, respectively), are replaced by symbolic parameters (or parameters, for short) are called contract terms.

7.3. SECURITY ADAPTATION CONTRACTS

Definition 7.3.1 (Security contract term) A security contract term (or contract term, for short) is an expression made of cryptographic constructors and annotated symbolic parameters. Contract terms respect the following grammar:

$$T\ (\in CTerm) ::= P \mid Pk(T) \mid Hash(T) \mid (T_1, \ldots, T_n) \mid Enc(T_1, T_2) \mid AEnc_c(T_1, T_2) \mid AEnc_d(T_1, T_2)$$

where P ∈ Param is a (possibly annotated) symbolic parameter.

For the inference system of our running example (Table 7.2), the asymmetric encryption constructor needs to be annotated with whichever of the rules for encryption (AEnc_c, rule 2) or decryption (AEnc_d, rules 7 and 8) needs to be applied. For hashing and symmetric encryption, the syntax alone determines which inference rules must be applied, depending on the direction of the action and the annotation of the symbolic parameters within.

Symbolic parameters can be annotated to express how they are updated, verified and used for composing cryptographic messages. Parameters that represent previously known values are annotated with ‘∧’. These parameters are called known parameters. Outgoing messages typically have most of their parameters annotated as known parameters since the values corresponding to the arguments of a message must be available to the adaptor before the message is sent. Known parameters are also used when part of a received message should be compared with some previously known data. In order to keep track of stored values, the adaptor includes an environment.

Some arguments have values that must be generated by the adaptor (such as new keys, timestamps and nonces); these are represented by parameters annotated with ‘∗’. We call these parameters instantiated parameters. Once instantiated, the newly generated value is also stored in the environment. All other, unannotated parameters are called fresh parameters and their values will be updated in the environment when a matching message is received. Intuitively speaking, fresh parameters are those whose values are going to be received during a communication whereas instantiated parameters are those whose values must be created by the adaptor on its own (normally, before an output action).

If the same parameter occurs more than once in the same contract term, all its occurrences should represent the same value.
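The grammar and the three kinds of parameters can be made concrete with a small data model. The following is a minimal sketch, not the thesis' implementation: the tagged-tuple encoding, the helper names `P` and `params`, and the ASCII annotation markers are all assumptions introduced for illustration.

```python
# Illustrative encoding of Definition 7.3.1 as tagged tuples (names are assumptions):
#   ("P", name, ann)                  symbolic parameter
#   ("Pk", T), ("Hash", T)            public key and digest
#   ("Tup", T1, ..., Tn)              tuple
#   ("Enc", K, M)                     symmetric encryption
#   ("AEncC", K, M), ("AEncD", K, M)  asymmetric encryption / decryption
FRESH, KNOWN, INST = "", "^", "*"     # unannotated, '∧' and '∗'

def P(name, ann=FRESH):
    """Build a (possibly annotated) symbolic parameter."""
    return ("P", name, ann)

def params(term, ann=None):
    """Collect the (name, annotation) pairs in a term, optionally by annotation."""
    if term[0] == "P":
        return {term[1:]} if ann in (None, term[2]) else set()
    found = set()
    for sub in term[1:]:
        found |= params(sub, ann)
    return found

# Enc(K∧, (D, N∗)): known key, fresh payload, nonce generated by the adaptor.
term = ("Enc", P("K", KNOWN), ("Tup", P("D"), P("N", INST)))
print(sorted(params(term)))
print(params(term, KNOWN))
```

Keeping the annotation inside the parameter makes it easy to ask which values must already be stored (known), which must be generated (instantiated) and which will be learnt on reception (fresh).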



Table 7.4: Notation used for secure services and SACs

Domain  Elem.  Labels  Example expression            Definition
Chan    c      NA      a:receive                     Section 7.2.2
Param   P      NA      K∧                            Definition 7.3.1
BT      BT     NA      Data                          Section 7.2.2
Types   𝒯      NA      Enc(Key, Data)                Section 7.2.2
CTerm   T      NA      Enc(K∧, D)                    Definition 7.3.1
Msgs    m      NA      x : Enc(Key, Data)            Section 7.2.2
Σi      α      α ↦     a:receive?Enc(Key, Data)      Definition 7.2.1
Act     a      a →     a:receive?x : Enc(Key, Data)  Section 7.2.2
Σa      α      α ↪c    a:receive?Enc(K∧, D)          Section 2.2
Σc      v      · →c    a:refused?N∧ ♦ b:denied!      Definition 7.3.6

(In the Labels column, the symbol preceding each arrow is the label written over it.)

Example 7.3.1 The following contract term is used for composing secured messages which match with the WS-Security example in Listing 7.1.

$$\begin{aligned}
T_0 = {} & T,\ I,\ Pk(S),\ AEnc_c(A, Hash(I, Pk(S))), && (7.5)\\
& Enc(K, L), && (7.6)\\
& Hash(T),\ Hash(B),\ AEnc_c(S, Hash(Hash(T), Hash(B))), && (7.7)\\
& Enc(L, B) && (7.8)
\end{aligned}$$

Types (like 𝒯0 in Example 7.2.2) describe the security structure and types of the message, whereas contract terms (such as T0) relate the parts of the message to each other while respecting the structure imposed by types. For instance, S in T0 is the parameter for the key in the X.509 certificate and it is used three times: first to specify the actual key in 7.5, then to refer to its digest within the certificate in 7.5, and finally to specify that the key is used in the signature of the digests of the timestamp and the body in 7.7. Similarly, the encrypted key L (7.6) is used a second time to specify that the body of the message is encrypted with that key in 7.8. Symbolic parameters T, B and I are referenced several times by their digests.

Example 7.3.2 Table 7.4 shows a summary of the notation (with examples) of the different expressions used for security adaptation.

Example 7.3.3 Below there are three actions with contract terms and their corresponding types.

$$\begin{aligned}
&send\_hash!D, Hash(D, S^{\wedge}); &\quad& send\_hash!Data, Hash(Data, Secret) && (7.9)\\
&send?D^{\wedge}; && send?Data && (7.10)\\
&denied?Enc(K^{\wedge}, N^{*}); && denied?Enc(Nonce, Nonce) && (7.11)
\end{aligned}$$

The action in 7.9 is received by the adaptor. Recall that actions in SACs have the direction of the messages as in the communicating service, which means that output actions (!) and input actions (?) are, from the point of view of the adaptor, received and sent, respectively. Action 7.9 occurs on channel send_hash and its arguments should be matched by the contract term D, Hash(D, S∧). Such a contract term receives and stores under parameter D whatever is matched with the first argument. Then, it verifies that the hash of both the value received in D and some previously known information S∧ (which is annotated to be known) is equal to whatever is received as a second argument. Remember that every occurrence of the same parameter should represent the same value. In order to be known, the value of S must have been either: i) received in a previous input action as a fresh parameter; ii) instantiated by the adaptor in a previous output action; or iii) part of the initial environment of the adaptor. In any case, the adaptor must already know its value in order to verify the received message. The input action in 7.10 states that a new message must be composed with the previously known parameter D∧ and emitted through channel send. Finally, the input action denied (7.11) encrypts with a known key (K∧) a nonce which is generated and stored by the adaptor (N∗).
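The handling of actions 7.9 and 7.10 can be sketched at run time as follows. This is only an illustration under stated assumptions: SHA-256 over length-prefixed parts stands in for the abstract Hash constructor, the environment κ is a plain dictionary, and the function names are hypothetical.

```python
import hashlib
import hmac

def digest(*parts: bytes) -> bytes:
    """Stand-in for Hash(...): SHA-256 over length-prefixed parts (an assumption)."""
    h = hashlib.sha256()
    for part in parts:
        h.update(len(part).to_bytes(4, "big") + part)
    return h.digest()

def receive_send_hash(kappa: dict, message: tuple) -> dict:
    """Handle action 7.9: match a message against the term D, Hash(D, S^)."""
    data, mac = message
    secret = kappa["S"]                      # S^ must already be in the environment
    if not hmac.compare_digest(mac, digest(data, secret)):
        raise ValueError("digest does not match Hash(D, S)")
    return {**kappa, "D": data}              # the fresh parameter D is stored

def compose_send(kappa: dict) -> tuple:
    """Handle action 7.10: compose the message for the term D^."""
    return (kappa["D"],)

kappa0 = {"S": b"shared-secret"}             # initial environment (hypothetical value)
message = (b"payload", digest(b"payload", b"shared-secret"))
kappa1 = receive_send_hash(kappa0, message)
print(compose_send(kappa1))
```

The fresh parameter D only enters the environment after the digest check succeeds, so a later known occurrence D∧ always refers to verified data.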

An environment E = ⟨θ, κ⟩ stores the types and values represented by the parameters with two substitutions: θ, which substitutes parameters with their corresponding type, and κ, which replaces parameters with their known run-time value. Environments are extended with the infix operator ‘’. For instance, κ κ′ represents a new substitution where substitution κ′ takes precedence over κ or, in other words, every parameter which occurs in κ′ is replaced by the values included in κ′, and then substitution κ is applied on the remaining parameters. Symbolic parameters (P ∈ Param) are evaluated by the substitutions in the environment to obtain their corresponding types (θ(P)) and run-time values (κ(P)). However, these substitutions can be restricted to annotated parameters with a superscript (e.g., θ^f, θ^∧ and θ^∗ substitute only fresh, known and instantiated parameters, respectively). Function pm(T), which returns the set of parameters present in contract term T, can also be restricted to annotated parameters in the same way, thus resulting in pm^f(T), pm^∧(T) and pm^∗(T).

Contract terms can use different cryptographic constructors and inference systems provided that they unambiguously represent which inference rules need to be applied to decompose and compose the messages. The adaptor must compose messages according to the type expected by the destination service. Analogously, the adaptor will receive and decompose messages according to the type expected from the sender. Messages are composed and received based on a given contract term; therefore, we first define how to relate contract terms with types by replacing parameters with their types, and then we define when a given contract term is able to match a type using the previous relation.

Definition 7.3.2 (Contract term typing) The type (∈ Types) corresponding to a contract term T with regard to a substitution θ : Param → Types, denoted by [T]θ, is defined as follows.

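$$\begin{aligned}
[P]_\theta &\triangleq \theta(P)\\
[Pk(T_1)]_\theta &\triangleq Pk([T_1]_\theta)\\
[Hash(T_1)]_\theta &\triangleq Hash([T_1]_\theta)\\
[Enc(T_1, T_2)]_\theta &\triangleq Enc([T_1]_\theta, [T_2]_\theta)\\
[AEnc_c(T_1, T_2)]_\theta &\triangleq AEnc([T_1]_\theta, [T_2]_\theta)\\
[AEnc_d(T_1, T_2)]_\theta &\triangleq AEnc(pair([T_1]_\theta), [T_2]_\theta)\\
[(T_1, \ldots, T_n)]_\theta &\triangleq ([T_1]_\theta, \ldots, [T_n]_\theta)
\end{aligned} \qquad ([T]_\theta)$$

where Ti ∈ CTerm, i ∈ {1, . . . , n}; P ∈ Param; and θ : Param → Types. Function pair returns the public/private pair of the given key type.

$$pair(\mathcal{T}) \triangleq \begin{cases} K & \text{if } \mathcal{T} = Pk(K)\\ Pk(\mathcal{T}) & \text{otherwise} \end{cases} \qquad (pair)$$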

Note that we used the appropriate pair of keys to generate the type in asymmetric cryptography primitives due to the duality AEncc /AEncd in contract terms, which is not present in types. The previous definition can be naturally extended to obtain the interface of adaptor actions (elements of Σa ).

$$[c!T]_\theta = c![T]_\theta \qquad\qquad [c?T]_\theta = c?[T]_\theta$$
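Definition 7.3.2 translates almost line by line into a recursive function. The sketch below assumes an illustrative encoding that is not part of the thesis: contract terms and types are tagged tuples, basic types are plain strings, and `type_of` and `P` are hypothetical helper names.

```python
def P(name, ann=""):
    """Symbolic parameter; ann is "", "^" (known) or "*" (instantiated)."""
    return ("P", name, ann)

def pair(ty):
    """Public/private counterpart of a key type, as in function pair."""
    return ty[1] if isinstance(ty, tuple) and ty[0] == "Pk" else ("Pk", ty)

def type_of(term, theta):
    """Compute [term]_theta, where theta maps parameter names to types."""
    tag = term[0]
    if tag == "P":
        return theta[term[1]]
    if tag in ("Pk", "Hash"):
        return (tag, type_of(term[1], theta))
    if tag == "Enc":
        return ("Enc", type_of(term[1], theta), type_of(term[2], theta))
    if tag == "AEncC":
        return ("AEnc", type_of(term[1], theta), type_of(term[2], theta))
    if tag == "AEncD":
        # The term names the decryption key; the type names the encryption key.
        return ("AEnc", pair(type_of(term[1], theta)), type_of(term[2], theta))
    if tag == "Tup":
        return ("Tup",) + tuple(type_of(t, theta) for t in term[1:])
    raise ValueError(f"unknown constructor {tag!r}")

# AEnc_d(K^, D) with K typed Key and D typed Data (illustrative values).
term = ("AEncD", P("K", "^"), P("D"))
print(type_of(term, {"K": "Key", "D": "Data"}))
```

Only the AEncD case differs from a plain structural traversal: it swaps the key type for its counterpart, which is how the AEnc_c/AEnc_d duality of contract terms collapses into the single AEnc constructor of types.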

Definition 7.3.3 (Contract term matching) A contract term T matches a type 𝒯, denoted by T ⊢ 𝒯, if and only if a substitution θ : Param → Types exists such that [T]θ = 𝒯.

As the following result shows, if T ⊢ 𝒯, the substitution θ verifying [T]θ = 𝒯 is unique.

Proposition 7.3.1 Given a contract term T which matches a type 𝒯 (T ⊢ 𝒯), if two substitutions θ1, θ2 enable this match ([T]θ1 = [T]θ2 = 𝒯), then θ1(P) = θ2(P) for all P ∈ pm(T).

Thus, we will denote by θ_{T,𝒯} the unique substitution such that T ⊢ 𝒯 and dom(θ_{T,𝒯}) = pm(T).

Example 7.3.4 Let T1 and 𝒯1 be a contract term and a type, respectively, given by:

$$\begin{aligned}
T_1 &= AEnc_d(PK, Enc(C, K)),\ O\\
\mathcal{T}_1 &= AEnc(Akey, Enc(Key, Key)),\ AEnc(Akey, Enc(Key, Key))
\end{aligned}$$

Then, we can easily derive that T1 ⊢ 𝒯1 with the following substitution:

$$\theta_{T_1,\mathcal{T}_1} = \{Pk(Akey)/PK,\ Key/C,\ Key/K,\ AEnc(Akey, Enc(Key, Key))/O\}$$

It is worth noting how the constructor AEnc_d in T1 loses its subindex, and its key (PK, with θ_{T1,𝒯1}(PK) = Pk(Akey)) is replaced with its private pair:

$$pair([PK]_{\theta_{T_1,\mathcal{T}_1}}) = Akey$$

In this way, even though the contract term expresses the asymmetric key that is necessary to decrypt the message, the same contract term is rewritten and compared with the type which composed that message (𝒯1).
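Matching can be sketched as a small unification procedure that builds θ while walking the term and the type together. The code below is a hedged illustration, not the thesis' algorithm: it uses a hypothetical tagged-tuple encoding (basic types as strings) and assumes key types are never nested as Pk(Pk(...)), so that pair can be applied in reverse to recover the type of a decryption key.

```python
def P(name, ann=""):
    """Symbolic parameter; ann is "", "^" (known) or "*" (instantiated)."""
    return ("P", name, ann)

def pair(ty):
    """Public/private counterpart of a key type."""
    return ty[1] if isinstance(ty, tuple) and ty[0] == "Pk" else ("Pk", ty)

def match(term, ty, theta):
    """Extend theta so that [term]_theta == ty; return False if impossible."""
    tag = term[0]
    if tag == "P":
        # Every occurrence of a parameter must receive the same type.
        return theta.setdefault(term[1], ty) == ty
    expected = "AEnc" if tag in ("AEncC", "AEncD") else tag
    if not isinstance(ty, tuple) or ty[0] != expected or len(ty) != len(term):
        return False
    subtypes = list(ty[1:])
    if tag == "AEncD":
        subtypes[0] = pair(subtypes[0])   # the term names the decryption key
    return all(match(t, s, theta) for t, s in zip(term[1:], subtypes))

def substitution(term, ty):
    """Return the substitution enabling the match, or None if there is none."""
    theta = {}
    return theta if match(term, ty, theta) else None

# Example 7.3.4: T1 = AEnc_d(PK, Enc(C, K)), O
T1 = ("Tup", ("AEncD", P("PK"), ("Enc", P("C"), P("K"))), P("O"))
msg_type = ("AEnc", "Akey", ("Enc", "Key", "Key"))
print(substitution(T1, ("Tup", msg_type, msg_type)))
```

Run on Example 7.3.4, the sketch binds PK to Pk(Akey), C and K to Key, and O to the whole second component, which agrees with the substitution given in the example.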



7.3.2 Validation of contract terms

Security adaptation contracts are easier to design than programming adaptors from scratch. Nevertheless, SACs must be written with care [CSCO09]; otherwise they might be invalid or result in empty adaptors. This is particularly important with SACs since we are able to write contract terms that do not match their corresponding types (covered in Definition 7.3.3), or contract terms which contain parameters that cannot be matched on input actions because their values might be obfuscated by security. In order to prevent the latter we define the reachability of parameters and, at the same time, we define valid contract terms.

Definition 7.3.4 (Parameter reachability) The function reach : Param × CTerm × CTerm → Boolean returns whether or not a parameter P can be obtained from a message that complies with the given contract term.

$$reach(P, T, T') \triangleq \begin{cases}
True & \text{if } T = P\\
reach(P, T_2, T') \wedge \forall P' \in pm^f(T_1).\ reach(P', T', T') & \text{if } T = Enc(T_1, T_2)\\
reach(P, T_2, T') \wedge \forall P' \in pm^f(T_1).\ reach(P', T', T') & \text{if } T = AEnc_d(T_1, T_2)\\
\bigvee_{i=1}^{n} reach(P, T_i, T') & \text{if } T = (T_1, \ldots, T_n)\\
False & \text{otherwise}
\end{cases}$$

where T, T′ ∈ CTerm.

Function reach is used on the fresh parameters in input actions, therefore there is no need for including AEnc_c in the definition. It is worth noting the role played by the third argument (T′), which represents the whole contract term of which the second argument is a part. This argument allows us to look for fresh parameters in other parts of the contract term. In this way, even though keys might be represented by contract terms with fresh parameters, these keys can still be obtained (therefore allowing us to keep looking inside the encrypted message) if those fresh parameters can be reached in any other part of the message. Thus, we will define reachable(P, T) ≜ reach(P, T, T).

Example 7.3.5 Given P, K ∈ Param:

$$\begin{aligned}
reachable(P, Enc(K, P)) &= False && (7.12)\\
reachable(P, (Enc(K, P), K)) &= True && (7.13)\\
reachable(P, Hash(P)) &= False && (7.14)\\
reachable(P, (Hash(P), K, AEnc_d(Pk(K), P))) &= True && (7.15)
\end{aligned}$$

In 7.12, parameter P is within a message encrypted with an unreachable key, therefore the message cannot be decrypted and the parameter is unreachable. If the key is given in any other part of the message (7.13), parameter P can be reached. Digests cannot be decomposed, so P cannot be reached in 7.14. Finally, in 7.15, we can reach K, therefore we can obtain its public key and then reach P.
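The four cases of Example 7.3.5 can be checked with a direct transcription of Definition 7.3.4. The sketch below relies on assumptions that are not in the thesis: terms are tagged tuples, the names `fresh`, `reach` and `reachable` are illustrative, and a `pending` set is added so that a key which is only available inside its own encryption (e.g., Enc(K, K)) yields False instead of recursing forever.

```python
def P(name, ann=""):
    """Symbolic parameter; ann is "", "^" (known) or "*" (instantiated)."""
    return ("P", name, ann)

def fresh(term):
    """pm^f: names of the unannotated (fresh) parameters of a term."""
    if term[0] == "P":
        return {term[1]} if term[2] == "" else set()
    return set().union(*(fresh(t) for t in term[1:]))

def reach(p, term, whole, pending=frozenset()):
    """Can the fresh parameter named p be extracted from term within whole?"""
    tag = term[0]
    if tag == "P":
        return term == ("P", p, "")
    if tag in ("Enc", "AEncD"):
        key, body = term[1], term[2]
        return reach(p, body, whole, pending) and all(
            k not in pending and reach(k, whole, whole, pending | {k})
            for k in fresh(key)
        )
    if tag == "Tup":
        return any(reach(p, t, whole, pending) for t in term[1:])
    return False          # Pk, Hash and AEncC are not decomposed on reception

def reachable(p, term):
    return reach(p, term, term)

print(reachable("P", ("Enc", P("K"), P("P"))))                        # 7.12
print(reachable("P", ("Tup", ("Enc", P("K"), P("P")), P("K"))))       # 7.13
print(reachable("P", ("Hash", P("P"))))                               # 7.14
print(reachable("P", ("Tup", ("Hash", P("P")), P("K"),
                      ("AEncD", ("Pk", P("K")), P("P")))))            # 7.15
```

Passing the whole term as the third argument is what lets the key K in 7.13 and 7.15 be found in a sibling component of the tuple.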

We now define two notions of valid contract term over a given type 𝒯 depending on whether the given contract term T is used for output (validSend) or input (validRec) actions in the adaptor. Contract term T is considered valid only if T ⊢ 𝒯. Additionally, on input actions, every fresh parameter must be reachable and, on output actions, each parameter must be annotated to be known or instantiated. Instantiations must assign values to parameters according to their types.

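Definition 7.3.5 (Valid Contract Term) A contract term T is considered valid with regard to a type 𝒯 and the direction of the action if:

$$validSend(T, \mathcal{T}) \text{ iff } \begin{cases} T \vdash \mathcal{T},\\ pm^f(T) = \emptyset,\\ \forall P \in pm^*(T).\ \theta_{T,\mathcal{T}}(P) \in BT \end{cases} \qquad (validSend)$$

$$validRec(T, \mathcal{T}) \text{ iff } \begin{cases} T \vdash \mathcal{T},\\ pm^*(T) = \emptyset,\\ \forall P \in pm^f(T).\ reachable(P, T) \end{cases} \qquad (validRec)$$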


Example 7.3.6 Given

$$\begin{aligned}
\mathcal{T}_2 &= Enc(key, \text{``reply''}, data, Hash(data))\\
T_2 &= Enc(C^{\wedge}, \text{``reply''}, D^{\wedge}, Hash(D^{\wedge}))\\
T_5 &= Enc(C^{\wedge}, \text{``reply''}, D^{\wedge}, Hash(D))\\
T_6 &= Enc(C, \text{``reply''}, D^{\wedge}, Hash(D^{\wedge}))\\
T_7 &= Enc(C^{\wedge}, \text{``reply''}, D, Hash(D^{\wedge}))\\
T_8 &= Enc(C^{*}, \text{``reply''}, D^{\wedge}, Hash(D^{\wedge}))\\
T_9 &= Hash(D^{\wedge})
\end{aligned}$$

then

Term  validRec(Term, 𝒯2)  validSend(Term, 𝒯2)
T2    True                True
T5    False               False
T6    False               False
T7    True                False
T8    False               True
T9    False               False
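Definition 7.3.5 can be exercised on a variant of Example 7.3.6. The sketch below is illustrative only and makes several assumptions: terms and types are tagged tuples, basic types (BT) are plain strings, key types are not nested Pk(Pk(...)), and the “reply” constant is dropped because constants are not part of the grammar in Definition 7.3.1. The key and payload are typed with the hypothetical basic types Key and Data.

```python
FRESH, KNOWN, INST = "", "^", "*"

def P(name, ann=FRESH):
    return ("P", name, ann)

def pair(ty):
    return ty[1] if isinstance(ty, tuple) and ty[0] == "Pk" else ("Pk", ty)

def match(term, ty, theta):
    """Extend theta so that [term]_theta == ty; return False if impossible."""
    tag = term[0]
    if tag == "P":
        return theta.setdefault(term[1], ty) == ty
    expected = "AEnc" if tag in ("AEncC", "AEncD") else tag
    if not isinstance(ty, tuple) or ty[0] != expected or len(ty) != len(term):
        return False
    subtypes = list(ty[1:])
    if tag == "AEncD":
        subtypes[0] = pair(subtypes[0])
    return all(match(t, s, theta) for t, s in zip(term[1:], subtypes))

def params(term, ann):
    """Names of the parameters of a term carrying the given annotation."""
    if term[0] == "P":
        return {term[1]} if term[2] == ann else set()
    return set().union(*(params(t, ann) for t in term[1:]))

def reach(p, term, whole, pending=frozenset()):
    tag = term[0]
    if tag == "P":
        return term == ("P", p, FRESH)
    if tag in ("Enc", "AEncD"):
        return reach(p, term[2], whole, pending) and all(
            k not in pending and reach(k, whole, whole, pending | {k})
            for k in params(term[1], FRESH))
    if tag == "Tup":
        return any(reach(p, t, whole, pending) for t in term[1:])
    return False

def valid_send(term, ty):
    theta = {}
    return (match(term, ty, theta) and not params(term, FRESH)
            and all(isinstance(theta[n], str) for n in params(term, INST)))

def valid_rec(term, ty):
    return (match(term, ty, {}) and not params(term, INST)
            and all(reach(n, term, term) for n in params(term, FRESH)))

def reply(c_ann, d_ann, h_ann):
    """Enc(C, (D, Hash(D))) with the given annotations on C, D and the hashed D."""
    return ("Enc", P("C", c_ann), ("Tup", P("D", d_ann), ("Hash", P("D", h_ann))))

ty2 = ("Enc", "Key", ("Tup", "Data", ("Hash", "Data")))
terms = {"T2": reply(KNOWN, KNOWN, KNOWN), "T5": reply(KNOWN, KNOWN, FRESH),
         "T6": reply(FRESH, KNOWN, KNOWN), "T7": reply(KNOWN, FRESH, KNOWN),
         "T8": reply(INST, KNOWN, KNOWN), "T9": ("Hash", P("D", KNOWN))}
for name, term in terms.items():
    print(name, valid_rec(term, ty2), valid_send(term, ty2))
```

Under these assumptions the printed rows agree with the table of Example 7.3.6: for instance, T7 is valid only for reception because D is fresh but reachable, and T8 is valid only for sending because its key is instantiated.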

7.3.3 SAC

SACs are an evolution of the adaptor specifications presented in [BCP06] and extended with value-passing in [MPS08]. The contribution of SACs is the inclusion of cryptographic structures in the actions, which allows the receiving, verifying and composing of cryptographic messages. SACs allow contract designers to focus on matching the semantics behind arguments and operations, and forget about concurrency and secrecy issues. These problems are automatically addressed by the synthesis of secure adaptors presented in Chapter 8.

An adaptor must react to the messages it receives. Contract terms helped to receive and send individual messages. We now use security adaptation vectors to relate received messages to their corresponding replies from the adaptor. These vectors are just a particularization of the adaptation vectors seen in Section 2.1 where actions use contract terms.

Definition 7.3.6 (Security adaptation vector) A security adaptation vector (or vector, for short) is denoted as c!T, c?T, c?T ♦ c′!T′ or c!T ♦ c′?T′ where c and c′ are channels and T, T′ are contract terms.

We can unequivocally identify each side of a vector by its direction, e.g., given v = c?T ♦ c′!T′, then ?v = c?T and !v = c′!T′. In addition, we can naturally extend the previous operator ‘[·]·’ to obtain the interface corresponding to vector components: [c!T]θ ≜ c![T]θ. The intuition behind vectors with two actions is that, whenever the adaptor receives an action matching the left-hand side of the vector, it must eventually send the action in the right-hand side of the vector. Vectors with one action can be used as needed by the adaptor. Vectors can interleave or, in other words, we can apply additional vectors between the input and output actions of a two-sided vector.

Example 7.3.7 The following vectors correlate actions with various contract terms.

$$\begin{aligned}
&send\_hash!D, Hash(D, S^{\wedge}) \;\Diamond\; send?D^{\wedge} && (7.16)\\
&send\_enc!AEnc_d(K^{\wedge}, D) \;\Diamond\; send?D^{\wedge} && (7.17)
\end{aligned}$$

Vector 7.16 relates the actions 7.9 and 7.10 of Example 7.3.3. When the adaptor receives the former, it must reply with the latter. Vector 7.17 first receives action send_enc and decrypts its single argument with a previously known private key K∧ (typed Key). Rule 8 of the inference system IS (Table 7.2) is used for this decryption and its result is stored under parameter D. This is forwarded as expected using the second half of the vector through action send.

Security adaptation contracts can enforce some requirements over the adaptation such as “a particular message must not be sent more than x times” or “an operation A will be (un)available until the operation B is called”. These requirements constrain the application order of the interactions expressed by vectors. In order to represent such high-level requirements, adaptation contracts are state machines whose transitions are labelled with synchronisation vectors.

Definition 7.3.7 (Security adaptation contract) A Security Adaptation Contract (SAC) c is a FSM plus an environment ⟨Σc, S, s0, F, →c, E⟩ where Σc is a set of security vectors, S is the set of states, s0 ∈ S is the initial state, F ⊆ S is the set of final states, and →c ⊆ (S × Σc × S) is a transition relation.

Intuitively speaking, a SAC can be understood as a mapping between the different security policies of several services, through the information (e.g., keys and certificates) and restrictions required to support the secure interaction between the services, despite their initial incompatibilities.
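The tuple of Definition 7.3.7 maps naturally onto a small state-machine class. The following sketch is only an illustration: vectors are reduced to their names, the class and method names are hypothetical, and the two-state contract at the end (vectors named v1 and v2) is an assumed example of an ordering constraint.

```python
from dataclasses import dataclass, field

@dataclass
class SAC:
    vectors: set          # Sigma_c, represented here by vector names only
    states: set           # S
    initial: str          # s0
    finals: set           # F
    transitions: set      # triples (source, vector, target)
    env: tuple = field(default_factory=lambda: ({}, {}))   # E = (theta, kappa)

    def enabled(self, state):
        """Vectors that the contract allows in the given state."""
        return {v for (s, v, _) in self.transitions if s == state}

    def accepts(self, trace):
        """True if applying the vectors in order can end in a final state."""
        current = {self.initial}
        for vector in trace:
            current = {t for (s, v, t) in self.transitions
                       if s in current and v == vector}
        return bool(current & self.finals)

# Two vectors that must alternate, starting with v1.
contract = SAC(
    vectors={"v1", "v2"},
    states={"s0", "s1"},
    initial="s0",
    finals={"s0"},
    transitions={("s0", "v1", "s1"), ("s1", "v2", "s0")},
    env=({"D": "Data", "S": "Secret", "K": "Key"}, {"S": b"s", "K": b"k"}),
)
print(contract.enabled("s0"))
print(contract.accepts(["v1", "v2"]))
```

Tracking a set of current states keeps the sketch correct even when the transition relation is non-deterministic, which the definition does not rule out.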

Example 7.3.8 The SAC c6 = ⟨Σc6, {s0, s1}, s0, {s0}, T_c, E6⟩ is able to apply multi-factor authentication between three incompatible services, where

$$\begin{aligned}
\Sigma_{c_6} &= \{\, send\_hash!D, Hash(D, S^{\wedge}) \;\Diamond\; , && (v_1)\\
&\phantom{= \{\,} send\_enc!AEnc_d(K^{\wedge}, D^{\wedge}) \;\Diamond\; send?D^{\wedge} \,\} && (v_2)\\
T_c &= \{\, s_0 \xrightarrow{v_1}_c s_1,\ s_1 \xrightarrow{v_2}_c s_0 \,\}\\
E_6 &= \langle \{Data/D, Secret/S, Key/K\},\ \{s/S, k/K\} \rangle
\end{aligned}$$

The key difference with respect to Example 7.3.7 is that we do not send the data when we receive the first message (v1) but, instead, we wait for the second message (v2) and we check that the data received in both messages is the same to do multi-factor authentication (using the known parameter D∧ in v2). The transition relation of the contract (T_c) enforces that the vectors are applied in the order previously mentioned.

Example 7.3.9 The service interfaces of our running example can be adapted using the security adaptation contract presented in Fig. 7.7. For this contract we are not going to make any distinction of sides for the vector or, in other words, the isolation between services is enforced only by the renaming of the operations, not by the operational semantics of the SAC. The contract supports both the hash and public key protocols from service 1 (Fig. 7.5(a)). If service 1 decides to make the request through hash, this is received by vector vhs. This vector allows requests coming from a service identified by I and it checks that the hash of the request (Hash(I, R, NA, S∧)) corresponds to what is received and the secret previously known by the adaptor. Once the reception is verified, the adaptor compliant with this contract should eventually respond by sending a request to service 2 (Fig. 7.5(b)). Such a response is composed using the received values, i.e., the nonce of service 1 (NA∧) and the request R∧. Service 2 might either respond with the data D with vector vrep or reject the request via 2:no_access in vdny. Either way, it also provides the nonce NA∧ received with the request from service 1. This nonce and the secret (s_{1,2}, which substitutes parameter S∧) are used to encrypt and send the data to service 1 on the right-hand side of vrep. If, on the contrary, the request is denied, what is encrypted and sent is a new nonce instantiated by the adaptor (i.e., N∗) using vector vdny.
A similar process happens if service 1 goes through the public-key protocol starting with 1:pk_auth in vector vpk. In this case, however, the request Q is directly understood by service 2, therefore no verification or recomposition is needed from the adaptor, hence the request is directly forwarded to service 2. Then, service 2 might deny the request (vno) or otherwise reply with the data using vector vsgn. The right-hand side of this vector delegates to service 3 (Fig. 7.5(c)) the signing of the data with its private key. These signed data are received in vector vok and properly encrypted and composed to reply to service 1. Note that the nonce which has to be instantiated when composing 1:reply in vok (i.e., NB∗) is the same for both occurrences of the symbolic parameter. Finally, service 1 will acknowledge the response with ACK in vack. By using simply ACK to match the acknowledgement we just receive it and ignore it. Alternatively, this message could have been verified to correspond to the previously sent nonce NB using a contract term AEnc_c(KB∧, NB∧). Such a contract term encrypts and compares because the contract environment ⟨θ, κ⟩ does not possess the private key of service 2 to be able to decrypt the message with AEnc_d(PKB∧, NB∧) (where PKB should represent the private key of service 2 in the environment). Vector vext allows the adaptor to finish the behaviour of service 3 if its signature is not needed because the orchestration went through the hash-based schema.

$$\begin{aligned}
\Sigma_{c_7} = \{\ & 1{:}hash!I, R, NA, Hash(I, R, NA, S^{\wedge}) \;\Diamond\; 2{:}request?AEnc_c(KB^{\wedge}, I^{\wedge}, NA^{\wedge}, R^{\wedge}); && (v_{hs})\\
& 2{:}reply!NA^{\wedge}, D \;\Diamond\; 1{:}reply?Enc((NA^{\wedge}, S^{\wedge}), D^{\wedge}); && (v_{rep})\\
& 2{:}no\_access!NA^{\wedge} \;\Diamond\; 1{:}denied?Enc((NA^{\wedge}, S^{\wedge}), N^{*}); && (v_{dny})\\
& 1{:}pk\_auth!Q \;\Diamond\; 2{:}request?Q^{\wedge}; && (v_{pk})\\
& 2{:}reply!NA^{\wedge}, D \;\Diamond\; 3{:}sign?D^{\wedge}; && (v_{sgn})\\
& 2{:}no\_access!NA^{\wedge} \;\Diamond\; 1{:}denied?Enc(NA^{\wedge}, N^{*}); && (v_{no})\\
& 3{:}signed!P \;\Diamond\; 1{:}reply?AEnc_c(KA^{\wedge}, NA^{\wedge}, NB^{*}), Enc((NA^{\wedge}, NB^{*}), P^{\wedge}); && (v_{ok})\\
& 1{:}ack!ACK \;\Diamond\; ; && (v_{ack})\\
& \Diamond\; 3{:}exit?;\ \} && (v_{ext})
\end{aligned}$$

(a) Set of security vectors for the contract

$$c_7 = \langle \Sigma_{c_7}, \{s_0\}, s_0, \{s_0\}, \{ s_0 \xrightarrow{v}_c s_0 \mid v \in \Sigma_{c_7} \}, \langle \theta, \kappa \rangle \rangle$$

where

$$\begin{aligned}
\theta = \{ & Pk(key)/KA,\ Pk(Key)/KB,\ Secret/S,\ Id/I,\ Req/R,\ Data/D,\\
& AEnc(Pk(Key), Id, Req, Nonce)/Q,\ AEnc(Pk(Key), Nonce)/ACK,\ Nonce/NA,\ Nonce/NB \}\\
\kappa = \{ & Pk(ka)/KA,\ Pk(kb)/KB,\ s_{1,2}/S \}
\end{aligned}$$

(b) Adaptation contract c7

Figure 7.7: SAC for services 1, 2 and 3

7.4 Summary

In this chapter we have defined SACs, used to specify the adaptation over WSs with stateful behaviour and security requirements. The goal of the adaptation is to make possible the cooperation of WSs within an orchestration while overcoming their initial incompatibilities at signature, behaviour and security QoS levels. SACs allow us to combine several WS-* security specifications in a single abstract notation and, at the same time, to solve possible incompatibilities among services. Such incompatibilities are common due to the tight coupling of WS-Security enabled services and their security restrictions. The formal model applied to SACs is based on a set of basic security primitives that support a wide range of security protocols and can be analysed by automatic cryptographic protocol verifiers, as we will see in Chapter 8.


8 Synthesis of secure security adaptors

[Chapter-opening comic: R. Munroe - xkcd.com]

Incompatible services have incompatible security policies and, therefore, use incompatible security protocols. In Chapter 7, we presented security adaptation contracts (SACs), which extended the adaptation contracts defined in Chapter 2 to support the adaptation between cryptographic protocols. In this way, we were able to specify the mapping between messages whose content could be encrypted, digested or signed. The first step towards security adaptation is to design the SAC, which overcomes the incompatibilities in signature, behaviour and QoS security. Then, a secure security adaptor must be automatically synthesised in compliance with such a SAC.

Regular adaptor synthesis (as seen in Chapter 4 for dynamic adaptation) has the goal of generating deadlock-free adaptors which comply with the adaptation contract. Security adaptors have the additional requirement that they have to be robust against security attacks. Therefore, this chapter is devoted to the synthesis of secure security adaptors. The synthesis of secure security adaptors requires the Crypto-CCS description of the services, a SAC and a secrecy property which we want to preserve. Thus, the goal of the synthesis process is to produce an adaptor that is compliant with the SAC, avoids deadlocks in the interactions with the services, and admits no attacker against the given secrecy property. We will use partial model-checking techniques for the security verification and refinement of security adaptors.

There are several other papers [LYR02, RCHP00, HCDB99, AL08, SCMS+09, ZZ04, KMNC05] that take advantage of adaptation to support and enhance security. Most of them [ZZ04, AL08, HCDB99, RCHP00, SCMS+09, KMNC05] are compositional approaches where security is dynamically changed depending on runtime QoS parameters and attack detection. These approaches focus on the performance vs security trade-off and, when certain triggering conditions are met (a mobile agent moves to a different host, an intruder is detected, or a possible buffer overflow attack is identified), they tighten the security. This is done by replacing the security components of the system with other previously known and configured ones. The approach presented in this chapter differs radically from theirs as we focus on the synthesis whereas they do not. Our work could be used in conjunction with theirs to adapt alternative security components at design time.

Li et al. [LYR02] presented an interesting approach for securing distributed adaptation. Their main focus is on data adaptation and on synthesising and executing a plan that allows the different parties to distributively apply a set of data transformations. They present an elaborate middleware, called Conductor, which is in charge of the planning. More interestingly, it also analyses, asserts and handles the trust between the different services and the different privacy requirements of each piece of information. They built this security schema based on several security box implementations. These security boxes wrap the different Conductor-enabled services to provide them with the appropriate cryptographic capabilities. Such security boxes are pre-designed but interchangeable at run time. Conductor presents a more high-level solution to a more specific problem. Our approach lacks their dynamic planning as we require static SACs. However, our security adaptors could be used to develop such security boxes with new security protocols. In addition, Crypto-CCS allows us to describe and verify the different deployment scenarios tackled by Conductor.

The synthesis of security adaptors is analogous to the automatic composition of services with security policies [CMR08]. In this related paper, the authors propose a two-staged approach. The AVISPA tool is run in each stage: first to obtain the protocol of the composition and second to verify that it preserves the desired security properties. The latter fits the functionality of AVISPA and, for the former, the authors converted the goal of the composition into another security property, therefore using the verifier to obtain an attacker which is, instead, the actual protocol of the composition. Although our work shares the motivation and the synthesis-and-verification approach, our proposal goes beyond theirs because i) our security adaptors are synthesised to be compliant with a given contract, so we have a finer control over the result; and ii) our approach includes a novel refinement stage where, if any attacker is discovered during the verification stage, the adaptor is automatically refined to be secure against such an attack.

Insofar as SACs describe the allowed interactions among the participants, controller synthesis [MM10, MPS05] is another area where security adaptors can be applied. In particular, the synthesis of security controllers is based on wrapping part of the system and only allowing certain actions to be visible outside that wrapper. Provided that the deployment scenario allows the adaptor to completely wrap part of the system, SACs can serve the same purpose.

Figure 8.1: Overview of the synthesis of secure security adaptors. [Diagram: the service processes (incompatible behaviour and security) and the contract feed the Synthesis step (Sect. 8.4), which yields a functional adaptor (deadlock and livelock free, contract compliant); the Verification step (Sect. 8.5) takes the secrecy property and yields the attacker (the most general attacker for the given adaptor, services and secrecy property); the Refinement step (Sect. 8.6) yields the secure adaptor (functional adaptor robust against attacks to the given secrecy property).]

8.1 Overview of the synthesis of security adaptors

Figure 8.1 shows an overview of the approach presented in this chapter. The inputs of our synthesis process are: i) services with incompatible behaviour and security QoS encoded in Crypto-CCS; ii) a SAC, i.e., a mapping between the interfaces of the services which specifies how the incompatibilities must be solved and which security checks must be enforced; and iii) a secrecy property to preserve, expressed in a logical language with knowledge operators.

The synthesis process is structured in three sequential steps. First, a functionally-correct security adaptor is synthesised based on the given contract and services. This adaptor is able to orchestrate the services in a way that solves their initial incompatibilities. However, the adapted system might be insecure against secrecy attacks. For this reason, in a second step, we verify whether the synthesised adaptor and the given services are robust with regard to the given secrecy property. Finally, if an attack exists, we proceed to refine the initial adaptor into a secure security adaptor. Hence, the output of the synthesis process presented in this chapter is an adaptor encoded in Crypto-CCS, able to orchestrate the services despite their incompatibilities, compliant with the SAC (which allows a fine-grained control over the result) and robust against secrecy attacks.

In this chapter, we extend Crypto-CCS for the verification and refinement stages because of its sound verification algorithm based on partial model checking. However, other verification tools could be employed such as ProVerif [Bla09], AVISPA [Vig06] or Spi calculus [AG97]. These approaches should be adapted to support the interfaces given by SACs (via Definition 7.2.2 and Theorem 7.2.1) and must provide the most general attacker (Theorem 8.5.3) so that they can take advantage of the functionally-correct synthesis (Section 8.4) and the refinement stages (Section 8.6) that are covered in this chapter.

8.1.1 Functionally-correct adaptors

SACs describe the adaptors that must be deployed in the middle of the communication between incompatible services to allow their successful interaction avoiding deadlocks and livelocks. Stateful services (such as WF or BPEL processes) are particularly prone to these undesired situations. An adaptation contract defines a mapping between the operations (and arguments) of the services, and the restrictions that must be obeyed to fulfil the functional and security requirements. The process to obtain an adaptor process P conforming to a given contract is called adaptor synthesis. The idea is to synthesise an adaptor compliant with a contract in such a way that the parallel composition (denoted with the operator ‘∥’) between the adaptor and the services S always reaches a satisfactory state.

$$P \parallel S \text{ always reaches a final/stable state.} \qquad (8.1)$$

At this point, P is a security adaptor process in the sense that it successfully reads and recomposes the cryptographic messages exchanged between services. These messages contain parts where different cryptographic operations have been applied. Encryption (symmetric and asymmetric), signature and hashing complicate the adaptor task as it has to be able to decrypt, verify, and re-encrypt messages as expected by the destination service. This message manipulation is described by the SAC.

8.1.2 Secure adaptors

A security adaptor process satisfying Equation 8.1 might not yet be secure. Although it manipulates cryptographic messages, it does not guarantee that this manipulation prevents the disclosure of private information. Therefore, in addition to being able to manipulate cryptographic messages according to the SAC, we would like the adapted system to be resilient against secrecy attacks or, in other words, the adapted system must not disclose sensitive information during its communications. This is expressed in the following equation.

$$\nexists X \text{ s.t. } P \parallel S \parallel X \models p_{sec} \qquad (8.2)$$

where ‘⊨’ is the logical satisfiability operator, p_sec is the logical formula which represents the secrecy attack to avoid and X is the possible attacker. As an example, p_sec can represent that the attacker X is able to eavesdrop the credit card number of S during its communication with P. There are multiple attack scenarios and Equation 8.2 denotes a general one where the network is under the control of the attacker. In this case, P cannot enforce any security on it because X can, among other things, completely isolate the adapted system S from the adaptor process P. An example is a phishing site, where the client believes it is accessing a trusted service but the actual service (possibly including an adaptor) has been impersonated by the attacker.

Figure 8.2: Scenarios for secure security adaptation. [Diagram: web services (WS) characterised by behavioural descriptions (BPEL, WF, ...), signature descriptions (WSDL) and security descriptions (WS-*, WS-Security), together with the contract and the adaptor; an attacker X_φ who knows some (sensitive) data φ; and three zones of trust labelled \L ((partial) restriction: can eavesdrop but not interfere), \\L′ ((partial) hiding: cannot eavesdrop nor interfere) and \L″ (complete restriction).]

Other scenarios can be envisioned depending on the actual deployment setting and the different zones of trust (see Fig. 8.2). In fact, we might want to restrict the power of the attacker in a way that the adaptor serves as a gateway to our system. Using the restriction operator (\), Equation 8.2 could be concretised into the following equation.


\[
\nexists X \ \text{s.t.}\ (P \parallel S) \backslash L \parallel X \models p_{sec} \tag{8.3}
\]

In Equation 8.3, the attacker is still able to eavesdrop on any communication in the system but it can only actively participate in those channels not present in $L$. Language $L$ does not need to be the complete set of actions between $S$ and $P$. If $L$ is empty, we revert to Equation 8.2 and, if $L$ contains every channel of $S$ and $P$, then we are modelling passive attackers, i.e., attackers which can only eavesdrop but not actively participate in the communication.

We can further restrict the power of the attacker by limiting the actions on which it can eavesdrop. In this case, a hiding operator ($\backslash\backslash$) is used to denote the actions which can be neither accessed nor eavesdropped on by the attacker. For instance, $(P \parallel S) \backslash\backslash L'$ is protected from attacks on synchronisations on the channels in $L'$. In addition, we only need the synchronisations made within the system, so we can restrict the whole expression with the complete language of actions ($\backslash L''$).

Let us highlight the fact that neither $L$ nor $L'$ has to cover the complete language of actions or, in other words, a single service can have different actions in each of the different layers of isolation from the attacker. This flexibility allows us to model scenarios which range from attacks where the attacker controls the network (including insider attacks) to the more limited external attackers. Summing up all the previous points, we reach our final equation, where the attacker might initially know some information $\phi$.

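\[
\nexists X \ \text{s.t.}\ ((P \parallel S) \backslash\backslash L' \, \backslash L \parallel X_\phi) \backslash L'' \models p_{sec} \tag{8.4}
\]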
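The layers of isolation introduced by restriction and hiding can be illustrated with a small sketch. It is not part of the formal model: the class `AttackerView`, the function `attacker_view` and the channel names are hypothetical, and the sketch only assumes the informal reading given above (channels in $L'$ are invisible to the attacker, channels in $L$ can be observed but not used, and every other channel is under the attacker's control).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttackerView:
    """What the attacker may do on one channel (illustrative model only)."""
    can_eavesdrop: bool
    can_interfere: bool


def attacker_view(channels, restricted, hidden):
    """Classify every channel of the adapted system for the attacker X.

    restricted -- the set L of Equation 8.4 (restriction operator)
    hidden     -- the set L' of Equation 8.4 (hiding operator)
    """
    views = {}
    for channel in channels:
        if channel in hidden:
            # Hidden channels can be neither observed nor used.
            views[channel] = AttackerView(False, False)
        elif channel in restricted:
            # Restricted channels can be observed but not used.
            views[channel] = AttackerView(True, False)
        else:
            # All remaining channels are under the attacker's control.
            views[channel] = AttackerView(True, True)
    return views


# Hypothetical channel names, chosen only for this illustration.
channels = {"public", "gateway", "internal"}
views = attacker_view(channels, restricted={"gateway"}, hidden={"internal"})
```

With empty `restricted` and `hidden` sets every channel is fully controlled by the attacker, which corresponds to the network attacker of Equation 8.2.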

The goal of this chapter is to synthesise an adaptor process $P$ which overcomes possible incompatibilities in signature, behaviour and security (Equation 8.1), while preserving Equation 8.4 and complying with a given SAC.

8.1.3 Motivational example

Let us consider two services Q and R which send certain data to a third service S. Each of these services has a single action in its interface, listed in Table 8.1. On the one hand, service Q sends the data along with its message authentication code (MAC) in a single message with the following type: send_hash!Data, Hash(Data, Secret). The MAC is the hash value (i.e., the result of a hash function whose inverse is infeasible to compute) of the pair consisting of the data and a secret. Both services previously shared that secret. On the other hand, service R uses asymmetric encryption to guarantee the integrity and secrecy of the data using the following message: send_enc!AEnc(Pk(Key), Data). The data is encrypted using the public key (Pk(Key)) of the destination service. Finally, service S is the intended recipient of both messages, but it lacks any security infrastructure and the only message that it understands is receive?Data.

Table 8.1: Service interfaces

Service      Interface action
Service Q    send_hash ! Data, Hash(Data, Secret)
Service R    send_enc ! AEnc(Pk(Key), Data)
Service S    receive ? Data

In this scenario, we need an adaptor P with a secure channel to S which intercepts the messages coming from Q and R, verifies their integrity and forwards them to S. Adaptor P provides service S with a unified integrity and destination-authenticity mechanism, as it understands both the MAC and the public-key protocols. Such an adaptor P satisfies Equation 8.1. However, we now want to enforce a property $p_{sec}$ such that "the data received by S is confidential outside $P \parallel S$". In this case, we obtain from the verification step an attacker X (violating Equation 8.3) that eavesdrops on channel send_hash, in which the data are sent as clear text. However, if $L$ forces the attacker to interact only through P to reach S, the adaptor can avoid the attack by forbidding the send_hash action and only allowing send_enc. Such an adaptor satisfies Equation 8.4.

In addition, SACs are flexible enough to cope with a different scenario where the data sent by Q and R must be the same before relaying them to the final destination, service S. This final case would be a multi-factor authentication mechanism (i.e., information must be authenticated in several ways at the same time) supported by adaptor P on behalf of S. This problem can also be tackled by security adaptors.

It is worth noting that security adaptors can be deployed either transparently or as another known participant in the communication, provided that the adaptor can intercept the communications that need adaptation.
If the adaptor has all the information required to impersonate a service (like the Secret and private Key in the previous example), then the deployment can be transparent between that service and the rest of the system. Otherwise, the services must be aware of the adaptor (which must possess its own credentials) and interact through it, handling trust and authentication accordingly. Security interceptors, used in Oracle BPEL Process Manager 10g, are good examples of transparent security adaptors. In this case, BPEL processes are defined without security, and security interceptors intercept outgoing and incoming messages, applying security to the former and, for the latter, verifying the security and extracting the encapsulated information (as in Fig. 7.4).

Compared to other kinds of contracts, SACs specify requirements over an adaptor that must be synthesised and deployed to support the secure conversation between WSs that were initially incompatible. As contracts, SACs are subject to negotiation, but this aspect is not covered in this work. In terms of monitoring, SACs represent the security policy that must be enforced on messages intercepted by the adaptor at run time. In the presence of security violations, the adaptor is in charge of taking the appropriate measures, such as interrupting every communication with the compromised service and notifying the other services in the orchestration.

Main contributions: We formalise how to synthesise deadlock-free adaptors satisfying Equation 8.1 and compliant with a given SAC (see Section 8.4). Then, in order to address scenarios such as Equation 8.3, we encode our synthesised adaptor into a Crypto-CCS process. Crypto-CCS [Mar03] is a variant of CCS [Mil89] which uses partial model checking and logical satisfiability techniques to verify properties such as Equation 8.4. In this chapter, we extend Crypto-CCS in such a way that, if an attacker exists, it returns the most general attacker for the given system, in the sense that any possible attack can be performed by that attacker (see Section 8.5). If such an attacker exists, then we proceed to complete our synthesis process and refine our initial adaptor by removing the last controllable adapted action which enabled the attack (see Section 8.6). In this way, we finally obtain a secure security adaptor which satisfies Equation 8.4. We finally conclude with Section 8.7.
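To make the motivational example of Section 8.1.3 more concrete, the following sketch shows how an adaptor could process the send_hash message of service Q before forwarding the data to S. It is only an illustration under stated assumptions: the function names `mac` and `adapt_send_hash` are hypothetical, and SHA-256 over the concatenated pair is an arbitrary stand-in for the abstract constructor Hash(Data, Secret).

```python
import hashlib
import hmac


def mac(data: bytes, secret: bytes) -> bytes:
    # Hash(Data, Secret): SHA-256 over the pair is an assumption of this sketch.
    return hashlib.sha256(data + b"|" + secret).digest()


def adapt_send_hash(message, secret):
    """Handle send_hash!Data, Hash(Data, Secret) on behalf of service S."""
    data, received_mac = message
    # Recompute the expected MAC and compare it with the received one.
    if hmac.compare_digest(received_mac, mac(data, secret)):
        return data   # forwarded to S as receive?Data
    return None       # integrity check failed: nothing is forwarded


secret = b"shared-secret"                      # hypothetical shared secret
genuine = (b"order-42", mac(b"order-42", secret))
tampered = (b"order-43", genuine[1])           # data changed, MAC reused
```

The sketch also makes the secrecy problem visible: the integrity check succeeds for `genuine`, but `data` travels in clear text inside the message, which is exactly what an eavesdropper on send_hash exploits in the example.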

8.2 Security adaptors

Security adaptors are the realisation of SACs for a given set of services. Security adaptors are equipped with an environment which is updated at run time to perform the exchange of messages, and their behaviour is customised to avoid deadlocks among the services while complying with the contract at the same time.

We continue to use the intensional semantics of SACs (i.e., the transition system $\hookrightarrow_c$ from Chapter 2) as a starting point for the synthesis of security adaptors. By substituting each contract term $T$ with its corresponding type (using substitution $\theta$) in every transition of the adaptor (via function $[T]_\theta$ in Definition 7.3.2), we obtain the adaptor interface. Given an adaptor $A = \langle \Sigma^a, S, s_0, F, \hookrightarrow_c \rangle$ compliant with a contract whose type substitution is $\theta$, its interface is given by $ia(A)$, defined as follows.

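\[
ia(A) \triangleq \langle \Sigma^i, S, s_0, F, T \rangle \tag{ia}
\]

where

\[
\Sigma^i = \{ [\alpha]_\theta \mid \alpha \in \Sigma^a \}
\quad\text{and}\quad
T = \{ s \xmapsto{[\alpha]_\theta} s' \mid s \overset{\alpha}{\hookrightarrow} s' \}
\]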

Because both the adaptor and its interface share the same set of states and both are deterministic, it is straightforward to obtain an equivalence relation between their states and transitions. Given an adaptor $A$ as above and an interface $I = \langle \Sigma^I, S^I, s_0^I, F^I, \mapsto \rangle$, we can define the inverse function of $ia$ as follows:

\[
ia^{-1}_A(I) \triangleq \langle \Sigma^a, S^I, s_0^I, F^I, T' \rangle \tag{$ia^{-1}_A$}
\]

where

\[
T' = \{ s \overset{\alpha}{\hookrightarrow} s' \mid s \overset{\alpha}{\hookrightarrow} s' \wedge s \xmapsto{[\alpha]_\theta} s' \}
\]
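The two mappings can be illustrated with a minimal sketch. It assumes a simplified encoding that is not part of the formalisation: contract terms are nested tuples whose leaves are parameter names, annotations on parameters are ignored, an action is a triple (channel, direction, term), and the names `type_of`, `ia` and `ia_inverse` are hypothetical.

```python
def type_of(term, theta):
    """[T]_theta: replace every symbolic parameter by its type."""
    if isinstance(term, str):
        return theta[term]
    constructor, *args = term
    return (constructor, *(type_of(arg, theta) for arg in args))


def ia(transitions, theta):
    """Interface transitions: same states, labels mapped through [.]_theta."""
    return {
        (s, (ch, d, type_of(t, theta)), s2)
        for s, (ch, d, t), s2 in transitions
    }


def ia_inverse(adaptor_transitions, interface_transitions, theta):
    """Keep the adaptor transitions whose typed image is in the interface."""
    return {
        (s, (ch, d, t), s2)
        for s, (ch, d, t), s2 in adaptor_transitions
        if (s, (ch, d, type_of(t, theta)), s2) in interface_transitions
    }


# Hypothetical adaptor with two branches from state s0.
theta = {"D": "Data", "K": "Key"}
adaptor = {
    ("s0", ("send_enc", "?", ("AEnc", "K", "D")), "s1"),
    ("s0", ("send_hash", "?", ("Pair", "D", ("Hash", "D"))), "s2"),
}
interface = ia(adaptor, theta)
# Removing a transition from the interface and mapping back removes the
# corresponding transition from the adaptor.
pruned = {tr for tr in interface if tr[1][0] != "send_hash"}
restored = ia_inverse(adaptor, pruned, theta)
```

In this encoding, mapping an unmodified interface back with `ia_inverse` returns the original adaptor transitions, while a pruned interface yields the correspondingly pruned adaptor.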

As we will discuss later, the synthesis process presented in this chapter depends on these two mappings. In fact, we need $ia$ to preserve the determinism of adaptors, keeping a tree structure on the generated interfaces.

Proposition 8.2.1 Given an adaptor $A$ compliant with a contract $c$, then $ia(A)$ is a deterministic interface. In addition, if the transition relation of $c$ presents a tree structure, then $ia(A)$ also presents a tree structure.

The complementary relationship between $ia$ and $ia^{-1}_A$ (as inverse mappings) is justified by the following result.

Proposition 8.2.2 Given an adaptor $A$ compliant with a contract whose type substitution is $\theta$ then

I ia(A) ⇒ ia^{-1}_A(I) A

We will use the previous results to preserve contract compliance during the refinement stage. We say that an adaptor $A$ is an adaptor for services $S$ if the parallel composition between the adaptor and the services does not present deadlocks or livelocks.


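(a) Adaptor [state-machine diagram]

(b) Adaptor Crypto-CCS process

\begin{align*}
A ={}& P + H \\
P ={}& 1{:}pk\_auth?x : AEnc(Pk(Key), Id, Req, Nonce).P' \\
H ={}& 1{:}hash?q : (Id, Req, Nonce, Hash(Id, Req, Nonce, Secret)).[\langle\langle q \rangle\rangle \vdash_{4_1} i : Id] \\
& [\langle\langle q \rangle\rangle \vdash_{4_2} r : Req][\langle\langle q \rangle\rangle \vdash_{4_3} na : Nonce].H_e.[q = q']H' \parallel 3{:}exit!.0 \\
H_e ={}& [\langle\langle i, r, na, s \rangle\rangle \vdash_3 hsc : (Id, Req, Nonce, Secret)] \\
& [\langle\langle hsc \rangle\rangle \vdash_1 hs : Hash(Id, Req, Nonce, Secret)] \\
& [\langle\langle i, r, na, hs \rangle\rangle \vdash_3 q' : (Id, Req, Nonce, Hash(Id, Req, Nonce, Secret))] \\
H' ={}& [\langle\langle i, na, r \rangle\rangle \vdash_3 ec : (Id, Nonce, Req)] \\
& [\langle\langle Pk(kb), ec \rangle\rangle \vdash_2 en : Enc(Pk(Key), Id, Nonce, Req)]\,2{:}request!en.(R + D) \\
D ={}& 2{:}no\_access?na' : Nonce\,[na' = na][\langle\langle na, s \rangle\rangle \vdash_3 k : (Nonce, Secret)] \\
& [\langle\langle k, n \rangle\rangle \vdash_5 y : Enc((Nonce, Secret), Nonce)]\,1{:}denied!y.0 \\
R ={}& 2{:}reply?z : (Nonce, Data)[\langle\langle z \rangle\rangle \vdash_{4_2} d : Data] \\
& [\langle\langle na, d \rangle\rangle \vdash_3 z' : (Nonce, Data)][z' = z][\langle\langle na, s \rangle\rangle \vdash_3 k' : (Nonce, Secret)] \\
& [\langle\langle k', d \rangle\rangle \vdash_5 e : Enc((Nonce, Secret), Data)]\,1{:}reply!e.0 \\
P' ={}& \dots
\end{align*}

Figure 8.3: Adaptor for services 1, 2 and 3, compliant with contract c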

Example 8.2.1 The adaptor in Fig. 8.3 is compliant with the SAC c in Fig. 7.7 and successfully adapts the three services in Fig. 7.5 independently of their internal choices, supporting both authentication schemas. The actions are prefixed with (and formally renamed with) the identifier of the communicating service. Because contract vectors ($\in \Sigma^c$) contain adaptor actions (i.e., $\Sigma^c \subseteq (\Sigma^a \times \Sigma^a) \cup \Sigma^a$), we have represented adaptor actions with the direction and vector that contains them in the contract. For instance, $?v_{rep}$ represents the input action of vector $v_{rep}$, i.e., $2{:}reply?NA^{\wedge}, D$. Several transitions labelled $!v_{ext}$ are dashed because they represent the same interactions as solid transitions but with occurrences of $!v_{ext}$ at different points in the trace. The reason for this is that, once service 1 decides to go through the hash authentication schema, we can finish the session with service 3 using vector $v_{ext}$ at any time, hence the interleaving.

8.3 From security adaptors to Crypto-CCS

We now proceed to give some definitions needed to encode adaptors into Crypto-CCS processes. The inference rules of IS needed to compose and decompose the value of a contract term are unambiguously given by the contract term. This assumption boils down to two functions dependent on IS: $ev_{IS}$ and $out_{IS}$. Function $ev_{IS}$ is in charge of replacing every symbolic parameter using substitution $\sigma$ and evaluating the resulting expression. In this chapter, substitutions $\sigma$, $\rho$ and $\kappa$ replace parameters with the Crypto-CCS variables which represent their values, so, in short, function $ev_{IS}$ is used to evaluate and compose Crypto-CCS messages.

For IS in Table 7.2 it is necessary to define the following $ev_{IS}$ function.

\[
ev_{IS}(T, \sigma, \theta) \triangleq
\begin{cases}
(0,\ \sigma(P)) & \text{if } T = P \\[4pt]
(Q[\langle\langle y \rangle\rangle \vdash_1 \upsilon x : [T]_\theta],\ x) & \text{if } T = Hash(T_1),\ (Q, y) = ev_{IS}(T_1, \sigma, \theta) \\[4pt]
(Q_1 \dots Q_n [\langle\langle y_1 \dots y_n \rangle\rangle \vdash_3 \upsilon x : [T]_\theta],\ x) & \text{if } T = (T_1 \dots T_n),\ (Q_i, y_i) = ev_{IS}(T_i, \sigma, \theta),\ i \in [1, n] \\[4pt]
(Q_1.Q_2 [\langle\langle y_1, y_2 \rangle\rangle \vdash_5 \upsilon x : [T]_\theta],\ x) & \text{if } T = Enc(T_1, T_2),\ (Q_1, y_1) = ev_{IS}(T_1, \sigma, \theta),\ (Q_2, y_2) = ev_{IS}(T_2, \sigma, \theta) \\[4pt]
(Q_1.Q_2 [\langle\langle y_1, y_2 \rangle\rangle \vdash_2 \upsilon x : [T]_\theta],\ x) & \text{if } T = AEnc^{c}(T_1, T_2),\ (Q_1, y_1) = ev_{IS}(T_1, \sigma, \theta),\ (Q_2, y_2) = ev_{IS}(T_2, \sigma, \theta)
\end{cases}
\]

We denote by $\upsilon x$ a new variable $x$ not used in the rest of the process.

Example 8.3.1 The following expression generates the Crypto-CCS process required to sign a piece of information:

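\begin{align*}
T &= D^{\wedge}, T' & T' &= AEnc^{c}(K^{\wedge}, Hash(D^{\wedge})) \\
\sigma &= \{d/D, k/K\} & \theta &= \{Data/D, AKey/K\} \\
[T]_\theta &= Data, [T']_\theta & [T']_\theta &= AEnc(AKey, Hash(Data))
\end{align*}

\begin{align*}
ev_{IS}(D^{\wedge}, \sigma, \theta) ={}& (0,\ d) \\
ev_{IS}(Hash(D^{\wedge}), \sigma, \theta) ={}& ([\langle\langle d \rangle\rangle \vdash_1 h : Hash(Data)],\ h) \\
ev_{IS}(T', \sigma, \theta) ={}& ([\langle\langle d \rangle\rangle \vdash_1 h : Hash(Data)] \\
& [\langle\langle k, h \rangle\rangle \vdash_2 e : [T']_\theta],\ e) \\
ev_{IS}(T, \sigma, \theta) ={}& ([\langle\langle d \rangle\rangle \vdash_1 h : Hash(Data)] \\
& [\langle\langle k, h \rangle\rangle \vdash_2 e : [T']_\theta] \\
& [\langle\langle d, e \rangle\rangle \vdash_3 m : [T]_\theta],\ m)
\end{align*}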
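The recursive structure of this evaluation can be mirrored in a short sketch. It is not the thesis implementation: it assumes that contract terms are encoded as nested tuples whose leaves are parameter names, that annotations are ignored, and that each inference step is recorded as a tuple (premises, rule, variable, type). The names `ev`, `type_of` and `RULES` are hypothetical; only the rule indices 1, 2, 3 and 5 are taken from the definition of the evaluation function above.

```python
import itertools

# Rule index used for each constructor, following the definition of ev_IS.
RULES = {"Hash": 1, "AEnc": 2, "Tuple": 3, "Enc": 5}


def type_of(term, theta):
    """[T]_theta: replace every symbolic parameter by its type."""
    if isinstance(term, str):
        return theta[term]
    constructor, *args = term
    return (constructor, *(type_of(a, theta) for a in args))


def ev(term, sigma, theta, fresh=None):
    """Return (steps, variable); the steps compose the value of `term`."""
    if fresh is None:
        fresh = (f"x{i}" for i in itertools.count(1))
    if isinstance(term, str):            # T = P: no inference is needed
        return [], sigma[term]
    constructor, *args = term
    steps, premises = [], []
    for arg in args:                     # evaluate the sub-terms first
        sub_steps, var = ev(arg, sigma, theta, fresh)
        steps += sub_steps
        premises.append(var)
    x = next(fresh)                      # plays the role of the new variable
    steps.append((tuple(premises), RULES[constructor], x, type_of(term, theta)))
    return steps, x


# Example 8.3.1 in this encoding: T = (D, T') with T' = AEnc(K, Hash(D)).
t_prime = ("AEnc", "K", ("Hash", "D"))
t = ("Tuple", "D", t_prime)
sigma = {"D": "d", "K": "k"}
theta = {"D": "Data", "K": "AKey"}
steps, result = ev(t, sigma, theta)
```

In this encoding the three recorded steps use rules 1, 2 and 3 in that order, matching the three inference constructs of the example, with the generated names `x1`, `x2` and `x3` in place of `h`, `e` and `m`.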

Function $out_{IS}$ is dependent on the inference system used. It plays a complementary role to $ev_{IS}$ in the sense that it obtains what is inside a constructor (the cleartext of some encrypted message or the elements of a list, for instance) as opposed to evaluating a structured contract term. Given a symbolic parameter $P$, a composite contract term $T$, a variable $x$ which contains the value corresponding to $T$ and substitutions $\sigma$ and $\theta$, $out_{IS}$ returns the process able to obtain the value of what is contained within $T$, the variable that will be replaced with that value, and the contract term corresponding to the content. For IS in Table 7.2, function $out_{IS}$ is defined as

\[
out_{IS}(P, T, x, \sigma, \theta) \triangleq
\begin{cases}
([\langle\langle x \rangle\rangle \vdash_{4_i} \upsilon y : [T_i]_\theta],\ T_i,\ y) & \text{if } T = (\dots T_j \dots T_i \dots),\ reachable(P, T_i),\ \neg reachable(P, T_j) \\[4pt]
(Q.[\langle\langle z, x \rangle\rangle \vdash_6 \upsilon y : [T_2]_\theta],\ T_2,\ y) & \text{if } T = Enc(T_1, T_2),\ reachable(P, T_2),\ (Q, z) = ev_{IS}(T_1, \sigma, \theta) \\[4pt]
(Q.[\langle\langle z, x \rangle\rangle \vdash_7 \upsilon y : [T_2]_\theta],\ T_2,\ y) & \text{if } T = AEnc^{d}(T_1, T_2),\ [T_1]_\theta = Pk(T),\ reachable(P, T_2),\ (Q, z) = ev_{IS}(T_1, \sigma, \theta) \\[4pt]
(Q.[\langle\langle z, x \rangle\rangle \vdash_8 \upsilon y : [T_2]_\theta],\ T_2,\ y) & \text{if } T = AEnc^{d}(T_1, T_2),\ [T_1]_\theta \neq Pk(T),\ reachable(P, T_2),\ (Q, z) = ev_{IS}(T_1, \sigma, \theta) \\[4pt]
\bot & \text{otherwise}
\end{cases}
\]

Function $reachable$ (see Definition 7.3.4) is true iff the value of the given parameter can be obtained from the given contract term.

Example 8.3.2 For instance, if we have in variable $x$ the message corresponding to a contract term $T = Enc(K^{\wedge}, D)$ where $\sigma = \{k/K\}$ and $\theta = \{Key/K, Data/D\}$, then we have that:

\[
out_{IS}(D, T, x, \sigma, \theta) = ([\langle\langle k, x \rangle\rangle \vdash_6 d : Data],\ D,\ d)
\]

Given the value $x$ of a contract term $T$ and the value of its known parameters (in $\kappa$), the value of all its fresh parameters can be obtained in a substitution $\rho_n$ using the Crypto-CCS process $Q_1 \dots Q_n$ returned by function $gets$, defined as follows:

\[
gets((P_1, \dots, P_n), T, x, \rho_0, \kappa, \theta) \triangleq (Q_1 \dots Q_n,\ \rho_n) \tag{gets}
\]

where $(Q_i, \rho_i) = get(P_i, T, x, \rho_{i-1}, \kappa, \theta)$. Function $get$ is defined as follows.

\[
get(P, T, x, \rho, \kappa, \theta) \triangleq
\begin{cases}
(0,\ \rho\{x/P\}) & \text{if } T = P \\[4pt]
(Q.R,\ \rho') & \text{if } T = F(T_1, \dots, T_{ar(F)}),\ (Q, T', y) = out_{IS}(P, T, x, \rho^{f}\kappa^{\wedge}, \theta),\ (R, \rho') = get(P, T', y, \rho, \kappa, \theta) \\[4pt]
\bot & \text{otherwise}
\end{cases}
\]

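Example 8.3.3 We can obtain the content of the parameter $D$ from the contract term $T = K, Enc(K, D)$, whose actual value is referred to with variable $x$, in the following way:

\begin{align*}
gets((K, D), T, x, \varepsilon, \varepsilon, \theta) ={}& ([\langle\langle x \rangle\rangle \vdash_{4_1} k : Key] \\
& [\langle\langle x \rangle\rangle \vdash_{4_2} ed : Enc(Key, Data)] \\
& [\langle\langle k, ed \rangle\rangle \vdash_6 d : Data],\ \{k/K, d/D\})
\end{align*}

The empty substitution is denoted by $\varepsilon$.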
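The extraction of fresh parameters can be sketched in the same style. This is a simplified illustration, not the definition itself: terms are nested tuples, only tuples (rules $4_i$) and symmetric encryption (rule 6) are handled, decryption keys are assumed to be plain parameters already present in the substitution, and the known parameters of $\kappa$ are assumed to be merged into the initial substitution. All function names are hypothetical.

```python
import itertools


def type_of(term, theta):
    """[T]_theta: replace every symbolic parameter by its type."""
    if isinstance(term, str):
        return theta[term]
    constructor, *args = term
    return (constructor, *(type_of(a, theta) for a in args))


def reachable(param, term):
    """True iff `param` occurs somewhere inside `term`."""
    if isinstance(term, str):
        return term == param
    return any(reachable(param, arg) for arg in term[1:])


def get(param, term, x, rho, theta, fresh):
    """Steps extracting `param` from the value x of `term`, and the new rho."""
    if term == param:
        return [], {**rho, param: x}
    if isinstance(term, str):
        return None                                # bottom: wrong parameter
    constructor, *args = term
    inner, step = None, None
    if constructor == "Tuple":
        for i, arg in enumerate(args, start=1):    # first component with param
            if reachable(param, arg):
                inner, y = arg, next(fresh)
                step = ((x,), f"4_{i}", y, type_of(arg, theta))
                break
    elif (constructor == "Enc" and isinstance(args[0], str)
          and args[0] in rho and reachable(param, args[1])):
        inner, y = args[1], next(fresh)            # decrypt with the known key
        step = ((rho[args[0]], x), 6, y, type_of(inner, theta))
    if step is None:
        return None
    rest = get(param, inner, y, rho, theta, fresh)
    return None if rest is None else ([step] + rest[0], rest[1])


def gets(params, term, x, rho, theta):
    """Extract every parameter in `params`, threading the substitution."""
    fresh = (f"y{i}" for i in itertools.count(1))
    steps = []
    for param in params:
        result = get(param, term, x, rho, theta, fresh)
        if result is None:
            return None
        steps, rho = steps + result[0], result[1]
    return steps, rho


# Example 8.3.3 in this encoding: T = (K, Enc(K, D)).
theta = {"K": "Key", "D": "Data"}
steps, rho = gets(("K", "D"), ("Tuple", "K", ("Enc", "K", "D")), "x", {}, theta)
```

The key `K` is obtained first and is therefore available in the substitution when the encrypted component is opened, which reproduces the order of the three steps in Example 8.3.3.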

The security check expressed in contract terms can be performed using Crypto-CCS. Contract terms of input actions have known parameters to represent that whatever is received in that action should partially match what is contained in those parameters. If we have a variable $x$ which is bound to the message received through contract term $T$, then, using function $get$, we can obtain the value of the fresh symbolic parameters in $T$. Then, we can evaluate $T$ (through function $ev_{IS}$) using the known expected parameters and filling the gaps with the received fresh parameters. With this evaluation we obtain the expected value $x'$. Finally, we only have to compare the expected value against the actual received value using Crypto-CCS guards, i.e., $[x = x']$.

Example 8.3.4 For instance, in order to verify the integrity of some received data (parameter $D$) using its hash value and, in addition, check that we are receiving a previously known nonce, we proceed as follows. Let

\[
T = D, Hash(D), N^{\wedge}; \qquad \theta = \{Data/D, Nonce/N\}; \qquad \kappa = \{n/N\}
\]

then the Crypto-CCS process $P$ able to check the received data $x$ is

\begin{align*}
P ={}& [\langle\langle x \rangle\rangle \vdash_{4_1} d : Data]\,[\langle\langle d \rangle\rangle \vdash_1 hd : Hash(Data)] \\
& [\langle\langle d, hd, n \rangle\rangle \vdash_3 x' : Data, Hash(Data), Nonce]\,[x = x']
\end{align*}

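The generation of such a process from a given security adaptor is covered by operator $[\![\cdot]\!]$, described below.

Security adaptors for sequential agents can be encoded into Crypto-CCS processes. Using the previous definitions, the Crypto-CCS process corresponding to a security adaptor $\langle \Sigma, S, s_0, F, \hookrightarrow \rangle$ compliant with a contract whose environment is $\langle \theta, \kappa \rangle$ is given by $[\![s_0]\!]_{\theta,\kappa}$, where $[\![\cdot]\!]_{\theta,\kappa}$ is defined as follows.

• $[\![s]\!]_{\theta,\kappa} \triangleq 0$ if $s \in F$.

• $[\![s]\!]_{\theta,\kappa} \triangleq \sum_{P \in B} P$ where $B = \{ [\![s \overset{\alpha}{\hookrightarrow} s']\!]_{\theta,\kappa} \mid s \overset{\alpha}{\hookrightarrow} s' \}$ if $s \notin F$.

• $[\![s \overset{c!T}{\hookrightarrow} s']\!]_{\theta,\kappa} \triangleq P.c!x.Q$ where
  – $(P, x) = ev_{IS}(T, \kappa^{\wedge}, \theta)$.
  – $Q = [\![s']\!]_{\theta,\kappa}$.

• $[\![s \overset{c?T}{\hookrightarrow} s']\!]_{\theta,\kappa} \triangleq c?\upsilon x : [T]_\theta.P.Q.[x = y].R$ where
  – $(P, \rho) = gets(pm^{f}(T), T, x, \varepsilon, \kappa, \theta)$.
  – $(Q, y) = ev_{IS}(T, \rho^{f}\kappa^{\wedge}, \theta)$.
  – $R = [\![s']\!]_{\theta,\kappa\rho}$.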

Example 8.3.5 Figure 8.3(b) shows part of the Crypto-CCS process corresponding to the adaptor for the running example. The branch corresponding to the public key authentication has been omitted for the sake of clarity. All the expressions needed to perform the security checks in the adaptor are highlighted in red and bold.


8.4 Synthesis of functionally-correct security adaptors

Cryptography is an orthogonal aspect to functionality. If there were no security attacks, malicious users and distrust, there would be no need for cryptography. For this reason, we approach the synthesis process by abstracting away the security or, in other words, we forget about the cryptographic inference system, security checks and symbolic parameters presented so far and focus only on the interfaces of the services and the adaptor. We can do that to a certain level thanks to Theorem 7.2.1, which states that two interfaces synchronise if their corresponding Crypto-CCS processes also synchronise.

In this context, functional correctness means contract compliance and absence of deadlocks. However, deadlock freedom cannot always be preserved in the presence of active attackers. For example, an attacker could interrupt any further communication, therefore leaving the system in a deadlock state. Because of this, our goal for functional correctness is to preserve deadlock freedom in the absence of active attackers. In this way, we might synthesise an insecure functionally-correct adaptor but, in Section 8.6, we will refine such an adaptor into a secure one.

There are several approaches [CPS08, MPS08, BCP04, CMS+09a, Pad09] in the literature to the synthesis of behavioural adaptors with similar properties of functional correctness. However, these approaches do not support cryptographic messages which must be cryptographically processed on receptions and emissions. Although these related papers could be extended to manipulate cryptographic messages, they would still synthesise functionally-correct yet insecure adaptors. This means that the synthesised adaptors succeed in exchanging messages while avoiding deadlocks, but they might do so in a way that sensitive information remains insecure and therefore the adapted system is vulnerable to secrecy attacks.
As an alternative to the solution proposed below, it is possible to use these related approaches by first transforming the problem into an equivalent one without security, synthesising the functionally-correct adaptor using traditional approaches, transforming the result back into an adaptor with security, and then proceeding to the verification and refinement stages described in this chapter. The ITACA toolbox supports this alternative approach and it allows the visualisation, simulation and analysis of behavioural adaptation contracts. For further details on this alternative procedure, please consult Appendix C.

Inspired by this previous work, we now proceed to illustrate how to synthesise functionally-correct adaptors able to manipulate cryptographic messages (whose parts might be encrypted or digested, for instance). For now, we will focus on contract compliance and deadlock freedom, therefore the adaptor synthesised at the end of this section is still insecure with regard to the secrecy property to preserve. In Section 8.6, we complete this process by taking advantage of the verification mechanism presented in Section 8.5, so that we can reuse the functions formalised in this section to refine functionally-correct adaptors into secured ones.

Synchronisations between services and adaptors depend on three conditions:

Signature matching. Actions with complementary direction but identical types occurring on the same channel. This can be reduced to comparing transitions in the adaptor interface with those in the service interface and checking that they are complementary, i.e., same channel, same type and different direction.

Contract compliance. The contract allows synchronisation. In other words, each synchronisation between a service and adaptor $A$ corresponds to one of the transitions of the intensional semantics of the contract, $A_c$. This condition can be understood as a control dependency.

Parameter dependencies. The adaptor has the information required to perform the synchronisation. These are the implicit dependencies among the symbolic parameters in contract terms. An example of this kind of dependency is that, if a certain parameter is annotated to be known in a contract term (i.e., the value of that parameter is needed to process that contract term), the value corresponding to that parameter should be present in the initial environment or must have been received in advance in a previous transition. The value of a parameter is obtained on input actions where the parameter is annotated to be fresh, or in an output action when the parameter is annotated to be instantiated. In the latter case the value is generated by the adaptor. These are data dependencies.

8.4.1 Data dependencies

We can avoid dealing with parameter dependencies as these can be made explicit in the contract. In order to do that, we refine any given contract into an equivalent one with the dependencies included using rule DEP.


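\[
\frac{s \xrightarrow{v}_{c} s' \qquad pm^{\wedge}(v) \subseteq K}
     {\langle s, K \rangle \xrightarrow{v}_{c^E} \langle s', K \cup pm^{f}(v) \cup pm^{*}(v) \rangle}
\tag{DEP}
\]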

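A contract $c = \langle \Sigma^c, S, s_0, F, \rightarrow_c, E \rangle$, where $E = \langle \theta, \kappa \rangle$, is transformed by rule DEP into another contract $c^E$ with explicit data dependencies.

\[
c^E = \langle \Sigma^c, S', s'_0, F', \rightarrow_{c^E}, E \rangle
\]

where

\[
S' = S \times 2^{Param}; \qquad s'_0 = \langle s_0, Dom(\kappa) \rangle; \qquad F' = F \times 2^{Param}
\]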

In every reachable state of $c^E$, it is guaranteed that the adaptor knows the runtime values needed for the following transitions. We have overloaded function $pm$ to be applicable to vectors, e.g., $pm^{\wedge}(c?T \,\Diamond\, c'!T') = pm^{\wedge}(T) \cup pm^{\wedge}(T')$.

Lemma 8.4.1 The contract transformation given by rule DEP satisfies that:

• Every adaptor compliant with $c^E$ is also compliant with $c$.

• For every trace of $c^E$ such as $s_0 \xrightarrow{v_0}_{c^E} \dots\ s_i \xrightarrow{v_i}_{c^E} s_{i+1} \xrightarrow{v_{i+1}}_{c^E}$ it holds that

\[
pm^{\wedge}(v_{i+1}) \subseteq Dom(\kappa^{E}) \cup \bigcup_{j=0}^{i} pm^{f}(v_j) \cup pm^{*}(v_j)
\]

Lemma 8.4.1 intuitively means that, at every state, the security adaptor is aware of the content of all the symbolic parameters needed to proceed with the following synchronisations.

Example 8.4.1 For instance, given the contract

\[
c_8 = \langle \Sigma^c_8, \{s_0\}, s_0, \{s_0\}, \{ s_0 \xrightarrow{v_1}_{c} s_0,\ s_0 \xrightarrow{v_2}_{c} s_0 \}, E \rangle
\]

where

\[
\Sigma^c_8 = \{ v_1 = receive!D \,\Diamond,\ v_2 = \Diamond\, send?D^{\wedge} \} \qquad E = \langle \{Data/D\}, \varepsilon \rangle
\]

then we obtain the following contract with explicit data dependencies

\[
c^E_8 = \langle \Sigma^c_8, \{s_1, s_2\}, s_1, \{s_2\}, \{ s_1 \xrightarrow{v_1}_{c^E} s_2,\ s_2 \xrightarrow{v_2}_{c^E} s_2,\ s_2 \xrightarrow{v_1}_{c^E} s_2 \}, E \rangle
\]

where $s_1 = \langle s_0, \emptyset \rangle$ and $s_2 = \langle s_0, \{D\} \rangle$.
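The unfolding performed by rule DEP can be sketched as a reachability computation over pairs of a contract state and a set of known parameters. The encoding below is an assumption made for illustration only: every transition carries explicitly the parameters it needs (the role of $pm^{\wedge}(v)$) and the parameters it provides (the role of $pm^{f}(v) \cup pm^{*}(v)$), and the name `make_explicit` is hypothetical.

```python
from collections import deque


def make_explicit(transitions, initial, known):
    """Unfold a contract into states (s, K) following rule DEP.

    transitions -- iterable of (s, vector, needs, provides, s2)
    initial     -- initial contract state
    known       -- parameters known initially, i.e. Dom(kappa)
    """
    start = (initial, frozenset(known))
    seen, queue, result = {start}, deque([start]), set()
    while queue:
        s, K = queue.popleft()
        for src, v, needs, provides, dst in transitions:
            # Premise of DEP: the parameters needed by v are already known.
            if src == s and set(needs) <= K:
                target = (dst, K | frozenset(provides))
                result.add(((s, K), v, target))
                if target not in seen:
                    seen.add(target)
                    queue.append(target)
    return start, result


# Example 8.4.1 in this encoding: v1 provides D, v2 needs D.
transitions = [
    ("s0", "v1", (), ("D",), "s0"),
    ("s0", "v2", ("D",), (), "s0"),
]
start, explicit = make_explicit(transitions, "s0", ())
s1 = ("s0", frozenset())
s2 = ("s0", frozenset({"D"}))
```

Only reachable pairs are generated, so `explicit` contains the three transitions of Example 8.4.1: `v1` from `s1` to `s2`, and both `v1` and `v2` looping on `s2`. The vector `v2` is not enabled in `s1` because `D` is not yet known there.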


8.4.2 Control dependencies

Now, we have to deal with contract compliance and interface matching. Assuming that the original contract was $c_0$, we obtained $c = c_0^E$ with explicit parameter dependencies. We are now going to work on a deterministic $A_c$, which is the state machine that characterises contract $c$. We start by using $A_0 = A_c$ as an initial approximation to the adaptor. However, $A_0$ might present deadlocks when it orchestrates the services, so we perform a deadlock analysis on its adaptor interface and prune the controllable branches that cause those deadlocks. This pruning process only works if $ia(A_c)$ is a tree. This, however, can easily be imposed explicitly by the contract or implicitly by modifying rule DEP so that every generated contract state is a unique state.

Given the service interface $S$ with an initial state $s_0^S$, the deadlock analysis proceeds as follows. We define an iterative synthesis process over a candidate adaptor interface $I_i = \langle \Sigma, S, s_0^i, F, T_i \rangle$ by selecting a transition $t = s \xmapsto{\alpha} s' \in T_i$ such that:

\[
s_0^S \otimes s_0 \xmapsto{\tau^{*}} s_1 \otimes s \xmapsto{\tau^{+}} s'_1 \otimes s'
\]

for some $s'_1$ where $s'_1 \otimes s'$ is a deadlock state and there exists a trace $s_1 \xmapsto{\bar{\alpha}} \xmapsto{\tau^{*}} s'_1$. Then we consider a new adaptor interface given by:

\[
I_{i+1} = prune(S, I_i, t)
\]

where function $prune$ (defined below) removes the given transition $t$ from $I_i$ without creating new deadlocks. The initial candidate is $I_0 = ia(A_0)$ and we iterate while a transition $t$ exists satisfying the above mentioned conditions. It is not possible to synthesise the adaptor if any of the $prune(S, I_i, t)$ is undefined or if $s_0^S \otimes s_0^n$ still presents deadlocks at the end of this process; in either case we return an empty adaptor. Otherwise, $A = ia^{-1}_{A_0}(I_n)$ is an adaptor for $S$. The reason to check deadlock absence again at the end of the process is that, in some cases, deadlock situations are inherent to service interface $S$ and, no matter how many times we prune the adaptor interface $I$, the interface $S$ might still reach a deadlock. An example is a service interface $S$ with no final states.

Function $prune$ must remove a transition leading to the given transition in a way that does not cause any more deadlocks in the process. It must be a deadlock-free pruning. Because of this, we have to check that the transition to remove has a sibling and that it is a controllable decision, i.e., the adaptor can control which of those branches is followed by the services.

This controllability check is done by function $prunable$. Function $prune$ is formally defined over a tree-like state machine $I$ as follows.

Definition 8.4.1 (Deadlock-free pruning) Given an interface $S$ with initial state $s_0^S$, an interface $I = \langle \Sigma, S, s_0, F, T \rangle$, and a transition $t \in T$, function $prune(S, I, t)$ defines a new interface $\langle \Sigma, S, s_0, F, T' \rangle$ where the new transition relation is given by:

\[
T' = \{ u \in T \mid \exists \tilde{u} \in O[I] .\ u \in \tilde{u} \wedge t' \notin \tilde{u} \}
\]

if there exists $t' \in P = prunable(S, I)$ such that there exists a trace in $O[I]$ where $t'$ precedes $t$, and for each of these traces $\cdots t' \cdot \tilde{t} \cdot t \cdots \in O[I]$, $\tilde{t} \cap P = \emptyset$. If no such transition $t'$ exists, we consider that $prune(S, I, t)$ is undefined (represented as $\bot$). Function $prunable$ is defined as:

\[
prunable(S, I) \triangleq \{ s \xmapsto{\alpha} s' \in T \mid \forall u .\ s_0^S \otimes s_0 \xmapsto{\tau^{*}} u \otimes s,\ \exists \beta \neq \alpha .\ s \xmapsto{\beta} \wedge\ u \xmapsto{\tau^{*}} \xmapsto{\bar{\beta}} \text{ and } \forall v .\ u \xmapsto{\tau^{+}} v,\ v \not\xmapsto{\bar{\alpha}} \}
\]

Function $prunable$ returns the set of transitions that can be removed from the adaptor without generating new deadlocks. Prunable transitions must have a sibling transition ($s \xmapsto{\beta}$) so that the execution can continue through an alternative branch and, in addition, it must be a controllable choice of the adaptor, i.e., none of the services can internally require the prunable action at the parent state ($v \not\xmapsto{\bar{\alpha}}$).

It is worth noting that mapping $prune$ is well defined because, if $t'$ exists, it is clearly unique. In addition, the proposed iterative process, which builds a sequence of interfaces $\{I_i\}_{i=0 \dots n}$, is also well defined in spite of the apparent non-determinism exhibited by the selection of the transition $t_i$ in each step. In fact, the process is independent of the pruning order. This is what the following proposition states.

Proposition 8.4.2 Function $prune$ is independent of the pruning order. More formally, given two interfaces $S$ and $I$, and two transitions $t_1$ and $t_2$ in $I$, we have:

\[
prune(S, prune(S, I, t_1), t_2) = prune(S, prune(S, I, t_2), t_1)
\]

This proposition allows us to univocally define the adaptor $c[S]$ (possibly empty) generated by this pruning process as follows:

\[
c[S] \triangleq
\begin{cases}
\langle \emptyset, \emptyset, \bot, \emptyset, \emptyset \rangle & \text{if } I_n = \bot \\
ia^{-1}_{A_c}(I_n) & \text{otherwise}
\end{cases}
\]

where $I_n$ is the last interface produced by the iterative process and $I_0 = ia(A_c)$. In order to demonstrate that $prune$ behaves as expected, we first prove that a pruned adaptor is still an adaptor, and then the main result of this chapter proves that the synthesis process returns adaptors for the given services, if it converges to a non-empty adaptor.

Lemma 8.4.3 For any service interface $S$ and any transition $t$ of an adaptor $A$, if $prune(S, ia(A), t) \neq \bot$ then

ia^{-1}_A(prune(S, ia(A), t)) A

and $prune(S, ia(A), t)$ is deterministic.

Now, we can prove that, given a contract, the iterative pruning process for a certain interface (representing services) returns either an empty adaptor or an adaptor for those services, compliant with the contract.

Theorem 8.4.4 Given a contract $c$, the iterative pruning process for a certain interface $S$, providing the sequence of interfaces $\{I_i\}_{i=0 \dots n}$ (with $I_0 = ia(A_c)$), satisfies that if $I_n \neq \bot$ then $ia^{-1}_{A_c}(I_n)$ is an adaptor for services $S$ compliant with contract $c$.
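As an illustration of the deadlock analysis that drives the iterative synthesis of Section 8.4.2, the following sketch explores the synchronous product of one service interface and one adaptor interface and reports the reachable states that are stuck without being final. It is a strong simplification made for illustration: services have no internal actions, there is a single service, actions are strings formed by a channel name followed by `!` or `?`, and the names `complementary` and `deadlocks` are hypothetical. It does not implement $prune$ or $prunable$.

```python
def complementary(a, b):
    """c! synchronises with c? on the same channel."""
    return a[:-1] == b[:-1] and {a[-1], b[-1]} == {"!", "?"}


def deadlocks(service, adaptor, init, finals):
    """Reachable product states that are stuck and not final.

    service, adaptor -- map a state to a list of (action, next_state)
    init             -- initial product state (service_state, adaptor_state)
    finals           -- set of final product states
    """
    seen, stack, stuck = {init}, [init], set()
    while stack:
        s, a = stack.pop()
        successors = [
            (s2, a2)
            for act_s, s2 in service.get(s, [])
            for act_a, a2 in adaptor.get(a, [])
            if complementary(act_s, act_a)
        ]
        if not successors and (s, a) not in finals:
            stuck.add((s, a))
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return stuck


# Hypothetical interfaces: the first adaptor offers c! where b! is expected.
service = {"q0": [("a!", "q1")], "q1": [("b?", "q2")]}
bad_adaptor = {"p0": [("a?", "p1")], "p1": [("c!", "p2")]}
good_adaptor = {"p0": [("a?", "p1")], "p1": [("b!", "p2")]}
finals = {("q2", "p2")}
stuck_bad = deadlocks(service, bad_adaptor, ("q0", "p0"), finals)
stuck_good = deadlocks(service, good_adaptor, ("q0", "p0"), finals)
```

In the iterative process described above, each reported deadlock would be traced back to a controllable transition of the adaptor interface, which is then removed by $prune$; the sketch only covers the detection step.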

8.5 Verification

We synthesised a functionally-correct adaptor compliant with a contract in Section 8.4. However, the contract was conceived to support and mediate between the security protocols of the services, and these protocols might preserve different security properties. In this section, we analyse the security implications of including an adaptor in the system and we specifically verify that the adapted system globally preserves a given secrecy property, even in the presence of attackers.


8.5.1 The attacker

Recalling the discussion in Section 8.1, if we consider that the attacker controls the network, the attacker could completely bypass the adaptor and directly communicate with the services, isolating the adaptor from the system.

\[
\nexists X \ \text{s.t.}\ P \parallel S \parallel X \models p_{sec} \tag{8.2}
\]

In this case, the adaptor process $P$ is in no position to enforce any security because the attacker $X$ can intercept every communication with service $S$. Therefore, the only kind of security property that the adaptor can control is that it does not make things worse. For example, we can guarantee that the adaptor does not disclose sensitive information to an unauthorised party. Alternatively, if we can assume that certain channels cannot be actively manipulated (with $\backslash L$) or eavesdropped on (with $\backslash\backslash L'$) by the attacker, the adaptor can be used as a gateway or a firewall to sensitive services. In this scenario, we might want to verify security properties regarding the resilience of the interface offered by the adaptor to the attacker. These are external attackers.

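$$\nexists X \ \text{s.t.}\ ((P \parallel S) \backslash\backslash L' \backslash L \parallel X_{\phi}) \backslash L'' \models p_{sec} \qquad (8.4)$$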

The reason for the restriction on $L''$ is that we are only concerned about the possible synchronisations between the different parties, and this is achieved by including every channel of the system in $L''$. In between the previous two kinds of attacks, and also covered by Equation 8.4, we have insider attacks. These are attacks where the intruder has certain privileged knowledge ($\phi$) and some of the services in the system trust it.

Crypto-CCS (Section 7.2.3) gives formal operational semantics which allow us to simulate and verify security-enabled services. We need to i) be able to model the knowledge of the different parties (so that we can model insider attacks), ii) reason on the communications between services and what can be inferred from them using our cryptographic inference system, and iii) specify restrictions on the power of the attacker, so we can distinguish external from network attackers. The operational semantics of Crypto-CCS can be parameterized with the cryptographic inference system $IS$ in Table 7.2 and support our previous assumption about action synchronisations (Theorem 7.2.1).


8.5.2 Verifying security adaptors

Note that properties such as (8.4) look like validity statements of mathematical logic, i.e.:

$$\forall X \quad X \models p \qquad (8.5)$$

where the formula $p$ must be checked for every structure $X$. The main difference is that in Equation 8.4 we check the components $X$ in combination with a system $S$ (including the adaptor process and the restrictions of Equation 8.4). We can reduce a verification problem such as Equation 8.4 to a validity checking problem such as Equation 8.5. To this end, we apply and extend the partial model checking techniques used for the compositional verification of concurrent systems (see [And95]). Consider a system $S$ in combination with a process $X$, and suppose we want to determine whether the whole system $S \parallel X$ satisfies a property expressed by a formula $p$. Partial model checking techniques can then be used to find the necessary and sufficient condition on $X$, expressed by a logical formula $p /\!/ S$, for the whole system $S \parallel X$ to satisfy $p$. In short, we have:

$$S \parallel X \models p \quad \text{iff} \quad X \models p /\!/ S \qquad (8.6)$$

Using this property, such verification problems as in Equation 8.4 can be easily reduced to such validity problems as in Equation 8.5.
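As a hedged illustration of this reduction, the chain below spells out one possible reading of it. The symbol $\neg$ stands for semantic negation and is our own shorthand, not an operator of the logic introduced in the next section; the restrictions of Equation 8.4 are assumed to be folded into $S$.

```latex
\begin{align*}
\nexists X \,.\; S \parallel X \models p
  &\iff \nexists X \,.\; X \models p /\!/ S
     && \text{by (8.6)} \\
  &\iff \forall X \,.\; X \not\models p /\!/ S \\
  &\iff \forall X \,.\; X \models \neg (p /\!/ S)
     && \text{the shape of (8.5)}
\end{align*}
```

Read this way, the absence of an attacker amounts to the unsatisfiability of $p /\!/ S$, which matches the decidability question addressed for the secrecy formula in Section 8.5.3.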

8.5.3 A language to describe protocol properties

We illustrate a logical language ($\mathcal{L}_K$) for the specification of the functional and security properties of a compound system. Language $\mathcal{L}_K$ was presented in [Mar03] and it is an extended normal multimodal logic with operators which make it possible to specify whether a message belongs to an agent’s knowledge after a computation $\gamma$ performed by the whole system, starting from a fixed initial knowledge. The syntax of the logical language $\mathcal{L}_K$ is defined by the following grammar:

$$F ::= \mathsf{T} \mid \mathsf{F} \mid \langle a \rangle F \mid [a] F \mid \wedge_{i \in I} F_i \mid \vee_{i \in I} F_i \mid m \in K^{\phi}_{X,\gamma} \mid \exists \gamma \,.\, m \in K^{\phi}_{X,\gamma}$$

where $a \in Act$, $m$ is a message, $X$ is an agent identifier, $I$ is an index set (possibly infinite) and $\phi$ a finite set of typed messages. The language without $m \in K^{\phi}_{X,\gamma}$ and $\exists \gamma \,.\, m \in K^{\phi}_{X,\gamma}$ (“knowledge” operators) is called $\mathcal{L}$. Informally, $\mathsf{T}$ and $\mathsf{F}$ are the true and false logical constants; the $\langle a \rangle F$ modality expresses the possibility to perform an action $a$ and then satisfy $F$.

The $[a]F$ modality expresses the necessity that, after performing an action $a$, the system satisfies $F$; $\vee_{i \in I}$ ($\wedge_{i \in I}$) represents the logical disjunction (conjunction). As usual, we consider $\vee_{i \in \emptyset}$ ($\wedge_{i \in \emptyset}$) as $\mathsf{F}$ ($\mathsf{T}$). A system $S$ satisfies a formula $m \in K^{\phi}_{X,\gamma}$ if $S$ can perform a computation $\gamma$ of actions and an agent of $S$, identified by $X$, can infer the message $m$ starting from the set of messages $\phi$ plus the messages it has come to know during the computation $\gamma$. The formula $\exists \gamma \,.\, m \in K^{\phi}_{X,\gamma}$ is satisfied by a system $S$ if there exists a computation $\gamma$ and an agent $X$ of $S$ s.t. $X_{\phi}$ can infer $m$ during the computation $\gamma$.

We assume that a unique identifier can be assigned to every sequential agent in a compound system (e.g., the path from the root to the sequential agent term in the parsing tree of the compound system term). Then, given a sequence of transitions $S \stackrel{\gamma}{\Longrightarrow} S'$ of a compound term $S$, let $(S \stackrel{\gamma}{\Longrightarrow} S') \downarrow_X$ be the sequence of actions of the agent identified by $X$ in $S$ that have contributed to the transitions of the whole system¹. Finally, the formal semantics of a formula $F \in \mathcal{L}_K$ w.r.t. a compound system $S$ is inductively defined in Table 8.2. Function $msgs(\gamma)$ returns all the messages occurring in the trace $\gamma$ and function $D(\phi)$ returns the set of typed messages which can be inferred (through the rules in $IS$, Table 7.2) from knowledge $\phi$.

Table 8.2: Semantics of the logical language.

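$$
\begin{array}{lll}
S \models \mathsf{T} & \text{for} & \text{every process } S \\
S \models \mathsf{F} & \text{for} & \text{no process } S \\
S \models \wedge_{i \in I} F_i & \text{iff} & \forall i \in I \,.\, S \models F_i \\
S \models \vee_{i \in I} F_i & \text{iff} & \exists i \in I \,.\, S \models F_i \\
S \models \langle a \rangle F & \text{iff} & \exists S' \,.\, S \stackrel{a}{\longrightarrow} S' \text{ and } S' \models F \\
S \models [a] F & \text{iff} & \forall S' \,.\, S \stackrel{a}{\longrightarrow} S' \text{ implies } S' \models F \\
S \models m \in K^{\phi}_{X,\gamma} & \text{iff} & \forall S' \text{ s.t. } (S \stackrel{\gamma}{\Longrightarrow} S') \downarrow_X = \gamma' \text{ and } m : T \in D(\phi \cup msgs(\gamma')) \\
S \models \exists \gamma \,.\, m \in K^{\phi}_{X,\gamma} & \text{iff} & \exists \gamma \,.\, S \models m \in K^{\phi}_{X,\gamma}
\end{array}
$$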

1 For simplification, here we leave out the technical details. We can however achieve this result by suitably adding information on the transitions, e.g., see [DP01].
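To make the modal rows of Table 8.2 concrete, the following is a minimal sketch of a checker for the knowledge-free fragment over a finite labelled transition system. Everything in it is an assumption made for illustration: the class names, the dictionary encoding of transitions and the toy system are hypothetical, index sets are restricted to finite tuples, and the knowledge operators are deliberately left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union

# Finite labelled transition system: state -> [(action, next_state), ...].
LTS = Dict[str, List[Tuple[str, str]]]


@dataclass(frozen=True)
class Const:
    """The logical constants T (value=True) and F (value=False)."""
    value: bool


@dataclass(frozen=True)
class And:
    """Conjunction over a finite index set; And(()) behaves as T."""
    parts: Tuple["Formula", ...]


@dataclass(frozen=True)
class Or:
    """Disjunction over a finite index set; Or(()) behaves as F."""
    parts: Tuple["Formula", ...]


@dataclass(frozen=True)
class Diamond:
    """<a>F: some a-successor satisfies F."""
    action: str
    body: "Formula"


@dataclass(frozen=True)
class Box:
    """[a]F: every a-successor satisfies F."""
    action: str
    body: "Formula"


Formula = Union[Const, And, Or, Diamond, Box]


def satisfies(lts: LTS, state: str, formula: Formula) -> bool:
    """Check state |= formula following the modal rows of Table 8.2."""
    if isinstance(formula, Const):
        return formula.value
    if isinstance(formula, And):
        return all(satisfies(lts, state, part) for part in formula.parts)
    if isinstance(formula, Or):
        return any(satisfies(lts, state, part) for part in formula.parts)
    if isinstance(formula, (Diamond, Box)):
        successors = [t for (a, t) in lts.get(state, []) if a == formula.action]
        results = (satisfies(lts, t, formula.body) for t in successors)
        return any(results) if isinstance(formula, Diamond) else all(results)
    raise TypeError(f"unknown formula: {formula!r}")


# Hypothetical toy system: s0 can do 'a' to s1 or s2; only s1 can then do 'b'.
TOY: LTS = {"s0": [("a", "s1"), ("a", "s2")], "s1": [("b", "s3")]}
TRUE = Const(True)
can_a_then_b = Diamond("a", Diamond("b", TRUE))   # <a><b>T
must_a_then_b = Box("a", Diamond("b", TRUE))      # [a]<b>T
```

In this toy system `can_a_then_b` holds at `s0` while `must_a_then_b` does not, because the `a`-successor `s2` cannot perform `b`. The empty conjunction and disjunction follow the convention stated above.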


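Table 8.3: Partial evaluation function for $(S \parallel X) \backslash L$, with $Sort(S \parallel X) \subseteq L$, and $\exists \gamma \,.\, m \in K^{\phi}_{X,\gamma}$.

$$
\begin{array}{rll}
\exists \gamma \,.\, m \in K^{\phi}_{X,\gamma} /\!/ S \doteq
 & \bigvee_{(c,m',S') \in Send(S)} \langle c!m' \rangle (\exists \gamma \,.\, m \in K^{\phi}_{X,\gamma} /\!/ S') & \text{(sending)} \ \vee \\
 & \bigvee_{S \stackrel{c!m'}{\longrightarrow} S'} \langle c?m' \rangle (\exists \gamma \,.\, m \in K^{\phi \cup \{m'\}}_{X,\gamma} /\!/ S') & \text{(receiving)} \ \vee \\
 & \bigvee_{S \stackrel{\tau_{c,m'}}{\longrightarrow} S'} \langle \chi_{c,m'} \rangle (\exists \gamma \,.\, m \in K^{\phi \cup \{m'\}}_{X,\gamma} /\!/ S') & \text{(eavesdropping)} \ \vee \\
 & \bigvee_{S \stackrel{a}{\longrightarrow} S' \ (a = \tau_{c,m}, \tau)} \exists \gamma \,.\, m \in K^{\phi}_{X,\gamma} /\!/ S' & \text{(idling)} \ \vee \\
 & m \in K^{\phi}_{X,\varepsilon} /\!/ S & \text{(nothing to do)}
\end{array}
$$

$$
m \in K^{\phi}_{X,\varepsilon} /\!/ S \doteq \exists \gamma \,.\, m \in K^{\phi}_{X,\gamma} /\!/ Nil \doteq
\begin{cases}
\mathsf{T} & m \in D(\phi) \\
\mathsf{F} & m \notin D(\phi)
\end{cases}
$$

where:

$$Send(S) = \{(c, m', S') \mid S \stackrel{c?m'}{\longrightarrow} S' \text{ and } m' \in D(\phi)\}$$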

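Proposition 8.5.1 Given a system $S$ and an agent $X_{\phi}$ where $\phi$ is finite and $Sort(S \parallel X) \subseteq L$, then if $m$ is an initial message we have:

$$(S \parallel X_{\phi}) \backslash L \models \exists \gamma \,.\, m \in K^{\phi}_{X,\gamma} \quad \text{iff} \quad X_{\phi} \models \exists \gamma \,.\, m \in K^{\phi}_{X,\gamma} /\!/ S.$$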

Function $Sort(S)$ returns the set of channels occurring in the Crypto-CCS process $S$. We use the formula $p_{sec} = \exists \gamma \,.\, m \in K^{\phi}_{X,\gamma}$ to check if there exists a possible intruder $X$ with an initial knowledge $\phi$ that can discover some confidential values $m$ while interacting with the services and the adaptor protocol with trace $\gamma$. We proceed as in the previous case, now assuming that the adaptor protocol is also part of the system to be analysed. This time we use a partial model checking function as described in [Mar03], and recalled in Table 8.3, to convert $p_{sec}$ in Equation 8.4 into a validity problem with $p_{sec} /\!/ S$ as in Equation 8.5.
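The membership test $m \in D(\phi \cup msgs(\gamma))$ that underlies $p_{sec}$ can be sketched as a small closure computation. The rules below are an assumed, simplified stand-in for the inference system of Table 7.2 (pairing, symmetric encryption and a one-way hash only); the term encoding, the function names and the atoms `n1` and `s12` are hypothetical.

```python
from typing import Iterable, Set, Tuple, Union

Term = Union[str, Tuple]  # atoms are strings; compound terms are tagged tuples


def pair(left: Term, right: Term) -> Term:
    return ("pair", left, right)


def enc(key: Term, body: Term) -> Term:
    return ("enc", key, body)


def hashed(body: Term) -> Term:
    return ("hash", body)


def can_build(term: Term, known: Set[Term]) -> bool:
    """Synthesis: a term is buildable if known or composed of buildable parts."""
    if term in known:
        return True
    if isinstance(term, tuple) and term[0] in ("pair", "enc", "hash"):
        return all(can_build(arg, known) for arg in term[1:])
    return False


def analyse(knowledge: Iterable[Term]) -> Set[Term]:
    """Analysis: split pairs and decrypt with buildable keys until a fixpoint."""
    known = set(knowledge)
    changed = True
    while changed:
        changed = False
        for term in list(known):
            if not isinstance(term, tuple):
                continue
            if term[0] == "pair":
                parts = {term[1], term[2]}
            elif term[0] == "enc" and can_build(term[1], known):
                parts = {term[2]}
            else:
                parts = set()  # hashes are one-way: nothing to extract
            if not parts <= known:
                known |= parts
                changed = True
    return known


def derivable(message: Term, phi: Iterable[Term],
              trace_messages: Iterable[Term] = ()) -> bool:
    """Approximate 'message in D(phi U msgs(gamma))' under the assumed rules."""
    return can_build(message, analyse(set(phi) | set(trace_messages)))


# Public knowledge as in the running example; the message shape is modelled
# loosely on the hash-based request: Id, Req, Nonce, Hash(Id, Req, Nonce, Secret).
PHI = {"Pk(ka)", "Pk(kb)", "i"}
DIGEST = hashed(pair("i1", pair("r1", pair("n1", "s12"))))
HASH_MSG = pair("i1", pair("r1", pair("n1", DIGEST)))
```

Under these assumptions, an attacker who only knows `PHI` cannot derive `r1`, but one who also observes `HASH_MSG` can, because the request travels in the clear; the shared secret `s12` stays underivable since the hash cannot be inverted.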

Proposition 8.5.2 Consider the formula $F = \exists \gamma \,.\, m \in K^{\phi}_{X,\gamma} /\!/ S$. Then it is decidable whether or not a model $X$ of such a formula exists.

8.5.4 The most general attacker

The process $X_F$ constructed by following the proof steps of Theorem 8.5.2 is maximal, i.e., any attack performed by the intruder can be performed by the one developed in the proof. Indeed, the following proposition holds.

Proposition 8.5.3 Given a system $S$ and a finite knowledge $\phi$, then for any attacker $X_{\phi}$ such that

$$(S \parallel X_{\phi}) \backslash L \models F$$

where $L \supseteq Sort(S \parallel X_{\phi})$, $F = \exists \gamma \,.\, m \in K^{\phi}_{X,\gamma} /\!/ S$ and $\gamma'$ is such that all the following hold:

• $\gamma' = ((S \parallel X) \backslash L \stackrel{\gamma}{\Longrightarrow} S_1) \downarrow_X$,
• $m \in D(\phi \cup msgs(\gamma'))$ and
• $m \notin D(\phi \cup msgs(\gamma''))$ for any $\gamma''$ strict prefix of $\gamma'$

then $X_F$ (i.e., the process obtained from the proof of Theorem 8.5.2) also satisfies

$$((S \parallel X_F) \backslash L \stackrel{\gamma}{\Longrightarrow} S_2) \downarrow_{X_F} = \gamma'$$

Example 8.5.1 For our running example, consider that the following $\phi$ is public information and that we want the request coming from service 1 ($r_1$) to be secret to third parties (secrecy property $p_{sec}$):

$$\phi = \{Pk(ka), Pk(kb), i\}; \qquad p_{sec} = \exists \gamma \,.\, r_1 \in K^{\phi}_{X,\gamma}$$

Then by Theorem 8.5.2 we obtain an attacker

$$X = hash?x + \chi_{hash,x} \quad \text{s.t.} \quad (S \parallel P \parallel X) \backslash L \models p_{sec} \quad \text{where } L \supseteq Sort(S \parallel P \parallel X)$$

Attacker $X$ is able to violate $p_{sec}$. In the current setting, there is no adaptor capable of avoiding that attack because the attack does not involve the adaptor at all (the attacker can directly communicate with service 1 via $hash?x$). However, in a slightly different example where we can assume that possible attackers cannot actively participate in the system (passive attackers are modelled by restricting the system communications with $\backslash L$), we would have the new formula $((P \parallel S) \backslash L \parallel X) \backslash L \models p_{sec}$. In this case, we still obtain a passive attacker $X = \chi_{hash,x}$.

Note that we are always interested in the possible communications between services, adaptors and attackers and not in any other external communication. Therefore, we always deal with equations such as $(S \parallel X) \backslash L \models p$ where $L \supseteq Sort(S \parallel X)$ but we often omit this final restriction ($\backslash L$), leaving just $S \parallel X \models p$ for the sake of clarity.

8.6 Securing adaptors through refinement

We are able to synthesise functionally-correct adaptors (Section 8.4) and verify the resulting adapted system with regard to a global secrecy property (Section 8.5). In this section we want to refine the synthesised adaptor to prevent the violation of such a property.

8.6.1 Refinement

In some cases, a SAC can be designed to avoid certain attacks, but this is limited to the extent allowed by the service interfaces. A service can be inherently insecure so, if it can be directly accessed by the attacker, there is no way the adaptor can secure it. However, we can guarantee that the adaptor is resilient to attacks and, when the deployment scenario allows it, we can use the adaptor as a firewall to the services. In these cases, we are going to refine the adaptor by removing from its behaviour traces that can be compromised by an attacker.

Given an initial adaptor $A_0$ compliant with a contract $c$ whose environment is $E$, two languages $L$ (of actions which can be eavesdropped on but cannot be actively accessed by the attacker) and $L'$ (of actions which can neither be accessed nor eavesdropped on by the attacker), the Crypto-CCS processes of the service $S$, and an attacker $X$ such that $(S \parallel [\![A_0]\!]_E) \backslash\backslash L' \backslash L \parallel X \models p_{sec}$, we define an iterative refinement process over an adaptor $A_i$ and its interface $R_i = \langle \Sigma, S, s_0, F, T_i \rangle$ by selecting a minimal trace $\gamma$ such that $Q \stackrel{\gamma}{\Longrightarrow} Q'$ where $Q = (S \parallel [\![A_i]\!]_E) \backslash\backslash L' \backslash L \parallel X$ and $Q' = Y \backslash\backslash L' \backslash L \parallel 0$, and proceeding as follows:

• If $(Q \stackrel{\gamma}{\Longrightarrow} Q') \downarrow_{[\![A_i]\!]_E}$ is empty, then there is no security adaptor capable of preserving $p_{sec}$ with $L$ and $L'$ since the attack did not involve the adaptor. Therefore, the refinement returns the empty adaptor.
• If $(Q \stackrel{\gamma}{\Longrightarrow} Q') \downarrow_{[\![A_i]\!]_E} = P_0 \stackrel{\alpha_1}{\longrightarrow} \cdots \stackrel{\alpha_k}{\longrightarrow} P_k$, then the interface $ip([\![A_i]\!]_E)$ presents a unique trace $P_0 \stackrel{\beta_1}{\longmapsto} \cdots \stackrel{\beta_k}{\longmapsto} P_k$, and there is a unique trace $s_0 \stackrel{\beta_1}{\longmapsto} \cdots \stackrel{\beta_k}{\longmapsto} s_k$ in $ia(A_0)$, since $ip([\![A_i]\!]_E)$ $ia(A_0)$. Then, we iterate considering

$$t_i = (s_{k-1} \stackrel{\beta_k}{\longmapsto} s_k); \qquad R_{i+1} = prune(ip(S), R_i, t_i); \qquad A_{i+1} = ia^{-1}_{A_0}(R_{i+1})$$

The initial adaptor interface is $R_0 = ia(A_0)$ and we iterate while a trace $\gamma$ exists satisfying the above conditions. If any of the $prune(ip(S), R_i, t_i)$ is undefined, we return the empty adaptor. When there are no more such $\gamma$ traces, at step $n$, the final result of the refinement is $ia^{-1}_{A_0}(R_n)$. It is worth noting that this iterative process can be used at the same time to synthesise functionally-correct adaptors (i.e., remove deadlocks), with $A_0 = A_c$, $Q_i = S \parallel [\![A_i]\!]_E$ and $Q'_i$ being any deadlock state, i.e., $Q'_i$ is such that $\exists \gamma \,.\, Q_i \stackrel{\gamma}{\Longrightarrow} Q'_i$ and $\nexists \gamma' \,.\, Q'_i \stackrel{\gamma'}{\Longrightarrow} 0$ where every label in $\gamma$ and $\gamma'$ is either $\tau$ or $\tau_{c,m}$.
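The control flow of this refinement loop can be sketched as follows. This is only an outline under strong assumptions: the verification step and the pruning function are passed in as callbacks with hypothetical signatures, an interface is reduced to a set of transitions, and the toy callbacks at the end do not implement the actual prune function of Section 8.4 or any Crypto-CCS analysis.

```python
from typing import Callable, FrozenSet, List, Optional, Tuple

Transition = Tuple[str, str, str]      # (source state, label, target state)
Interface = FrozenSet[Transition]      # simplified stand-in for R_i


def refine(
    r0: Interface,
    find_attack: Callable[[Interface], Optional[List[Transition]]],
    prune: Callable[[Interface, Transition], Optional[Interface]],
) -> Optional[Interface]:
    """Iterate R_{i+1} = prune(R_i, t_i) while an attack trace exists.

    find_attack returns the adaptor's part of a minimal attack trace, an
    empty list if the attack bypasses the adaptor, or None if no attack is
    left. The result None plays the role of the empty adaptor.
    """
    current = r0
    while True:
        adaptor_trace = find_attack(current)
        if adaptor_trace is None:
            return current             # no attack left: R_n is the result
        if not adaptor_trace:
            return None                # attack did not involve the adaptor
        pruned = prune(current, adaptor_trace[-1])   # t_i: last adaptor step
        if pruned is None:
            return None                # prune undefined: empty adaptor
        current = pruned


# Hypothetical toy instance: any 'hash?' transition is considered leaky.
R0: Interface = frozenset({("s0", "hash?", "s1"), ("s0", "pk_auth?", "s2")})


def toy_find_attack(interface: Interface) -> Optional[List[Transition]]:
    leaky = sorted(t for t in interface if t[1] == "hash?")
    return leaky[:1] if leaky else None


def toy_prune(interface: Interface, t: Transition) -> Optional[Interface]:
    remaining = interface - {t}
    return remaining if remaining else None
```

With these toy callbacks, `refine(R0, toy_find_attack, toy_prune)` keeps only the `pk_auth?` transition, whereas a `find_attack` that always reports an empty adaptor trace makes the loop return the empty adaptor immediately.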

8.6.2 Synthesis and refinement overview

With this refinement process we conclude the synthesis of secure security adaptors which is depicted in Fig. 8.4 and summarised as follows:

1. Convert service Crypto-CCS process $S$ into its interface, i.e., $ip(S)$.
2. Based on that service interface and a given contract, synthesise (syn) a functionally-correct service adaptor $A_0 = c[ip(S)]$ as in Section 8.4.
3. Encode the adaptor into a Crypto-CCS process $P = [\![A_0]\!]_E$, where $E$ is the environment given in the SAC.
4. Verify (Section 8.5) if there exists an attacker $X$ to $S$ and $P$, i.e., $(S \parallel P) \backslash\backslash L' \backslash L \parallel X \models p_{sec}$ where $p_{sec}$ is the logic expression of the attack to avoid being restricted by $L$ and $L'$.
5. Refine (ref) the adaptor $A_0$ based on a possible attacker $X$ (Section 8.6).
6. The final result is the adaptor $A_n = ia^{-1}_{A_c}(R_n)$ and its Crypto-CCS process is $[\![A_n]\!]_E$.

[Figure 8.4: Technical overview of the synthesis of secure security adaptors. Levels shown: Contract, Adaptor, Interface, Crypto-CCS; steps shown: (syn) and (ref).]

Our approach synthesises, if it exists, an adaptor $A$ for service interfaces $ip(S)$ compliant to contract $c$ and environment $E$ which preserves a given secrecy property $p_{sec}$. This is formalized as follows.

Theorem 8.6.1 Given a contract $c$, a process $S$, and a secrecy property $p_{sec}$ (restricted to alphabets $L$ and $L'$), the iterative refinement procedure which provides a sequence of interfaces $\{R_i\}_{i=0 \ldots n}$ (with $R_0 = ia(c[ip(S)])$), satisfies that if $R_n \neq \bot$ then $A = ia^{-1}_{A_c}(R_n)$ is an adaptor for services $ip(S)$ compliant with contract $c$ such that property $p_{sec}$ is preserved for $L$ and $L'$, i.e.:

$$\nexists X \,.\, P \backslash L'' \models p_{sec}$$

for the system

$$P = ([\![A]\!]_E \parallel S) \backslash\backslash L' \backslash L \parallel X_{\phi}$$

where $E$ is the environment of contract $c$, $L'' \supseteq Sort(P)$, and $\phi$ is the initial knowledge of the attacker.

Example 8.6.1 Going back to our running example, with the adaptor process $P = [\![A]\!]_E$, we saw in Example 8.5.1 that it was possible for an attacker to learn the request coming from service 1 ($r_1$). More formally, $(P \parallel S) \backslash L \parallel X \models p_{sec}$, with $p_{sec} = \exists \gamma \,.\, r_1 \in K^{\phi}_{X,\gamma}$, is satisfied by an attacker $X = \chi_{hash,x}$. This attacker is the result of the verification process (see Theorem 8.5.2). Then, we calculate the trace of synchronisations made by the adaptor during the attack, $(Q \stackrel{\gamma}{\Longrightarrow} Q') \downarrow_P$, whose corresponding interface contains the single action:

hash?Id, Req, Nonce, Hash(Id, Req, Nonce, Secret)

During the refinement process, we prune that action from the adaptor interface and then we convert the result into an adaptor using function $ia^{-1}_A$. The resulting adaptor corresponds to the blue bold transitions of Fig. 8.3. This final adaptor succeeds in preserving $p_{sec}$ because it allows only the use of the public-key protocol, where the request $r_1$ is only known to services 1 and 2. Not even the adaptor, which forwards all the messages between the services, is aware of the request.

Example 8.6.2 Blue bold transitions in the adaptor of our running example are based on the Needham-Schroeder public-key protocol. This protocol is known to have a flaw where a malicious trusted service can impersonate a third one. In order to analyse this new scenario, we have to include the attacker $X$ as a trusted insider so we consider the system to be

$$S = (S_{1,2} + S_{1,X}) \parallel (S_{2,1} + S_{2,X}) \parallel S_3$$

with

$$\phi_X = \{i_X, i_1, Pk(ka), Pk(kb), kx, s_{X,1}, s_{X,2}\}$$

where subscripts represent the legal parties in the communication, e.g., $S_{1,2}$ represents service 1 talking with service 2, $r_1$ and $d_1$ are the request and data for service 1, $s_{1,2}$ is the secret shared between services 1 and 2, and so on. The contract has to be extended to include allowed communications between the services and $X$ because $X$ is a trusted service or inside attacker. Then, as expected, the verification process returns an attacker $X$ such that

$$(S \parallel [\![A]\!]_E) \backslash L \parallel X \models \exists \gamma \,.\, \{r_1, d_1\} \subseteq K^{\phi}_{X,\gamma}$$

where $L$ restricts every channel not prefixed with $X$. This attacker does not use the hash! action to impersonate another service as it does not know the shared secret between service 1 and 2. If the attacker tried to use action hash with a random secret, this would have resulted in a security failure recognised by the adaptor process, which would have become 0, therefore stopping all communications. For this reason, and because of the flaw in the Needham-Schroeder public-key protocol, every branch of attacker $X$ requires going through pk_auth actions on the adaptor. During the pruning, the adaptor could use actions denied to avoid the attack but the contract enforces that such actions depend on no_access actions from service 2, so it is not a controllable decision, and therefore it is not prunable. The other alternative, and the solution for this example, is to prune the adaptor at the pk_auth action, therefore enforcing communications with $X$ using the hash-based schema. As in Example 8.5.1, attacker $X$ is still able to eavesdrop on the request $r_1$ but it cannot learn the reply $d_1$, hence preserving the new secrecy property of this example. Such an adaptor corresponds to the non-bold black transitions of the right-hand side of Fig. 8.3².

2 Actually, the adaptor is the parallel composition of $A_{1,2,3}$ (the original adaptor mediating between the services of the running example), $A_{1,X,3}$ (the same adaptor but replacing service 2 with $X$) and $A_{X,2,3}$ (the adaptor which uses $X$ as service 1) because $X$ is an inside attacker included in the adaptation. All of these are copies of the non-bold black transitions of Fig. 8.3 but each one has the appropriate initial environment and action prefixes to interact with their corresponding services.

8.7 Conclusion and future work

We have presented an approach to the synthesis, verification and refinement of secure security adaptors. The desired interactions between the adaptor and the services, together with the cryptographic operations that must be performed by the adaptor on every intercepted message, are described in SACs. These contracts are able to overcome incompatibilities in signature, behaviour and security QoS. Based on the contract and the behaviour of the given services, the adaptor is synthesised to avoid deadlock and livelock situations. In a second step, the synthesised behaviour of the adaptor is verified and refined using Crypto-CCS to forbid any interaction with the adaptor that can violate the secrecy properties we want to preserve. At the end of this process we obtain an adaptor robust against attacks to the given secrecy property, compliant with the given SAC and able to overcome service incompatibilities.

Our approach is versatile enough to cope with a range of security protocols and deployment scenarios with different zones of trust such as an attacker which controls the network, a trusted insider or an external attacker which can participate actively or passively in the communications.

As regards future work, we plan to extend our work with other verification approaches which cover multiple attackers at the same time. In addition, we want to support adaptor synthesis by untrusted third parties and, in these cases, we plan to use Proof Carrying Code [NL98] techniques based on SACs.


Part IV

Final remarks

64. Often it is the means that justify the ends: Goals advance technique and technique survives even when goal structures crumble. A.J. Perlis - Epigrams in programming

9 Future work

Software adaptation is an interesting topic. It is wide enough to cover a whole range of different technologies and areas so that you can always extrapolate your results to other domains. You can also dive deep into any of those topics, always trying to push the limits of computability, automation and efficiency of both brand-new and well-established algorithms. And, what is more exciting, it paves the way to promising synergies with other research areas and hot topics like artificial intelligence, model-checking, cloud-computing, mobile applications and, in general, anything that boils down to the formal composition of distributed entities. Since software adaptation is a field full of opportunities, there are several open fronts for the automatic adaptation of WSs. From less to more important, I would personally highlight the following topics.

9.1 Sound generation of adaptation contracts

Every thesis has a beginning, when you have to discover your niche and look for interesting problems and the best tools to approach them. It is also the time when everything is new but also shakier. This thesis started with the automatic generation of adaptation contracts (Chapter 6) and therefore that work is not as consolidated as the rest of the thesis. Chapter 6 was included in this document because it helps to give a more complete view of the adaptor-oriented development process and, more importantly, it introduced some novel ideas.

On the one hand, due to the lack of publicly available services with behavioural descriptions, the heuristic function which governs contract generation was evaluated on synthetic hand-made examples. Additionally, the algorithm to divide and merge partial adaptation contracts in the presence of internal conditions is computationally expensive and error-prone. Therefore, there is room for improvement and further research.

On the other hand, we approached an interesting problem in a unique way. Other approaches skip contract generation and go directly to the synthesis stage. In this manner, they apply their automatic matching techniques to the synthesis, hence not requiring the contract, but this lack of an explicit contract is, indeed, a drawback. Adaptation contracts provide a high-level and concise description of the adaptor; without them, this information is lost among hundreds or thousands of adaptor transitions. Therefore, you lack fine-grained control of the adaptor, there is less information to debug and optimise the adaptation, and not knowing why the adaptor behaves as it does can lead the developer to frustration. The heuristic function requires further validation but it is simple to understand, it showed promising results and it is easily extensible using the complementary expert system. In addition, we reduced the contract generation problem to the graph search problem, and there are many consolidated results in this area to draw on.

In addition, Dinapter exploits the assumption that compatible data have the same type. This assumption does not always hold and it could be relaxed by using semantic matching techniques, automatic comparators between XML files representing service interfaces (see [ADMR05]) and going further in depth into work based on service similarity [OSP10]. To sum up, automatic contract generation is a very interesting topic which touches several branches (artificial intelligence, natural language, semantic ontologies, graph search, testing, ...). It is an important part of the adaptation-oriented development process and it has several applications to other problems such as reverse-engineering, reconfiguration, documentation and debugging.

9.2 Automatic encoding of adaptors into executable languages

We have discussed dynamic adaptors, whose behaviour is learned at run time (Chapter 4), and we have also presented the synthesis of security adaptors, where the behaviour of the adaptor is synthesised, verified and refined. Either way, the synthesis process resulted in an abstract model of the adaptor. This model serves to motivate the research and illustrate the problems solved (incompatibilities, deadlocks and security flaws) but it lacks the details required to implement the adaptor in, for instance, a BPEL process. The adaptor behaviour represents input and output actions, external choices and loops whose conditions are also external (the adaptors discussed in this work cannot present internal choices). However, they do not contain the information required to locate the service (URL), the XML schema which represents the data types, or the actual WSDL description of the adapted services. An adaptor encoded into a BPEL process requires such information.

[Figure 9.1: Revisiting the adaptation-oriented development process. Recoverable labels: Service Interfaces (Abstract BPEL+WSDL); Designer; Interactive Contract Specification + Simulation and Verification (ACIDE); Adaptation Contract.]

Revisiting Fig. 9.1, we refer to the stage labelled as STS2BPEL. In this step, the adaptor behaviour/protocol is encoded into an executable BPEL process. We sketched an algorithm to encode adaptor behaviour into BPEL in [CMS+ 10a]. This algorithm was based on two pillars. First, we annotated the deployment information of the adapted services (i.e., URLs and general invocation information, XML schemas, correlation sets and WSDL descriptions) as comments to the service behaviour so as to recover it during the adaptor implementation phase. Second, in order to comply with the structure and grammar of BPEL, we applied a FSM pattern. We created a variable to keep track of the current state in the adaptor behaviour. The BPEL code was composed of an initial reception at the adaptor (mandatory in BPEL), and then a loop whose break condition is to be in a final state. Within the loop, there is a switch/if structure that, depending on the current state, mechanically encodes every branch of the adaptor behaviour into a sequence of invoke/receive activities, or into a pick if the state presented an external choice. In addition, every reception was followed by an assignment where the parameters are stored in local variables and every invocation was preceded by another assignment to compose its arguments.

This algorithm, although viable, was never implemented and only executed by hand. In addition, it is restricted to a subset of BPEL (no exceptions, links or compensation handlers) and, since to date there are no formal semantics for BPEL, this algorithm is sensitive to the specifics of each BPEL engine. It might not be interesting work in terms of research but the definition of formal semantics for BPEL and the encoding of adaptor behaviour into BPEL, or any other implementation language, are crucial open problems for the real-world application of this thesis.
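The FSM pattern described in this section can be illustrated with a small simulation. This is not BPEL and not the algorithm of [CMS+ 10a]; it is a hedged sketch in Python of the same control structure (a current-state variable, a loop that ends in a final state, and a per-state dispatch that either invokes or picks among receptions). The behaviour encoding, the function name and the login example are assumptions, and assignments, correlation sets and deployment data are omitted.

```python
from typing import Dict, Iterable, List, Set, Tuple

# state -> [(kind, operation, next_state)], where kind is "receive" or "invoke".
Behaviour = Dict[str, List[Tuple[str, str, str]]]


def run_adaptor(behaviour: Behaviour, initial: str, finals: Set[str],
                inbox: Iterable[str]) -> List[str]:
    """Execute an adaptor behaviour following the FSM pattern.

    inbox is the scripted sequence of messages the services send to the
    adaptor; the returned log lists the activities that were executed.
    """
    state = initial                        # variable tracking the current state
    pending = list(inbox)
    log: List[str] = []
    while state not in finals:             # break condition: a final state
        branches = behaviour.get(state, [])
        invokes = [(op, nxt) for kind, op, nxt in branches if kind == "invoke"]
        receives = {op: nxt for kind, op, nxt in branches if kind == "receive"}
        if invokes:                        # no internal choice: take the invoke
            op, state = invokes[0]
            log.append(f"invoke {op}")
        elif pending and pending[0] in receives:
            op = pending.pop(0)            # external choice, like a BPEL pick
            state = receives[op]
            log.append(f"receive {op}")
        else:
            raise RuntimeError(f"adaptor stuck in state {state!r}")
    return log


# Hypothetical behaviour: receive a login, forward it, then pick on the reply.
LOGIN: Behaviour = {
    "s0": [("receive", "login", "s1")],
    "s1": [("invoke", "authenticate", "s2")],
    "s2": [("receive", "ok", "s3"), ("receive", "denied", "s4")],
}
```

For instance, `run_adaptor(LOGIN, "s0", {"s3", "s4"}, ["login", "ok"])` executes the reception, the invocation and the `ok` branch of the external choice in that order.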

9.3 Application and evaluation over a real-world project

There are few publicly available services with WSDL interfaces and almost none with an explicit behaviour encoded in BPEL. This fact does not invalidate this thesis: almost every Web service has an implicit behaviour (normally, at least, authenticate and then execute privileged operations, e.g., Example 5.1.1) but this behaviour is usually informally described in natural-language documentation. This is a real deal-breaker when it comes to testing the validity of our contributions on a reasonably sized testbed of real-world examples. In our case, we used examples obtained from the Construction and Analysis of Distributed Processes (CADP) toolbox and some others that we modelled ourselves from the textual documentation of actual Web services. However, in order to make an exhaustive evaluation, it would be desirable to automatically extract signature and behaviour information from the natural-language documentation of WSs. Finally, without an automatic way to encode actual WSs into our internal model, and without an algorithm to deploy the synthesised adaptors into a WS, the development of real-world systems based on adaptors requires quite some effort. In addition, we have not had the time, money or people to contact enterprises interested in adopting our toolbox.


Science is a way of thinking much more than it is a body of knowledge. C. Sagan

10 Conclusion

There are two ways to develop a thesis: in depth or out wide. This thesis belongs to the second category. Circling around the topic of adaptation (Chapter 2), we have touched several other areas, finding synergies between adaptation and WS discovery, pervasive environments and QoS security. Software adaptation is very versatile and can be applied to any project where software needs to be reused or re-engineered. However, software adaptation is not a hammer for every nail. It has its drawbacks and its advantages, so we devoted this thesis to overcoming the former in order to foster the latter.

10.1 Adaptation drawbacks

Software adaptation has the following drawbacks:

1. Adaptors are an intermediary which incurs an extra overhead. It is another service, with extra management, maintenance and additional delays.
2. Adaptors, in this thesis, are orchestrators and, as a single point which intermediates between several services, they represent a single point of failure, and hence adaptation reduces the resilience of the system.
3. Adaptation blurs boundaries. The interfaces, semantics, QoS and security policies of different services are manipulated and changed, possibly in ways not originally expected nor intended by their developers and providers. Matching originally incompatible interfaces usually implies subtle bugs due to the unexpected interactions between the services. In addition, adaptors are exponentially large with regard to the services they adapt, and this complicates further the debugging and development of adapted systems.
4. Mapping the signature and behaviour of several services so as to support their interoperation and to avoid deadlocks is a complex and error-prone task. There are numerous options during the mapping and it is not always easy to understand their implications on the final result due to undesired interactions.

10.2 Proposed solutions Throughout this document, we have presented solutions to compensate all these drawbacks: 1. An adapted system can never be as efficient as a system without adaptors. Adaptation is an overhead. Despite this inherent inefficiency, we have managed to achieve lightweight and scalable solutions both in WS discovery (Chapter 5) and dynamic adaptation (Chapter 4). In the former, we abstracted the necessary conditions for two services to be adaptable (using AD trees), reduced them into their minimum expression (FD rules) and we exploited them to simplify the discovery process using search trees (FD trees). In the latter, we skipped one of the most expensive steps in adaptation, the synthesis. We did this by learning from the interactions between services at the cost of some failures but, at the end, this process resulted in correct adaptors. Furthermore, the knowledge learned by these dynamic adaptors could be distributed so as to avoid the same learning failures in recently deployed adaptors. 2. There are some efforts towards distributing adaptation [Sal08]. Instead of having a single central adaptor, this is split in a set of wrappers which are distributed among the adapted services. Additionally, learning adaptors are able to detect failures in the adapted system and react accordingly. Therefore, software adaptors do not have to become a single point of failure and, indeed, they can actually increase the resilience of the adapted system against incompatibilities, unexpected changes and secrecy attacks. 3. Unlike other approaches, we promote the use of adaptation contracts (Section 2.1). These contracts make explicit the mapping between the operations of the system, they provide a high-level and intuitive description of the system behaviour (Chapter 7) and they allow us to verify and simulate the system (e.g., using the ITACA toolbox) while hiding interleaving details and automatising deadlock and security analysis (Chapter 8). 192

4. Dinapter (Chapter 6) generates adaptation contracts and therefore automatises the mapping process. It guarantees deadlock-free solutions which can be simulated and customised with ITACA.

10.3 Advantages

If we managed to diminish the drawbacks with our solutions, then you can enjoy the following advantages:

1. An adaptor-oriented development process fosters the reuse of legacy services and eases the inclusion of third-party services.

2. Adaptor development is mostly automatised.

3. Our adaptation toolbox has built-in simulation and formal verification algorithms.

4. Dynamic adaptation (and learning adaptors in particular) enhances the robustness of the system, since learning adaptors can react to drastic changes in service behaviour and to unexpected failures.

5. It is a step towards automatic self-reconfiguration, putting together the discovery of adaptable services, the automatic generation of adaptation contracts and learning adaptors.

10.4 Summary

The Internet has provided a myriad of services, cloud computing has provided the infrastructure and personal hand-held devices provide pervasiveness. The only piece missing in the puzzle is an easy way of putting it all together so that you can enjoy applications composed of several of these services without being trapped in a vendor-locked framework. However, things get complicated when you leave the homogeneous and coherent view of a single provider and enter the wilderness of services implemented on various platforms, with different interfaces, protocols and semantics. Although service providers offer an external interface, services are not always meant to cooperate easily with other services (possibly offered by a competitor company), and incompatibilities, security flaws and other undesired interactions tend to occur.

In this thesis we proposed the use of adaptors to ease the development of systems composed of several, initially incompatible, services. Throughout this document we have addressed various stages of the adaptor-oriented development process (discovery, design, synthesis, verification, refinement and dynamic adaptation) and we have applied them to different scenarios (pervasive computing, WS registry and security enforcement).

We started with WS discovery in Chapter 5. Traditional service discovery approaches, which perfectly match the query and the services, are too restrictive if we want to find and compose incompatible, but adaptable, services. Instead, we developed a search tree able to efficiently discover services whose signature and behaviour could be adapted to interoperate despite their incompatibilities.

Once the adaptable services are discovered, we have to design an adaptation contract (Chapter 2). This contract consists of a mapping between the operations (and their arguments) of the services and an FSM which (optionally) controls the way in which the contract is applied. This contract can be designed using assisted development environments such as ITACA [CMS+ 09a] or automatically generated using Dinapter (Chapter 6). Then, this contract can be used to simulate and verify the model of the system before proceeding to the next stage.

Then we proceed to synthesise the behaviour of the adaptor which complies with the contract and the services. The synthesis process guarantees that all the services and the contract always reach a final state avoiding deadlocks and livelocks, therefore fulfilling the goal of the adaptation.

In addition, we saw how to encode the security policies governing the services into a security adaptation contract (SAC), described in Chapter 7. The synthesis of secure adaptors using SACs (Chapter 8) requires an additional step: refinement. Once an adaptor compliant with a given SAC is synthesised, it has to be analysed in order to spot security flaws where sensitive information is disclosed. If this happens, then those interactions are pruned from the secure adaptor during the refinement stage while preserving deadlock freedom.

Both contract generation and adaptor synthesis are meant to be done at design time. However, if the services can change their behaviour drastically (maybe part of the service is disabled due to a hardware malfunction) then the adaptor needs to be re-synthesised. If we want to avoid this step (the synthesis process is costly in computational terms) or if deploying new adaptors is unfeasible (e.g., we might be deploying adaptors in every mote of a WSAN), then we can resort to dynamic adaptors (Chapter 4) able to learn and react to these changes.


10.5 Epilogue

This concludes my thesis after four-plus years of hard work. Regardless of what you call it (orchestration design, service choreography, assisted/automatic composition of WSs or coordination), adaptation is a necessary tool if you do not want to build a project from scratch. Throughout this thesis, I have presented a formal methodology to automatise the development of correct adaptors and tried to convince you of its many advantages. I hope I succeeded. If I did, please take advantage of my work and, if it helped you, reference it. If you found it even more life-changing, then you might consider offering me a postdoc or (fingers crossed) a permanent position. If you did not agree with me, if you considered that there are obscure parts or found any flaws in my dissertation, do not hesitate to contact me (“José Antonio Martín” ). Constructive criticism is more than welcome. If our disagreement is interesting enough, we might even write a paper together or, at least, I might get another reference from all the fuss. I sincerely hope you found this thesis interesting and not too tiresome to read. Thank you.


Part V

Appendix

A

40. There are two ways to write error-free programs; only the third one works. A.J. Perlis - Epigrams in programming

Proofs

Proposition 4.2.1 Let $S$ and $P$ be two bounded services, $A_0$ be an adaptor for contract $c$ in its initial state $A_0 = \{\langle s^c_0, \emptyset \rangle\}$, and $I_0$ be a (possibly empty) sequence of inhibited traces. If the adaptor employs a monotonic learning function, then there exists a sequence $I_0, I_1, \ldots, I_n$, with a finite $n \geq 0$, such that:

1. $\forall j \in [0, n)\ \exists S', P' .\ S \mid \langle A_0, I_j, \lambda \rangle_c [P] \xrightarrow{\tau}{}^{*} \xrightarrow{I_{j+1}} S' \mid \langle A_0, I_{j+1}, \lambda \rangle_c [P']$ with $I_j \sqsubseteq I_{j+1}$, and

2. $\nexists S', P', I_{n+1} .\ S \mid \langle A_0, I_n, \lambda \rangle_c [P] \xrightarrow{\tau}{}^{*} \xrightarrow{I_{n+1}} S' \mid \langle A_0, I_{n+1}, \lambda \rangle_c [P']$ with $I_n \sqsubset I_{n+1}$.

Proof The proof immediately descends from the boundedness of $S$ and $P$ and from the monotonicity of the add function.
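The convergence argument can be illustrated with a toy simulation. The following is only a sketch under strong assumptions, not the formal semantics of Chapter 4: traces are plain tuples of action names, the bounded system is the hypothetical set of all traces over {a, b} of length at most 3, the failure condition `ends_in_bb` is invented for the example, and `add` is one possible monotonic learning function. The list `HISTORY` plays the role of the sequence I_0, I_1, ..., I_n.

```python
from itertools import product


def has_inhibited_prefix(trace, inhibited):
    """Return True if some inhibited trace is a prefix of `trace`."""
    return any(trace[:len(bad)] == bad for bad in inhibited)


def add(trace, inhibited):
    """Monotonic learning function: the result always extends `inhibited`."""
    return inhibited if trace in inhibited else inhibited + [trace]


def learn(traces, fails, inhibited=()):
    """Run learning rounds until a whole round inhibits nothing new."""
    current = list(inhibited)
    history = [list(current)]          # I_0, I_1, ..., I_n
    changed = True
    while changed:
        changed = False
        for trace in traces:
            if has_inhibited_prefix(trace, current):
                continue               # the adaptor already forbids this trace
            if fails(trace):
                current = add(trace, current)
                history.append(list(current))
                changed = True
    return history


# Hypothetical bounded system: every trace over {a, b} of length 1 to 3.
TRACES = [t for n in (1, 2, 3) for t in product("ab", repeat=n)]


def ends_in_bb(trace):
    """Hypothetical failure condition: the trace ends with two b actions."""
    return trace[-2:] == ("b", "b")


HISTORY = learn(TRACES, ends_in_bb)
```

Because the set of traces is finite and `add` never removes anything, each element of `HISTORY` is a prefix of the next one and the loop stops after finitely many steps, which mirrors the two items of the proposition.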

Proposition 4.2.2 Let $S$ and $P$ be two bounded services, $A_0$ be an adaptor for contract $c$ in its initial state $A_0 = \{\langle s^c_0, \emptyset \rangle\}$. If the adaptor employs a monotonic learning function, and $I$ is a complete sequence of inhibited traces, then for every $S'$, $A'$, $t'$ and $P'$ such that

\[ S \mid \langle A_0, I, \lambda \rangle_c [P] \xrightarrow{\tau}{}^{*} S' \mid \langle A', I, t' \rangle_c [P'] \]

where $A' \neq A_0$, there exists a sequence of $\tau$ transitions

\[ S' \mid \langle A', I, t' \rangle_c [P'] \xrightarrow{\tau}{}^{*} S'' \mid \langle A'', I, t'' \rangle_c [P''] \]

such that $A'' \cap OK_c \neq \emptyset$.

Proof Let $S'$, $A' \neq \emptyset$, $t'$ and $P'$ be elements satisfying

\[ S \mid \langle A_0, I, \lambda \rangle_c [P] \overset{\tau}{\longmapsto}{}^{+} S' \mid \langle A', I, t' \rangle_c [P'] \]

Note that $t' \neq \lambda$, because $A' \neq \emptyset$ and, therefore, at least one of the EXT or INT rules has been applied. We will now proceed by reductio ad absurdum. Let us suppose that every trace of $\tau$-transitions like

\[ S' \mid \langle A', I, t' \rangle_c [P'] \overset{\tau}{\longmapsto}{}^{*} S'' \mid \langle A'', I, t'' \rangle_c [P''] \]

progresses to an adaptor state such that $A'' \cap OK_c = \emptyset$. Then, we could apply rule LEARN combined with PAR, and therefore

\[ S'' \mid \langle A'', I, t'' \rangle_c [P''] \overset{add(t'', I)}{\longmapsto} S'' \mid \langle A_0, add(t'', I), \lambda \rangle_c [P] \]

As add is monotonic, $t'' \in prefixed(add(t'', I))$. On the other hand, $I$ is a complete sequence of inhibited traces, and therefore, $t'' \in prefixedBy(I)$. However, it is not possible to reach an adaptor state like $\langle A'', I, t'' \rangle_c$, where $t'' \in prefixed(I)$, because both rules EXT and INT (which are the only ones capable of increasing the trace) include the proviso $I \notin t$, which is applied to every prefix of $t''$.

Proposition 4.2.3 Let $S$ and $P$ be the initial states of two bounded services. Let us consider an adaptation contract $c$ which corresponds to an adaptor with an initial state $A_0$ and a monotonic learning function. If $I$ and $I'$ are complete sequences of inhibited traces resulting from a learning process starting in $S \mid \langle A_0, I_0, \lambda \rangle_c [P]$, then

\[ I \sqsubseteq I' \quad \text{and} \quad I' \sqsubseteq I \]

Proof Because of the symmetry of the proposition, it is enough to prove that for every $t .\ I \in t$ there exists $t' .\ I' \in t'$ such that $t'$ is a prefix of $t$. Then, there exist $J \sqsubset I$, $S'$ and $P'$ such that

\[ S \mid \langle A_0, J, \lambda \rangle_c [P] \overset{\tau}{\longmapsto}{}^{+} S' \mid \langle A_0, add(t, J), \lambda \rangle_c [P'] \]

In addition, $J$ cannot contain any of the prefixes of $t$, because this would have prevented the previous configuration from proceeding, due to the premises of rules EXT and INT. Now, we can proceed by reductio ad absurdum. Let us assume that $I'$ does not contain any prefix $t'$ of $t$. Then, after the learning process in which $I'$ was derived, we could reproduce the same sequence of $\tau$-transitions as before, in such a way that:

\[ S \mid \langle A_0, I', \lambda \rangle_c [P] \overset{\tau}{\longmapsto}{}^{+} S' \mid \langle A_0, add(t, I'), \lambda \rangle_c [P'] \]

where $I' \sqsubset add(t, I')$ (notice that $I' \notin t$, but $t \in prefixed(add(t, I'))$). However, $I'$ was a complete sequence of inhibited traces, which is a contradiction.

Proposition 7.3.1 Given a contract term $T$ which matches a type ($T \vdash T$), if two substitutions $\theta_1, \theta_2$ enable this match ($[T]_{\theta_1} = [T]_{\theta_2} = T$), then

\[ \theta_1(P) = \theta_2(P) \quad \text{for all } P \in pm(T). \]

Proof This proposition is demonstrated trivially by induction on the structure of $T$ (see Definition 7.3.3).

Lemma 7.2.1 If two Crypto-CCS processes which do not eavesdrop synchronise, then their corresponding interfaces also synchronise. More formally,

\[ ip(P) \otimes ip(Q) \xrightarrow{\tau} ip(P') \otimes ip(Q') \quad \text{if} \quad P \parallel Q \xrightarrow{\alpha} P' \parallel Q',\ \alpha \in \{\tau_{c,m}, \tau\} \]

Proof Synchronisation among services which do not eavesdrop occurs by rule '$\parallel_2$' (described in Table 7.3). The condition of this rule requires the transitions of the two services to be labelled with the same message; thus the transitions have the same type and, by extension, they present the same interface.

Proposition 8.2.1 Given an adaptor $A$ compliant with a contract $c$, then $ia(A)$ is a deterministic interface. In addition, if the transition relation of $c$ presents a tree structure, then $ia(A)$ also presents a tree structure.

Proof As $A$ is simulated by $A^c$, we only need to prove this result for $A^c$. Indeed, by construction of $A^c$, $ia(A^c)$ is a deterministic tree (see Sections 7.3.3 and 8.4.2), and if $st \in \Sigma^A$ progresses to two different states $st_1$ and $st_2$ by $st \overset{\alpha_i}{\hookrightarrow} st_i$ ($i = 1, 2$), then $[\alpha_1]_\theta \neq [\alpha_2]_\theta$. Therefore, $ia(A)$ is still deterministic. On the other hand, $A^c$ may present different transitions joining two given states: $st \overset{\alpha_i}{\hookrightarrow} st'$ ($i = 1 \ldots n$), but in that case $[\alpha_1]_\theta = \cdots = [\alpha_n]_\theta$, and then $ia(A^c)$ keeps a tree structure.

Proposition 8.2.2 Given an adaptor $A$ compliant with a contract whose type substitution is $\theta$, then

\[ I \preceq ia(A) \ \Rightarrow\ ia^{-1}_A(I) \preceq A \]

Proof Let us consider:

\[ I = \langle \Sigma^I, O^I, s^I_0, F^I, \overset{\cdot}{\longmapsto}_I \rangle \quad \text{and} \quad A = \langle \Sigma^A, O^A, s^A_0, F^A, \overset{\cdot}{\hookrightarrow}_A \rangle \]

In order to distinguish the different transition relations in $ia(A)$ and $ia^{-1}_A(I)$, we will denote them by $\longmapsto$ and $\hookrightarrow$, respectively. We are going to prove that, given $s_1 \in O^I$ and $s_2 \in O^A$, if $s_1$ in $\longmapsto_I$ is simulated by $s_2$ in $\longmapsto$ ($s_1 \preceq_I s_2$), then $s_1$ in $\hookrightarrow$ is also simulated by $s_2$ in $\hookrightarrow_A$ ($s_1 \preceq_A s_2$); and we will proceed by induction on the transitions departing from $s_1$ in $\hookrightarrow$.

Base case. If $s_1 \not\hookrightarrow$ (there are no transitions from $s_1$), then the only condition of simulation (Definition 2.3.1) to be checked is the second one: if $s_1$ is final in $ia^{-1}_A(I)$, then it is also final in $I$, and hence $s_2$ is final in $ia(A)$, because $s_1 \preceq_I s_2$. But final states in $ia(A)$ coincide with final states in $A$ (by construction). Therefore, $s_2$ is also final in $A$, and we conclude $s_1 \preceq_A s_2$.

Inductive case. Let us assume that $s_1' \preceq_A s_2'$ (given $s_1' \preceq_I s_2'$) holds for every $s_1'$ such that $s_1 \overset{\alpha}{\hookrightarrow} s_1'$ for some $\alpha \in \Sigma^A$. Then, by construction of $ia^{-1}_A(I)$, we have that both $s_1 \overset{\alpha}{\hookrightarrow} s_1'$ and $s_1 \overset{[\alpha]_\theta}{\longmapsto}_I s_1'$. As $s_1 \preceq_I s_2$, $s_2 \overset{[\alpha]_\theta}{\longmapsto} s_2'$ for some $s_2'$ such that $s_1' \preceq_I s_2'$. By induction hypothesis $s_1' \preceq_A s_2'$, and also $s_2 \overset{\alpha}{\hookrightarrow}_A s_2'$, because $s_2 \overset{[\alpha]_\theta}{\longmapsto} s_2'$ in $ia(A)$.

Lemma 8.4.1 The contract transformation given by rule DEP satisfies that:

• Every adaptor compliant to $c_E$ is also compliant to $c$.

• For every trace of $c_E$ such as $s_0 \xrightarrow{v_0}_{c_E} \ldots\ s_i \xrightarrow{v_i}_{c_E} s_{i+1} \xrightarrow{v_{i+1}}_{c_E}$ it holds that

\[ pm^{\wedge}(v_{i+1}) \subseteq Dom(\kappa_E) \cup \bigcup_{j=0}^{j=i} pm_f(v_j) \cup pm^{*}(v_j) \]

Proof The proof is trivial following rule DEP, which makes data dependencies explicit, and the definition of adaptors (Definition 2.3.2).

Proposition 8.4.2 Function prune is independent of the pruning order. More formally, given two interfaces $S$ and $I$, and two transitions $t_1$ and $t_2$ in $I$, we have:

\[ prune(S, prune(S, I, t_1), t_2) = prune(S, prune(S, I, t_2), t_1) \]

Proof Let $I_1 = prune(S, I, t_1)$ and $I_2 = prune(S, I, t_2)$, $T_1$ and $T_2$ their corresponding transition relations, and $P = prunable(S, I)$. If $I_1$ is undefined ($\bot$), then the result is trivial, if we consider $prune(S, \bot, t) = \bot$, because if there is no $t' \in P$ satisfying the pruning condition for $t_1$ in $I$, neither will such a transition exist in $prunable(S, I_2)$, since $prunable(S, I_2) \subseteq P$. The symmetrical reasoning could be made if $I_2 = \bot$. Thus, we can suppose both $I_1$ and $I_2$ are not undefined.

Let us consider $t \in prune(S, I_1, t_2)$. If $t \notin T_2$, there would exist $t' \in P$ such that a trace exists, $\tilde{u} = \cdots t' \cdot \tilde{t} \cdot t_2 \cdots \in O[I]$, where $\tilde{t} \cap P = \emptyset$ and $t$ is included in $\tilde{t}$ or after $t_2$. As $t \in T_1$, $\tilde{u} \in O[I_1]$. Now, if we consider $P_1 = prunable(S, I_1)$, we have two alternatives:

(i) $t' \in P_1$. Then, $t \notin prune(S, I_1, t_2)$.

(ii) $t' \notin P_1$. Clearly, $\tilde{t} \cap P_1 = \emptyset$ because $P_1 \subseteq P$, and $\tilde{t} \cap P = \emptyset$. Therefore, one of the following conditions holds:

(a) There exists $t''$ preceding $t'$ in $\tilde{u}$ such that $t'' \in P_1$, and then transition $t \notin prune(S, I_1, t_2)$.

(b) Otherwise, $prune(S, I_1, t_2)$ is undefined, and trivially, $t \notin prune(S, I_1, t_2)$.

Thus, we have proved that $t \in prune(S, I_1, t_2)$ implies $t \in T_2$. Let us suppose that $t \notin prune(S, I_2, t_1)$. Then, there exists a transition $t' \in P_2 = prunable(S, I_2)$ such that a trace exists, $\tilde{u} = \cdots t' \cdot \tilde{t} \cdot t_1 \cdots \in O[I_2]$, where $\tilde{t} \cap P_2 = \emptyset$ and $t$ is after $t'$ (i.e., it has been pruned). If $t$ is after $t_1$, then $t \notin T_1$ because $t' \in P_1 \subseteq P$ (that is, it was already pruned by $I_1$), and $t \notin prune(S, I_1, t_2)$, such as it was assumed initially. Therefore, $\tilde{t} = \tilde{u}_1 \cdot t \cdot \tilde{u}_2$. But, as we have already mentioned, by hypothesis, $t \in T_1$, and then $\tilde{u}_1 \cap P = \emptyset$. Thus, to prune $t_1$ in $I_2$, there must exist $t'' \in P$ after $t$, i.e. such that $\tilde{u}_2 = \tilde{u}_2' \, t'' \, \tilde{u}_2''$, with $\tilde{u}_2'' \cap P = \emptyset$. But $\tilde{t} \cap P_2 = \emptyset$, hence $t'' \notin P_2$, and the only way to have a transition $t''$ prunable in $I$ but not prunable in $I_2$ is because (see the definition of the prunable mapping) some branch previous to $t''$, and reaching $t_2$, has been pruned in $T_2$. As $t$ is still in $T_2$, we can find $t''' \in P$ in $\tilde{u}_1 \cdot t \cdot \tilde{u}_2'$ such that a trace $\tilde{u}' = \cdots t' \cdot \tilde{u}_1 \cdot t \cdot \tilde{t}_1 \cdot t''' \cdot \tilde{t}_2 \cdot t_2 \cdots \in O[I]$ exists, coinciding with $\tilde{u}$ until $\tilde{t}_1$, satisfying $\tilde{t}_2 \cap P = \emptyset$. If $\tilde{u}' \notin T_1$, $t \notin prune(S, I_1, t_2)$; then $\tilde{u}' \in T_1$. Taking into account that transitions after $t''$ in $\tilde{u}$ are not in $T_1$, we have that $t'$ (or some previous transition) is in $P_1$ and $\tilde{t} \cap P_1 = \emptyset$ (if they are not, by considering the definition of the prunable mapping, $t' \notin P_2$ or $\tilde{t} \cap P_2 \neq \emptyset$, which are the case). Thus, we conclude that there exists $t' \in P_1$ such that $\tilde{u}' = \cdots t' \cdot \tilde{t}' \cdot t_2 \cdots \in O[I_1]$ with $\tilde{t}' \cap P_1 = \emptyset$, and $t$ is included in $\tilde{t}'$. In fact, this is true for $\tilde{t}' = \tilde{u}_1 \cdot t \cdot \tilde{t}_1 \cdot t''' \cdot \tilde{t}_2$, because $\tilde{u}_1 \cdot t \cdot \tilde{t}_1 \cap P_1 = \emptyset$ (since $\tilde{u}_1 \cdot t \cdot \tilde{t}_1$ is a subtrace of $\tilde{t}$), and also $\tilde{t}_2 \cap P_1 = \emptyset$ (since $P_1 \subseteq P$ and $\tilde{t}_2 \cap P = \emptyset$). However, at this point, we get a contradiction because $t \in prune(S, I_1, t_2)$.

Lemma 8.4.3 For any service interface $S$ and any transition $t$ of an adaptor $A$, if $prune(S, ia(A), t) \neq \bot$ then

\[ ia^{-1}_A(prune(S, ia(A), t)) \preceq A \]

and $prune(S, ia(A), t)$ is deterministic.

Proof We know that for every $S$, $I$ and $t$ such that $prune(S, I, t) \neq \bot$ it happens that $prune(S, I, t) \preceq I$, because function prune only removes transitions from $I$. Then, by applying Proposition 8.2.2 we obtain $ia^{-1}_A(prune(S, ia(A), t)) \preceq A$. For the same reason (transitions in the pruned interface are a subset of transitions in $ia(A)$), $prune(S, ia(A), t)$ is also deterministic because $ia(A)$ is deterministic.

Theorem 8.4.4 Given a contract $c$, the iterative pruning process for a certain interface $S$, providing the sequence of interfaces $\{I_i\}_{i=0 \ldots n}$ (with $I_0 = ia(A^c)$), satisfies that if $I_n \neq \bot$ then $ia^{-1}_{A^c}(I_n)$ is an adaptor for services $S$ compliant with contract $c$.

Proof Given an interface $S$ and a contract $c$, if we consider the interface generated in each step $i$ of the pruning process, $I_i = prune(S, I_{i-1}, t)$, we can prove that it satisfies

\[ ia^{-1}_{A^c}(prune(S, I_i, t)) \preceq A^c \quad \text{and} \quad I_i \preceq ia(A^c) \]

if $I_i \neq \bot$ ($i = 0, \ldots, n-1$), being $I_0 = ia(A^c)$. In fact, by Lemma 8.4.3, this is true for $n = 0$. If we proceed by induction on $i$, and we assume as inductive hypothesis $ia^{-1}_{A^c}(prune(S, I_{i-1}, t)) \preceq A^c$ and $I_{i-1} \preceq ia(A^c)$, then we can derive the result by applying Proposition 8.2.2. Thus, the resulting adaptor (if it is not empty) at the end of the process is still compliant with contract $c$. Additionally, every deadlock avoidable by the adaptor is removed by the pruning process, which does not generate new deadlocks, and this process converges because prune is a monotonically decreasing function (w.r.t. transitions in the adaptor).

Proposition 8.5.1 Given a system $S$ and an agent $X_\phi$ where $\phi$ is finite and $Sort(S \parallel X) \subseteq L$, then if $m$ is an initial message we have:

\[ (S \parallel X_\phi) \backslash L \models \exists \gamma .\ m \in K^{\phi}_{X,\gamma} \quad \text{iff} \quad X_\phi \models \exists \gamma .\ m \in K^{\phi}_{X,\gamma} /\!/ S. \]

Proof This result corresponds to Proposition 4.3 in [Mar03].

Proposition 8.5.2 Consider the formula $F = \exists \gamma .\ m \in K^{\phi}_{X,\gamma} /\!/ S$. Then it is decidable whether or not a model $X$ of such a formula exists.

Proof We prove the thesis by structural induction on $S$ and $F$; furthermore, if $F$ is satisfiable, we construct a model $X_F$ for such a formula.

• $F = \mathbf{F}$. Then, no sequential agent models $F$, and thus $F$ is not satisfiable.

• $F = \mathbf{T}$. Then, every sequential agent models $F$. Let $(X_F)_\phi$ be $(0)_\phi$.

• Then $F$ is the disjunction of several formulas, say $F = F_1 \vee F_2 \vee F_3 \vee F_4 \vee F_5$. Each of these formulas corresponds to a behaviour of $X$, as it follows from the partial model checking table. At least one of these formulas must be satisfiable if $F$ is satisfiable. For each satisfiable formula $F_i$, with $i = 1..5$, we build a process $X_i$, and as $X_F$ we consider the summation of these processes.

– $F_1 = \bigvee_{(c, m', S') \in Send(S)} \langle c!m' \rangle (\exists \gamma .\ m \in K^{\phi}_{X,\gamma} /\!/ S')$.

∗ Consider the set of formulas $F_{1,S'} = (\exists \gamma .\ m \in K^{\phi}_{X,\gamma} /\!/ S')$ that are satisfiable and consider for each of them the corresponding synthesised process $X_{F_{1,S'}}$ (by structural induction on $S$ this must hold). This set cannot be empty, otherwise $F_1$ is not satisfiable. For any $c, m'$ there is at most one $S'$ s.t. $(c, m', S') \in Send(S)$ s.t. $X_{F_{1,S'}}$ is a synthesised model. Then let $X_{1,S'}$ be $(p.c!x.X'_{F_{1,S'}})_\phi$, where: 1) $p$ is a proof of $m$ from $\phi$ whose root is an assignment to the variable $x$; 2) $x$ is a variable that does not appear in $X_{F_{1,S'}}$; 3) $X'_{F_{1,S'}}$ is the term $X_{F_{1,S'}}$ where $m$ is replaced with $x$. We then consider as $X_1$ the summation of all these processes.

– $F_2 = \bigvee_{S \xrightarrow{c!m'} S'} \langle c?m' \rangle (\exists \gamma .\ m \in K^{\phi \cup \{m'\}}_{X,\gamma} /\!/ S')$

∗ Fix $c$ s.t. there exist $m', S'$ with $(\exists \gamma .\ m \in K^{\phi \cup \{m'\}}_{X,\gamma} /\!/ S')$ satisfiable. By induction we can find $X_{F_{2,m',S'}}$ that satisfy that formula. Consider a variable $x$ that is not present in any of these processes. Then, let $X_{2,c'}$ be $c?x : T.Y$, where $Y$ is the summation of summands of the form $[x = m'] X'_{F_{2,m'.S'}}$, where $X'_{2,m'.S'}$ is $X_{2,m'.S'}$ where $m'$ is replaced with $x$ (assuming $x$ is fresh). Eventually, $X_2$ is the summation on all $c$ for which a satisfiable formula exists.

– $F_3 = \bigvee_{S \xrightarrow{\tau_{c,m'}} S'} \langle \chi_{c,m'} \rangle (\exists \gamma .\ m \in K^{\phi \cup \{m'\}}_{X,\gamma} /\!/ S')$

∗ This case proceeds analogously to $F_2$.

– $F_4 = \bigvee_{S \xrightarrow{\tau_{c,m'}, \tau} S'} \exists \gamma .\ m \in K^{\phi}_{\gamma} /\!/ S'$

∗ This case is subsumed by $F_3$ where the eavesdropped message is actually included in the knowledge of $X$.

– $F_5 = m \in K^{\phi}_{X,\varepsilon} /\!/ S$. This is the base case.

Proposition 8.5.3 Given a system $S$ and a finite knowledge $\phi$, then for any attacker $X_\phi$ such that

\[ (S \parallel X_\phi) \backslash L \models F \]

where $L \supseteq Sort(S \parallel X_\phi)$, $F = \exists \gamma .\ m \in K^{\phi}_{X,\gamma} /\!/ S$ and $\gamma'$ is such that all the following hold:

• $\gamma' = ((S \parallel X) \backslash L \overset{\gamma}{\Longrightarrow} S_1) \downarrow_X$,

• $m \in D(\phi \cup msgs(\gamma'))$ and

• $m \notin D(\phi \cup msgs(\gamma''))$ for any $\gamma''$ strict prefix of $\gamma'$

then $X_F$ (i.e., the process obtained from the proof of Proposition 8.5.2) also satisfies

\[ ((S \parallel X_F) \backslash L \overset{\gamma}{\Longrightarrow} S_2) \downarrow_{X_F} = \gamma' \]

Proof The proof is done by induction on the length of $\gamma$.

• $\gamma$ is empty. Then also $(X_F)_\phi$ is s.t. $m \in D(\phi)$.

• $\gamma = \alpha \gamma_1$. We can then investigate the nature of the action $\alpha$.

– $\alpha = \tau$. Then, $((S \parallel X) \backslash L \xrightarrow{\tau} (S' \parallel X) \backslash L \overset{\gamma_1}{\Longrightarrow} S_1) \downarrow_X = \gamma'$ and $m \in D(\phi \cup msgs(\gamma'))$ and $m \notin D(\phi \cup msgs(\gamma''))$, for any $\gamma''$ strict prefix of $\gamma'$. By structural induction on $\gamma_1$ we know that also $X_{F'}$ can perform the same sequence $\gamma'$, where $F' = \exists \gamma .\ m \in K^{\phi}_{X,\gamma} /\!/ S'$. Since $X_{F'}$ would be one summand of $X_F$, the result follows.

– $\alpha = \tau_{c!m'}$. We can have four cases:

∗ The action is due to a sending from $X$ and a reception from $S$. Then it must be that $m' \in D(\phi)$ and $(S, m', S') \in Send(S)$. It means that $((S \parallel X) \backslash L \xrightarrow{\tau_{c,m'}} (S' \parallel X') \backslash L \overset{\gamma_1}{\Longrightarrow} S_1) \downarrow_X = (c!m')\gamma'$ and $m \in D(\phi \cup msgs(\gamma'))$ and $m \notin D(\phi \cup msgs(\gamma''))$, for any $\gamma''$ strict prefix of $(c!m')\gamma'$. By structural induction on $\gamma_1$ we know that also $X_{F'}$ ($X_{F_{1,S'}}$ in the terminology of Prop. 8.5.2) can perform the same sequence $\gamma'$, where $F' = \exists \gamma .\ m \in K^{\phi}_{X,\gamma} /\!/ S'$. Since $(p.c!x.X'_{F_{1,S'}})_\phi$, where: 1) $p$ is a proof of $m'$ from $\phi$ and $X'_{F_{1,S'}}$ is the term $X_{F_{1,S'}}$, i.e. $X_{F'}$, where $m'$ is replaced with $x$, would be one summand of $X_F$, the result follows.

∗ The action is due to a receiving from $X$ and a sending from $S$. Then it means that $S \xrightarrow{c!m'} S'$ and $((S \parallel X) \backslash L \xrightarrow{\tau_{c,m'}} (S' \parallel X') \backslash L \overset{\gamma_1}{\Longrightarrow} S_1) \downarrow_X = (c?m')\gamma'$ and $m \in D(\phi \cup \{m'\} \cup msgs(\gamma'))$ and $m \notin D(\phi \cup msgs(\gamma''))$, for any $\gamma''$ strict prefix of $(c?m')\gamma'$. By structural induction on $\gamma_1$ we know that also $X_{F'}$ ($X_{F_{2,m',S'}}$ in the terminology of Prop. 8.5.2) can perform the same sequence $\gamma'$, where $F' = \exists \gamma .\ m \in K^{\phi \cup \{m'\}}_{X,\gamma} /\!/ S'$. Since $c?x : T.(\ldots + ([x = m'] X'_{F_{2,m'.S'}}) + \ldots)$, where $X'_{2,m'.S'}$ is $X_{2,m'.S'}$ where $m'$ is replaced with $x$ (assuming $x$ is fresh), would be one summand of $X_F$, the result follows.

∗ The action is an internal synchronisation of $S$ and it is eavesdropped by $X$. This is similar to the previous case.

∗ The action is an internal synchronisation of $S$ and it is not eavesdropped by $X$. Similar to the case with $\tau$.

Theorem 8.6.1 Given a contract $c$, a process $S$, and a secrecy property $p_{sec}$ (restricted to alphabets $L$ and $L'$), the iterative refinement procedure which provides a sequence of interfaces $\{R_i\}_{i=0 \ldots n}$ (with $R_0 = ia(c[ip(S)])$), satisfies that if $R_n \neq \bot$ then $A = ia^{-1}_{A^c}(R_n)$ is an adaptor for services $ip(S)$ compliant with contract $c$ such that property $p_{sec}$ is preserved for $L$ and $L'$, i.e.:

\[ \nexists X .\ P \backslash L'' \models p_{sec} \quad \text{for the system} \quad P = ([\![A]\!]_E \parallel S) \backslash\backslash L' \backslash L \parallel X_\phi \]

where $E$ is the environment of contract $c$, $L'' \supseteq Sort(P)$, and $\phi$ is the initial knowledge of the attacker.

Proof $c[ip(S)]$ is the adaptor resulting from applying the iterative pruning process in Section 8.4. Then, by Theorem 8.4.4 we have that $R_0 \preceq A^c$. Thus, if we proceed as in Theorem 8.4.4, we can prove that $A = ia^{-1}_{A^c}(R_n)$ is still an adaptor for services $ip(S)$ compliant with contract $c$. This is because the nature of the transitions pruned by mapping prune was not used to reason about the resulting interfaces; that is, it does not matter whether the pruned transition exhibited a deadlock situation when interacting with $ip(S)$ or permitted an attack from some $X$. On the other hand, by construction $[\![A]\!]_E$ preserves the property $p_{sec}$ as stated; in fact, for each step of the refinement process, if $([\![ia^{-1}_{A^c}(R_i)]\!]_E \parallel S) \backslash\backslash L' \backslash L$ presents a trace which allows an attack, the transition in $R_i$ corresponding to the last interaction is pruned. In addition, by Proposition 8.5.3, we know that the avoided attacker is the most general one; and therefore, by disabling it (by successively pruning vulnerable traces), we disable any other possible attack.


Always be wary of any helpful item that weighs less than its operating manual. T. Pratchett

B

Tool support

B.1 STS-XML format

STS-XML is the format used by the ITACA toolbox to symbolically describe service interfaces and adaptation contracts. Hence, it is the principal input format of most of the applications in the toolbox.

B.1.1 Service interfaces

This format is used to symbolically describe the behaviour of the services. Symbolic parameters are encoded into operation names. As an example we will use the STS-XML description of the receiving SPIN node shown in Listing B.1. This service was used and described in detail in Section 4.1, concretely in Fig. 4.3(a).

For a generic service interface $\langle \Sigma_i, S, s_0, F, T \rangle$ we have the following conversion.

1. The main XML element is (line 1). This element contains the following elements: and .

2. For every α ∈ Σi it must be given an identifier stored within a element. Additionally, it must be created a within the element specifying the direction of the action, if it is not an internal transition. For instance, α = c:adv? has been encoded into lines 4 and 10.

3. Every state s ∈ S must be encoded as an element with the appropriate attributes depending on whether it is the initial state (s = s0) or a final state (s ∈ F). Line 16 represents the initial state of the protocol.


[Listing B.1: STS-XML code describing a receiving SPIN node]

4. Finally, for every transition in T there must be a element specifying its label, source and target state. For instance, a transition outgoing from the initial state and labelled with c:adv? is described in line 24.
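The four conversion steps above all operate on the same abstract tuple (labels, states, initial state, final states, transitions). The following is a minimal in-memory sketch of that tuple, not the STS-XML schema itself: the class and field names are assumptions made for illustration, and so are the state names `s0` and `s1` and the choice of `s1` as final state. Only the label `c:adv?` comes from the text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Transition:
    source: str
    label: str
    target: str


@dataclass
class ServiceInterface:
    labels: set
    states: set
    initial: str
    finals: set
    transitions: list = field(default_factory=list)

    def problems(self):
        """Return a list of consistency problems (empty when well formed)."""
        found = []
        if self.initial not in self.states:
            found.append("initial state is not declared")
        if not self.finals <= self.states:
            found.append("some final state is not declared")
        for t in self.transitions:
            if t.source not in self.states or t.target not in self.states:
                found.append(f"undeclared state in {t}")
            if t.label not in self.labels:
                found.append(f"undeclared label in {t}")
        return found


# Hypothetical two-state fragment with the transition labelled c:adv?.
receiver = ServiceInterface(
    labels={"c:adv?"},
    states={"s0", "s1"},
    initial="s0",
    finals={"s1"},
    transitions=[Transition("s0", "c:adv?", "s1")],
)
```

Calling `receiver.problems()` returns an empty list for this fragment; a transition that uses an undeclared label or state would be reported instead.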


B.1.2 Adaptation contracts

Adaptation contracts are encoded in a similar fashion into STS-XML files. We show in Listing B.2 an excerpt of a contract encoded in STS-XML.

For a generic contract $\langle \Sigma^c, S^c, s^c_0, F^c, T^c \rangle$ the transformation is as follows.

1. The STS-XML description has the main element .

2. Every adaptation vector in Σc is encoded as a new element within the element. For instance, vectors VendA = a:end! ♦ and V1 = c:req! ♦ b:interest? are described by lines 3 and 12, respectively.

3. Then we have a element with the and stored within.

4. States are encoded in the same way as service interfaces (e.g., line 20).

5. Transitions also respect the syntax presented in the STS-XML encoding of service interfaces, since both structures are FSMs. As an example, line 29 represents a transition labelled with vector V1.
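The textual form of the vectors quoted above can be split into its parts with a few lines of code. This is only a sketch: the helper names `parse_action` and `parse_vector` are hypothetical, and it assumes that every action has the shape `service:operation` followed by `!` or `?`, and that `♦` separates the actions of a vector, possibly with an empty side as in VendA.

```python
def parse_action(text):
    """Split 'service:operation!' or 'service:operation?' into its parts."""
    service, rest = text.split(":", 1)
    direction = rest[-1]
    if direction not in ("!", "?"):
        raise ValueError(f"missing direction in {text!r}")
    return service, rest[:-1], direction


def parse_vector(text, separator="♦"):
    """Parse a vector such as 'c:req! ♦ b:interest?' into a list of actions."""
    parts = [part.strip() for part in text.split(separator)]
    # Empty sides (as in 'a:end! ♦') contribute no action.
    return [parse_action(part) for part in parts if part]
```

For instance, `parse_vector("c:req! ♦ b:interest?")` yields one triple per service involved in the vector, while the one-sided vector `a:end! ♦` yields a single triple.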

B.2 Dinapter

B.2.1 Requirements and installation

Dinapter is open source (GPLv3) and publicly available as a GitHub project at: https://github.com/jamartinb/dinapter

The core of Dinapter is implemented in Java, it uses the Jess engine for expert systems and it has a Python script on top. Therefore, it requires the following libraries:

JPype (http://jpype.sourceforge.net/), to bridge between Python and Java.

Jess (http://herzberg.ca.sandia.gov/), the expert system engine used by Dinapter.

Apache Ant (http://ant.apache.org/), in order to install the trial version of Jess and compile the Java classes.

• ant get-jess - Downloads and sets up the trial version of Jess.


[Listing B.2: An adaptation contract encoded in STS-XML]

Figure B.1: Dinapter running inside ACIDE

• ant dist - Compiles, packages and sets up the Java classes of Dinapter.

Java SDK 1.6 (6) or above (http://java.com/).

Finally, the constant JVM_LIB in the main file dinapter.py must be updated with the appropriate path to the Java library file libjvm.so. You can test the installation to check that everything is working properly by executing:

dinapter.py -t
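Finding the right value for JVM_LIB usually means locating libjvm.so inside the Java installation. The helper below is a small sketch for that search; `find_libjvm` is a hypothetical name and is not part of Dinapter, and the directory to search (for example the value of the JAVA_HOME environment variable, if it is set) is an assumption about the local setup.

```python
import os


def find_libjvm(java_home):
    """Return the path of the first libjvm.so found below java_home, or None."""
    for root, _dirs, files in os.walk(java_home):
        if "libjvm.so" in files:
            return os.path.join(root, "libjvm.so")
    return None


if __name__ == "__main__":
    # JAVA_HOME is only a common convention; adjust the directory as needed.
    home = os.environ.get("JAVA_HOME")
    print(find_libjvm(home) if home else "JAVA_HOME is not set")
```

The printed path, if any, is a candidate value for the JVM_LIB constant in dinapter.py.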

B.2.2 Within ACIDE

Dinapter can be used within the ACIDE environment or as a standalone application. Figure B.1 shows the interface of ACIDE. Once the behaviours of two services described in STS are imported into ACIDE, one should be set as service "A" and the other as service "B" using the respective buttons in the bar. Then we can start Dinapter with the given two services by pressing the button depicting a wand. Dinapter then runs and generates a first set of correct adaptation contracts with the same heuristic value. These contracts are listed in a new window where the vectors can be visualised prior to their application to the main ACIDE window, using the arrow button. If we want Dinapter to discard the current contracts and generate others with worse heuristic values, then we have to press the "More contracts" button. Once we are satisfied with the contract, we can proceed to "Close" the window of Dinapter.

B.2.3 Standalone

Besides ACIDE, Dinapter can be executed as a standalone application. All the functionality is accessed through the Python script dinapter.py, which accepts the following modes.

dinapter.py [ -t | -h | ]

-t executes the self-test.

-h shows the manual.

, instructs Dinapter to generate adaptation contracts for the given services encoded in STS-XML. Once it finds the first set of contracts, it shows them and asks whether to continue generating more contracts or to exit the execution.

B.3 Dynamic adaptors

Learning adaptors are implemented as a library within ITACA and published at https://github.com/jamartinb/adaptor as a standalone application. The learning process is simulated, and hence the behavioural description of the services encoded in STS-XML is needed. The command returns the set of traces allowed by the learning adaptor after it has converged or it has reached the given limit of simulated traces, whichever happens first. These traces are given in two files: one which stores the traces in a raw format, and a second which represents them in the DOT format (see http://www.graphviz.org/). Additionally, the command can write to a file some statistics of the performance of the learning adaptor, such as the number of inhibited traces, the success and failure rates, the amount of allowed traces, and the sample standard deviation of those data.

Specifically, the command respects the following grammar:

adaptor.py [-h] -c C [-t] [-p P] [-l L] [--ter TER] [-i I] [-n N] [-s F] [--times T] [--sthr THR] [--dthr DTHR] [--dter DTER] [--athr ATHR] S [S ...]

The different arguments have the following meanings:

S are the files which contain the STS-XML description of the services. This behavioural information is used by the simulator, not learning adaptors, since these are not aware of the behaviour of the services

-h, - -help show the help information of the command -c C, - -contract C specifies the adaptation contract contained within an XML file C -t executes the unit-tests of learning adaptors -p P, - -prefix P the prefix P of the output files. These fail will contain the traces allowed by the learning adaptor and the behaviour of the services encoded in DOT. These files are not created if no prefix is given

-l L, - -limit L specifies in L the depth limit during training and simulation, default = 20 - -ter TER sets the transition error rate, TER ∈ [0, 1] or, in other words, the probability to force a failure in each synchronisation, default = 0 - -sthr THR stablishes with THR the static threshold of inhibited traces, β . The adaptor will forget the oldest inhibited traces to always remain within the treshold. The default value is infinity

--dthr DTHR sets in DTHR the minimum value of the dynamic threshold β. This threshold starts equal to DTHR, is incremented by one on failed traces and is decremented by one on successful traces. This option excludes --sthr

-i I is the maximum number of training iterations. Each of these iterations corresponds to a complete traversal of the whole behaviour of the services. If it is not set, training continues until nothing is learnt in the last iteration

-n N for statistical purposes, N represents how many samples we want or, in other words, how many times the experiment is restarted to gather statistics


-s F, --stats F sets the file where the statistics will be written to

--dter DTER is a list of at least I values of TER, one for each iteration

--times T trains using T randomly generated traces. This changes the meaning of the iterations, since they now correspond to these T traces and not to a complete traversal

--athr ATHR enables adaptive adaptation (see Section 4.4) with a minimum threshold of ATHR. This option excludes --dthr and --sthr

The DOT files generated by adaptor.py can be compiled into images. As an example, the adaptor depicted in Fig. 4.4 was directly generated from a DOT file. Additionally, the traces simulated between adaptors and services are also returned in a DOT file, as shown in Fig. B.2. Filled nodes are successful nodes where the adaptor and the services reach a final state.
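The dynamic threshold described for the --dthr option can be illustrated with a small sketch. This is not the code of adaptor.py: the class name, the method names and the decision to record the failed trace itself as the inhibited one are assumptions made for illustration, and the "forget the oldest inhibited traces" policy is borrowed from the description of --sthr.

```python
from collections import deque


class InhibitedTraces:
    """Hypothetical bounded memory of inhibited traces with a dynamic threshold."""

    def __init__(self, dthr):
        self.minimum = dthr      # DTHR: lower bound of the threshold
        self.threshold = dthr    # current value of beta, starts at DTHR
        self.traces = deque()    # oldest inhibited trace on the left

    def _trim(self):
        # Forget the oldest inhibited traces to remain within the threshold.
        while len(self.traces) > self.threshold:
            self.traces.popleft()

    def on_failure(self, trace):
        # A failed trace raises the threshold by one and is remembered (assumption).
        self.threshold += 1
        self.traces.append(trace)
        self._trim()

    def on_success(self):
        # A successful trace lowers the threshold by one, never below DTHR.
        self.threshold = max(self.minimum, self.threshold - 1)
        self._trim()
```

Under these assumptions, a run of failures lets the adaptor remember more inhibited traces, while a run of successes shrinks its memory back towards DTHR.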

[Figure B.2: graph rendered from a DOT file, not reproduced here. Its edges are labelled with the actions user, pass, login, connected, request, getTicket, getTicketDB, appointment, accepted, approved, rejected, denied, quit, endC, endS, finish and TAU.]

Figure B.2: Traces simulated between services and a learning adaptor

C

Any sufficiently advanced technology is indistinguishable from magic. A.C. Clarke

Secure synthesis using ITACA

C.1 Convert security contracts into deadlock-equivalent behavioural contracts

As seen in Section 8.4.1, the data dependencies among the symbolic parameters in a security contract $c$ can be explicitly included in the contract $c_E$. Additionally, by Theorem 7.2.1, we can avoid contract terms altogether in order to synthesise functionally correct adaptors. So, if we want to use traditional approaches to adaptor synthesis (where neither security nor symbolic parameters are supported), we can include the interface information in the channel of the actions. Therefore, we have to transform the contract and the service interfaces into another contract and service interfaces without security, in such a way that the transformation can be reversed once the adaptor is synthesised. More formally, the contract transformation is defined in Fig. C.1(b). The transformation of single-sided vectors is omitted. Symbol ‘?!’ represents either ‘?’ or ‘!’. We use the calligraphic letter $\mathcal{C}$ to denote the equivalent contract without security and, similarly, for the state machine which it imposes on the adaptor, i.e., $A_{\mathcal{C}}$, with $\mathcal{A}$ being the synthesised adaptor. We proceed analogously for services, i.e., for every $s_1 \xrightarrow{c?!\,T} s_2$ we have $s_1 \xrightarrow{c:[T]\,?!} s_2$. Let us highlight that the obtained services and contract without security are just particular cases of our service interfaces and contracts, therefore we can use our previous results. We can now use any compatible approach to the synthesis of behavioural adaptors [CPS08, MPS08, BCP04, CMS+ 09a, Pad09] to generate an adaptor without security $\mathcal{A}$ such that it complies with $\mathcal{C}$ and the securityless services. We will follow the procedure depicted in Fig. C.1(a).

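[Figure C.1, diagram and inference rules not reproduced here. The recoverable content of each panel is summarised below.]

(a) Contract transformations. The diagram relates the security contract $c$, the state machine $A_c$ characterising $c$ and any adaptor $A$ such that $A \preceq A_c$ to the contract without security $\mathcal{C}$, the state machine $A_{\mathcal{C}}$ characterising $\mathcal{C}$ and any adaptor $\mathcal{A}$ such that $\mathcal{A} \preceq A_{\mathcal{C}}$. $\textsc{Sec}^{-1}$ leads from $c$ to $\mathcal{C}$, and $\textsc{Sec}$ leads from $\mathcal{A}$ back to $A$.

(b) Rule ($\textsc{Sec}^{-1}$) to remove contract terms from security adaptation contracts. Every transition $s_1 \rightarrow_c s_2$ labelled with a vector over the actions $c?T$ and $c'!T'$ yields a transition $s_1 \rightarrow_{\mathcal{C}} s_2$ whose actions carry the terms, under substitution $\theta$, in the channel names $c{:}[T]$ and $c'{:}[T']$.

(c) Rule ($\textsc{Sec}$) to include the contract terms back into the synthesised adaptor. A transition $s_3 \rightarrow_{\mathcal{A}} s_4$ labelled $c{:}[T]\,?!$, together with the corresponding transitions $s_1 \rightarrow_{A_c} s_2$ (labelled $c?!\,T$) and $s_1 \rightarrow_{A_{\mathcal{C}}} s_2$ (labelled $c{:}[T]\,?!$), yields the transition $\langle s_1, s_3 \rangle \xrightarrow{c?!\,T}_{A} \langle s_2, s_4 \rangle$.

Figure C.1: Rules to remove and include security into adaptation contracts.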

C.2 Recovering security adaptors

The transformation from $c$ to $\mathcal{C}$ (and, analogously, from $A_c$ to $A_{\mathcal{C}}$) is reversible because both state machines are deterministic and use the same set of states. However, the synthesised security-free adaptor $\mathcal{A}$ might not use the same set of states so, in order to obtain its corresponding adaptor with security ($A$), we need to define a procedure to undo the transformation. The rule in Fig. C.1(c) returns adaptor $A$ using the equivalence relation between $A_c$ and $A_{\mathcal{C}}$ while simulating $\mathcal{A}$. The initial state of $A$ is the one which corresponds to the initial states of $A_c$ and $A_{\mathcal{C}}$. The final states of $A$ are those which correspond to final states of $\mathcal{A}$.


“NOBODY expects the Spanish Inquisition! Our chief weapon is surprise... surprise and fear... fear and surprise... Our two weapons are fear and surprise... and ruthless efficiency... Our three weapons are fear, surprise, and ruthless efficiency... and an almost fanatical devotion to the Pope... Our four... no... Amongst our weapons... Amongst our weaponry... are such elements as fear, surprise... I’ll come in again.”

D

Adaptation of software services

M. Python

Services are a fundamental part of the Internet as we know it today. These services, in turn, access multiple other services at the same time in order to provide an integrated, value-added offering. It is therefore especially important that these services are interoperable and that each of them can access the interface offered by the others without problems. We propose the use of software adaptors, summarised in Section D.1, to improve the interoperability of services, their design and their reuse. Finally, in Section D.5, we present the conclusions obtained during the thesis.

D.1 Summary of the thesis

The Internet is made of services: Web pages, social networks, Web applications, games, e-government, e-commerce... Some of these services work autonomously, but the truly innovative ones, those that are changing our daily lives, are eminently interoperable. Facebook offers games developed and deployed by third parties, you can post a tweet from almost any page, and any e-commerce service involves payment, credit, purchase, shipping and support services, often supplied by different providers. It is vitally important that this cloud of services, each developed by a different team and under different standards, integrates seamlessly: that the payment service does not charge you twice because it misunderstood the purchase service, or that the purchase service does not forget to warn you about the customs surcharges of the courier service. Services have to be fully aware of the requirements, conditions and functionality of their peers, and complement each other perfectly.

Unfortunately, this is not always the case. Parcels get lost and no service notifies you, you are unexpectedly charged customs duties, the GPS of the rental car gives distances in miles instead of adapting to kilometres, your Flickr photos are hard to transfer to Picasa Web, and your Facebook posts and friends cannot be exported to any other service. Moreover, when new services are designed, they have to be built according to the interfaces of the other services they access. If the latter are modified, the service we have developed stops working and has to be updated to the new circumstances.

Service-oriented architectures (SOAs) are composed of interoperable Web services (WSs). However, WSs are not always compatible, which hampers their reusability as well as the development and maintenance of SOA systems. To address this we propose the use of software adaptors [YS97], which coordinate different, initially incompatible services so that you achieve your goal regardless of the differences between those services. Software adaptors are themselves services whose task is to coordinate the other services of the system. In other words, if we want to design a service that has to access, for example, Google Maps and Amazon, we design it against a generic API and leave it to the adaptor to translate that API into those of Google and Amazon. This eases development and, in addition, if the API of any of these services later changes, or we want to replace Amazon with eBay, the adaptor takes care of the change without our service having to be modified.

During the development of a service, when we access third-party services, we take their signature into account (i.e., the operations and arguments of the services) but we tend to forget the implicit behaviour of those operations. For example, the user registration operation must be called before performing actions with the credentials of that user, and images must be selected before they can be edited or deleted. These are examples of behavioural dependencies between the operations of Web services. They are normally described only in the documentation of the service, in natural language, and only rarely published explicitly as BPEL processes or WF workflows. This is particularly important for stateful services, also called services with behaviour, which are normally described in BPEL or WF. Such services are especially prone to deadlocks, in which the services cannot continue their execution due to incompatibilities and differences in their possible sequences of actions.

[Figure D.1, diagram not reproduced here. Recoverable labels: Service Interfaces (Abstract BPEL+WSDL); Designer; Interactive Contract Specification + Simulation and Verification (ACIDE); Adaptation Contract.]

Figure D.1: Adaptation process: contract design and adaptor synthesis

Software adaptation [YS97] is a sound solution that allows WSs to cooperate regardless of their initial incompatibilities. By deploying adaptors in the middle of the communication, incompatibilities between services at the signature and behaviour levels can be solved effectively [BP06, CPS08]. Intuitively, an adaptor intercepts, receives, processes and modifies the messages exchanged between the services so that they fit the behaviour and signature expected by the recipient. However, designing adaptors is a difficult task in which the developer has to take into account the behaviour of all the services and their possible interactions. Subtle details can be overlooked in this process, which would result in an incorrect adaptation that exhibits deadlocks again. We propose the use of adaptation contracts, which are abstract specifications of the adaptation. These contracts make it possible to state clearly and simply how to overcome the incompatibilities between services without getting lost in the details of their behaviour.

This thesis has been implemented as part of ITACA, a tool suite that supports the workflow shown in Fig. D.1. This figure describes how, once the services to be adapted are known, an adaptation contract is designed. This contract is either generated automatically using a tool called Dinapter (Chapter 6) or designed with assistance using the ACIDE tool. Once the contract has been produced, an adaptor is synthesised that complies with the contract and enables the correct coordination of the services. Let us now formally define these contracts and their implications for adaptors.

D.2 Adaptation contracts

An adaptor is specified by means of an adaptation contract. This contract defines a set of correspondences between the operations and arguments of the services. Each of these correspondences is called an adaptation vector (or just vector, for short). Optionally, contracts can also include some high-level constraints on the order in which vectors are applied.

Definition D.2.1 (Adaptation contract) An adaptation contract $c$ is a finite state machine (FSM) $\langle \Sigma^c, S^c, s_0^c, F^c, T^c \rangle$ where $\Sigma^c$ is the set of vectors, $S^c$ is the set of states, $s_0^c \in S^c$ is the initial state, $F^c \subseteq S^c$ is the set of final states and $T^c \subseteq (S^c \times \Sigma^c \times S^c)$ is a set of labelled transitions. The vectors in $\Sigma^c$ have the form a ♦ b where:
• $a, b \in \Sigma_a$ are input and output communication actions,
• one side of the vector may be empty (viz., a ♦ or ♦ b),
• if both a and b are present, then one has to be an input action and the other an output action.

Actions contain an operation or channel name followed by a question mark or an exclamation mark for input or output actions, respectively. Finally, they may carry a (possibly empty) list of arguments. Adaptors act as mediators between two sides. Any communication between those sides must be intercepted and handled by the adaptor. An action on one side of a vector denotes the complementary action that the adaptor will execute towards the services on that side. For example, a vector such as a!args ♦ b?args' (where args and args' are lists of symbolic parameters) means that, if the adaptor receives from a service on the left side a request for operation a whose content matches args, then the adaptor will sooner or later have to send a request b with the content specified in args' to a service on the right side.
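Definition D.2.1 can be read as a small data model. The following is a minimal sketch of such a model, not the representation used in ITACA: the class and field names are hypothetical, states are plain strings, and rejecting a vector with both sides empty is an assumption (the definition only says that one side may be empty).

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Action:
    name: str                    # operation or channel name
    direction: str               # '?' for input, '!' for output
    args: Tuple[str, ...] = ()   # symbolic parameters, possibly empty

    def __post_init__(self):
        if self.direction not in ("?", "!"):
            raise ValueError("direction must be '?' or '!'")


@dataclass(frozen=True)
class Vector:
    left: Optional[Action] = None
    right: Optional[Action] = None

    def __post_init__(self):
        if self.left is None and self.right is None:
            raise ValueError("a vector needs at least one action (assumption)")
        if (self.left is not None and self.right is not None
                and self.left.direction == self.right.direction):
            raise ValueError("one side must be an input and the other an output")


@dataclass(frozen=True)
class Contract:
    vectors: FrozenSet[Vector]
    states: FrozenSet[str]
    initial: str
    finals: FrozenSet[str]
    transitions: FrozenSet[Tuple[str, Vector, str]]

    def __post_init__(self):
        # Check the side conditions of the tuple <Sigma, S, s0, F, T>.
        if self.initial not in self.states or not self.finals <= self.states:
            raise ValueError("initial and final states must belong to the states")
        for (source, vector, target) in self.transitions:
            if (source not in self.states or target not in self.states
                    or vector not in self.vectors):
                raise ValueError("transition outside S x Sigma x S")
```

For instance, `Vector(Action("a", "!", ("args",)), Action("b", "?", ("args'",)))` encodes the vector a!args ♦ b?args' discussed above.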

Table D.1: Notation for transitions and traces

$(a, l, b) \in T$   ≡   $a \xrightarrow{l} b$
$\{(a_0, \tau, a_1), (a_1, \tau, a_2), \ldots, (a_{n-1}, \tau, a_n)\} \subseteq T,\ n \geq 0$   ≡   $a_0 \xrightarrow{\tau}{}^{*} a_n$
$\{(a_0, \tau, a_1), (a_1, \tau, a_2), \ldots, (a_{n-1}, \tau, a_n)\} \subseteq T,\ n > 0$   ≡   $a_0 \xrightarrow{\tau}{}^{+} a_n$
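The two trace notations of Table D.1 correspond to the reflexive-transitive and the transitive closure of the internal transitions. A minimal sketch follows, assuming that transitions are stored as (source, label, target) triples and that the internal label is the string "tau"; both choices are illustrative only.

```python
def tau_star(transitions, start, tau="tau"):
    """States a_n such that start --tau-->* a_n (n >= 0, so start is included)."""
    reached, frontier = {start}, [start]
    while frontier:
        state = frontier.pop()
        for (source, label, target) in transitions:
            if source == state and label == tau and target not in reached:
                reached.add(target)
                frontier.append(target)
    return reached


def tau_plus(transitions, start, tau="tau"):
    """States a_n such that start --tau-->+ a_n (n > 0, at least one step)."""
    result = set()
    for (source, label, target) in transitions:
        if source == start and label == tau:
            result |= tau_star(transitions, target, tau)
    return result
```

The only difference between the two functions is whether the starting state counts without taking a step; it appears in `tau_plus` only when a cycle of internal transitions leads back to it.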

Every message received by the adaptor is matched against a vector. On matching, the symbolic parameters of the vector are updated with the content of the message and the state of the adaptor advances. For example, once vector a!args ♦ b?args' is triggered by a reception on a, the content of args is stored, args' is instantiated and a request b?args' is inserted into a queue of messages that will be sent before the adaptor finishes its execution. As soon as a service is ready to receive that message without causing deadlocks, the first message in the queue is delivered. The transition relation $T^c$ can impose constraints on the order in which vectors are triggered. In this way, $T^c$ makes it possible to state high-level policies on the communications such as, for example, “no more than three requests may be performed” or “every request must be followed by a confirmation”. Throughout the thesis we use the usual notation for generic transitions and traces, as described in Table D.1. The style of the arrow is changed (in symbol or subscript) whenever it is important to distinguish between different transition systems. Otherwise, we use the plain arrow ‘→’ by default.

Example D.2.1 The example of this chapter is based on a simplified meteorological system. We have three incompatible services with complementary functionality: a) a service that senses the temperature, which could be deployed in the sink of a network of temperature sensors; b) a monitoring service that records the information, which could be located on a laptop; and c) a humidity service, which can be deployed on the same infrastructure as the temperature sensor network. The signature of the services, i.e., the names of the operations and their arguments, is given in Table D.2.

The temperature service (service a) has: output actions user!usr and pass!psw to authenticate with its user name (an argument of type usr) and password (psw); an operation to notify the current temperature, i.e., upload!temp; two input actions to either deny the request (denied?) or answer it with a new interval after which the next notification must be sent (delay?time); and, finally, an output action to signal that the current session has finished, end!. Intuitively, input actions (e.g., denied?) represent the availability of an operation, that is, the operation is offered by a service, whereas output actions represent service requests (e.g., upload!temp); both are followed by the type of their arguments (temp in this case).

The monitoring service (service b) may be a new version of a previous one, or may have been developed by a different vendor, and therefore offers operations with similar functionality but an incompatible signature. Instead of the operations user?usr and pass?psw expected by service a, service b has a single authentication operation login?usr, psw. The authentication can be rejected (rejected!) or accepted (connected!). The service receives temperature notifications through the operation register?temp and sends the replies through answer!time. This service can receive a request to close the session (quit?) and notifies it with action end!. The monitoring service requires humidity information (of type humid) before deciding how long to wait until the next temperature update. For this reason it requests this information from the humidity service, service c, through a request and its corresponding reply, getHumid! and getHumid?humid. The latter is understood by service c but, instead of the former, service c needs the temperature information in order to perform a certain calibration, through action getHumid?temp. Finally, it ends its session and notifies it with action finish!.

Table D.2: The signature of the example

Service a        Service b            Service c
user!usr         login?usr, psw       getHumid?temp
pass!psw         connected!           getHumid!humid
upload!temp      register?temp        finish!
denied?          rejected!
delay?time       getHumid!
end!             getHumid?humid
                 answer!time
                 end!

Figure D.2 illustrates a possible adaptation contract for these services. Vector $v_u$ allows the adaptor to receive action user and to refer to its argument through the symbolic parameter U. Vector $v_l$ first receives the password (in P) with action pass and, as a consequence, eventually sends a login request joining both the user U and the password P in a single message. The remaining vectors behave similarly. The transition system of the contract states the goal of the system. In this particular case, it specifies that if service b sends a connected notification ($v_c$), then a temperature update must eventually be answered through action answer, which is in turn translated into a delay action by vector $v_a$. Likewise, the transition system states that the session must finish (through vectors $v_e$, $v_{e'}$ and $v_{e''}$) either at that point or before connecting (i.e., before $v_c$).

As we have seen in this example, services may use different alphabets of actions (different action names as well as different names, number and order of arguments). The vectors of the contract ($\Sigma^c$) tell us how to overcome these signature incompatibilities. In addition, services may block due to behavioural incompatibilities. Behavioural incompatibilities arise when the order in which the operations of a service are offered and requested (its behaviour) does not match that of another service. For example, the temperature service might first request the operation upload!temp and then provide its credentials (via user!usr and pass!psw), whereas the monitoring service expects the actions in the opposite order (login?usr, psw followed by register?temp). Their adaptation would deadlock even if their signature incompatibilities were solved. Adaptation contracts, and specifically their finite state machine, do not necessarily say how to solve behavioural incompatibilities, because the behaviour of the services may be unknown or may even change during the conversation. In general, we define the behaviour of a service, i.e., its behavioural interface, as a finite state machine.

Σc = {
    user!U ♦ ,                     (v_u)
    pass!P ♦ login?U, P,           (v_l)
    ♦ connected!,                  (v_c)
    upload!D ♦ ,                   (v_p)
    ♦ register?D,                  (v_r)
    getHumid?D ♦ getHumid!,        (v_g)
    delay?T ♦ answer!T,            (v_a)
    getHumid!H ♦ getHumid?H,       (v_t)
    denied? ♦ rejected!,           (v_d)
    ♦ quit?,                       (v_q)
    end! ♦ ,                       (v_e)
    finish! ♦ ,                    (v_e′)
    ♦ end! }                       (v_e′′)

(a) Adaptation vectors

[Panel (b), diagram not reproduced here. Its transitions are labelled Σc \ {v_a, v_e, v_e′, v_e′′}, Σc \ {v_c, v_e, v_e′, v_e′′}, Σc \ {v_e, v_e′, v_e′′}, {v_a}, {v_c}, {v_e}, {v_e}, {v_e′} and {v_e′′}.]

(b) Finite state machine of the contract

Figure D.2: An adaptation contract

Definition D.2.2 (Service interface) A service interface is formally defined as an FSM $\langle \Sigma_i, S, s_0, F, T \rangle$ where: $\Sigma_i$ is the set of labels associated with the transitions, $S$ is a set of states, $s_0 \in S$ is the initial state, $F \subseteq S$ are the final or stable states and $T \subseteq (S \times \Sigma_i \times S)$ are the transitions.

There are multiple languages in the literature, each with its own semantics, for describing service orchestrations [BBDN+ 06, CSC+ 07, SBS06]. However, we have chosen this formalisation because it is simple and abstracts to a level of detail adequate for our behavioural analysis needs.

[Figure D.3, diagrams not reproduced here: (a) Service a, temperature sensor; (b) Service b, monitoring service; (c) Service c, humidity sensor.]

Figure D.3: A possible behaviour for the services of our example. Underlined letters will be used to abbreviate the labels

The labels in $\Sigma_i$ are either internal transitions (τ) or communication transitions that begin with the name of the operation followed by a ‘!’ or a ‘?’ for output or input actions, respectively. Labels also contain an expression that describes the type of the message content. This content is offered in input actions and required in output actions. In general, this expression consists of a list of types that represent the arguments of the action. If necessary, operations can be renamed and prefixed with a unique identifier of their respective services in order to tell them apart across services. For example, we can rename the operations of our example to distinguish a:end! from b:end! between services a and b.

Example D.2.2 Figure D.3 shows a possible behaviour of the services of our example. Initial states are marked with an incoming arrow without source. Final states are filled in black. Internal choices (e.g., if-then-else or switch conditions) are modelled with τ transitions. External choices (e.g., pick in BPEL) are modelled by transitions labelled with input actions leaving the same state. In particular, we can see that service b starts with an external choice in which it is willing either to receive a login request or to send an end message. After login, service b can internally decide to continue through the left branch and reject the session, or to follow the right branch and send the connected notification.

The intensional semantics of adaptation contracts specifies the desired interactions between the adapted services. In other words, the intensional semantics of a contract describes the necessary conditions for an adaptor to comply with a given adaptation contract.
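The fragment of service b described in Example D.2.2 can be written down as a service interface in the sense of Definition D.2.2. The sketch below is illustrative only: the class, the state names b0 to b5 and b_end, the targets of the rejected! and connected! transitions and the choice of b_end as the only final state are all assumptions, since the text does not describe the rest of Figure D.3.

```python
from dataclasses import dataclass

TAU = "tau"  # label used here for internal transitions


@dataclass(frozen=True)
class ServiceInterface:
    labels: frozenset
    states: frozenset
    initial: str
    finals: frozenset
    transitions: frozenset  # triples (source, label, target)

    def outgoing(self, state):
        """Pairs (label, target) of the transitions leaving `state`."""
        return {(l, t) for (s, l, t) in self.transitions if s == state}

    def has_internal_choice(self, state):
        """True when more than one tau transition leaves `state`."""
        return sum(1 for (l, _) in self.outgoing(state) if l == TAU) > 1


# Hypothetical encoding of the beginning of service b.
transitions = frozenset({
    ("b0", "login?usr,psw", "b1"),  # initial choice: receive login ...
    ("b0", "end!", "b_end"),        # ... or send end
    ("b1", TAU, "b2"),              # internal choice: left branch
    ("b1", TAU, "b3"),              # internal choice: right branch
    ("b2", "rejected!", "b4"),
    ("b3", "connected!", "b5"),
})
service_b = ServiceInterface(
    labels=frozenset(l for (_, l, _) in transitions),
    states=frozenset(s for (a, _, b) in transitions for s in (a, b)),
    initial="b0",
    finals=frozenset({"b_end"}),
    transitions=transitions,
)
```

With this encoding, `service_b.has_internal_choice("b1")` holds, while the two transitions leaving b0 carry distinct communication labels.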

D.3 Intensional semantics of adaptation contracts

The intensional semantics of an adaptation contract gives the interactions between the services and the adaptor that are allowed by the contract. Formally, the intensional semantics of an adaptation contract $c$ is a transition system $\xrightarrow{x}_c$ over configurations of the form $\langle s, \Delta \rangle$, where $s$ is the current state of the contract and $\Delta$ is a multiset of pending actions that the adaptor has to execute sooner or later. A transition $\langle s, \Delta \rangle \xrightarrow{x}_c \langle s', \Delta' \rangle$ indicates that an adaptor can, according to contract $c$, execute action $x$ in state $s$ with pending actions $\Delta$. The transition system $\xrightarrow{x}_c$ is defined by the following inference rules:

$$\frac{(s, a \Diamond b, s') \in T^c}{\langle s, \Delta \rangle \xrightarrow{|\bar{a}}_c \langle s', \Delta \cup \{\bar{b}|\} \rangle} \; (I1)$$

$$\frac{(s, a \Diamond b, s') \in T^c}{\langle s, \Delta \rangle \xrightarrow{\bar{b}|}_c \langle s', \Delta \cup \{|\bar{a}\} \rangle} \; (I2)$$

$$\frac{}{\langle s, \Delta \cup \{x\} \rangle \xrightarrow{x}_c \langle s, \Delta \rangle} \; (I3)$$

where the complementary action of a communication action $a$ is denoted by $\bar{a}$ (e.g., if $a$ = do! then $\bar{a}$ = do?, and vice versa). Note that the labels denoting adaptor actions are annotated with a vertical bar on the left or on the right to represent explicitly towards which side the action is performed, that is, to the left ($|a$) or to the right ($b|$). Note also that an ordering semantics is assumed among the pending actions, that is, in rule I3 we assume that if there is more than one action $x$ in the multiset $\Delta$, then the oldest $x$ in $\Delta$ is sent. Finally, since a vector a ♦ b may lack a or b, the definition of $\xrightarrow{x}_c$ also includes the following rules:

$$\frac{(s, a \Diamond, s') \in T^c}{\langle s, \Delta \rangle \xrightarrow{|\bar{a}}_c \langle s', \Delta \rangle} \; (I4)$$

$$\frac{(s, \Diamond b, s') \in T^c}{\langle s, \Delta \rangle \xrightarrow{\bar{b}|}_c \langle s', \Delta \rangle} \; (I5)$$

The state machine resulting from rules I1 to I5 is denoted $A'_c$ and represents the (possibly non-deterministic) intensional semantics of an adaptation contract.
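Rules I1 to I5 can be read as a successor function over configurations. The following sketch is one possible reading and not the ITACA implementation: actions are hypothetical (name, direction) pairs, a vector is a pair (a, b) where either component may be None, labels are tagged with 'L' or 'R' instead of a vertical bar, and the multiset of pending actions is kept as a tuple ordered from oldest to newest.

```python
def complement(action):
    """Complementary action: an input becomes an output and vice versa."""
    name, direction = action
    return (name, "?" if direction == "!" else "!")


def steps(transitions, config):
    """Yield (label, next_config) pairs for one configuration.

    transitions: iterable of (s, (a, b), s2); a or b may be None.
    config: (state, pending), with pending a tuple ordered oldest first.
    Labels are (side, action), with side 'L' for |x and 'R' for x|.
    """
    state, pending = config
    for (s, (a, b), s2) in transitions:
        if s != state:
            continue
        if a is not None and b is not None:
            left, right = ("L", complement(a)), ("R", complement(b))
            yield left, (s2, pending + (right,))   # I1: act left, owe the right action
            yield right, (s2, pending + (left,))   # I2: act right, owe the left action
        elif a is not None:
            yield ("L", complement(a)), (s2, pending)  # I4: left-only vector
        elif b is not None:
            yield ("R", complement(b)), (s2, pending)  # I5: right-only vector
    seen = set()
    for i, x in enumerate(pending):                # I3: discharge a pending action
        if x not in seen:                          # only the oldest copy of each x
            seen.add(x)
            yield x, (state, pending[:i] + pending[i + 1:])
```

Keeping the pending actions ordered is one simple way to respect the assumption that, among several copies of the same pending action, the oldest one is sent first.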

D.3.1 Early choices

It is worth noting that the intensional semantics defined by rules (I1) to (I5) can force early choices. Such early choices occur, for example, when the adaptation contract contains more than one vector for an action a.

Example D.3.1 Consider a simple contract $c = \langle \Sigma^c, S^c, s_0^c, F^c, T^c \rangle$ where

$\Sigma^c = \{a \Diamond b,\ a \Diamond c\}$;  $S^c = \{s_0, s_1\}$;  $s_0^c = s_0$;  $F^c = \{s_1\}$;  $T^c = \{(s_0, a \Diamond b, s_1),\ (s_0, a \Diamond c, s_1)\}$

Then the following two transitions, which leave the initial state, would force an early choice of the adaptor. The adaptor must decide arbitrarily between executing $\bar{b}|$ or $\bar{c}|$ when it receives $|\bar{a}$, because of:

$\langle s_0, \emptyset \rangle \xrightarrow{|\bar{a}}_c \langle s_1, \{\bar{b}|\} \rangle$   and   $\langle s_0, \emptyset \rangle \xrightarrow{|\bar{a}}_c \langle s_1, \{\bar{c}|\} \rangle$

Intuitively, such early choices can make the adaptor fail on some traces. We could restrict ourselves to deterministic contracts and declare contracts with early choices invalid but, instead, we allow this flexibility by including a rule that turns early choices into lazy choices. In this way, we obtain deterministic adaptors.

Lazy choices are modelled by transforming the transition system $\xrightarrow{x}_c$ to work with sets of pairs $\langle s, \Delta \rangle$ using the subset construction [Sip96]:

$$\frac{A' = \{\langle s', \Delta' \rangle \mid \exists \langle s, \Delta \rangle \in A \,.\, \langle s, \Delta \rangle \xrightarrow{x}_c \langle s', \Delta' \rangle\} \neq \emptyset}{A \overset{x}{\hookrightarrow}_c A'} \; (L)$$

The state machine resulting from this transformation is called $A_c$. It is trivial to prove that $A_c$ accepts the same traces as $A'_c$ and that $A'_c \preceq A_c$.
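Rule (L) is the usual subset construction applied to configurations. A minimal sketch, under the assumption that the underlying semantics is available as a function returning (label, next configuration) pairs; the function names and the hand-written successor function for Example D.3.1 are illustrative, and overlines on labels are omitted for brevity.

```python
def lazy_step(step, configs, label):
    """Rule (L): the set of `label`-successors of a set of configurations.

    step(config) must return an iterable of (label, next_config) pairs.
    Returns None when no configuration in the set can perform `label`,
    since the rule requires the successor set to be non-empty.
    """
    successors = frozenset(
        nxt for cfg in configs for (lbl, nxt) in step(cfg) if lbl == label
    )
    return successors or None


def example_step(config):
    """Hand-written successors for the contract of Example D.3.1."""
    if config == ("s0", ()):
        # Early choice: the same label leads to two different configurations.
        return [("|a", ("s1", ("b|",))), ("|a", ("s1", ("c|",)))]
    state, pending = config
    # Any pending action can be discharged.
    return [(x, (state, pending[:i] + pending[i + 1:]))
            for i, x in enumerate(pending)]
```

Starting from the singleton set containing the initial configuration, a single |a step now leads to one set holding both alternatives, so the decision between b| and c| is postponed until one of them is actually executed.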

D.3.2 Data dependencies

So far, the intensional semantics of contracts represented by the transition system $\overset{x}{\hookrightarrow}_c$ does not take into account the symbolic parameters that correspond to the arguments received and sent by the adaptor. These symbolic parameters impose a constraint on the order in which vectors can be applied. For example, we know that the adaptor has to receive a message corresponding to a!D ♦ before it has enough data to send ♦ b?D, since it needs the former to obtain the value of D required by the latter. As we did with early choices, we refine the intensional semantics of adaptation contracts to incorporate these constraints among symbolic parameters in Section 8.4.1. However, for simplicity, we can assume without loss of generality that these constraints have been explicitly included in the transition system of the contract, $T^c$. We will write $A_c$ for the transition system ($\overset{x}{\hookrightarrow}_c$) that represents the deterministic intensional semantics, including data dependencies, of an adaptation contract $c$.
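The ordering constraint induced by symbolic parameters can be illustrated with a small check over a sequence of applied vectors. This is only a sketch of the idea and not the refinement of Section 8.4.1: each vector is reduced to a hypothetical pair (received parameters, sent parameters), and a two-sided vector is assumed to receive its own parameters before sending.

```python
def respects_data_dependencies(sequence):
    """Check that every sent parameter has been received beforehand.

    sequence: iterable of (received, sent) pairs of parameter-name sets,
    one pair per applied vector, in order of application.
    """
    known = set()
    for received, sent in sequence:
        known |= set(received)        # parameters learnt from this vector
        if not set(sent) <= known:    # something is sent before it is known
            return False
    return True
```

For the vectors in the text, a!D ♦ corresponds to ({"D"}, set()) and ♦ b?D to (set(), {"D"}): the sequence is accepted in that order and rejected in the opposite one.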

D.4 Behavioural adaptors

An adaptor is a special service that complies with the intensional semantics of the given adaptation contract. In order to formally define an adaptor, we first need to define a conformance relation between a state machine and an adaptation contract. To this end, we use the notion of simulation ($\preceq$) from process algebras, redefined to handle the final states of state machines.

Definition D.4.1 (Simulation) A simulation relation between the states of two state machines $A_1 = \langle \Sigma_1, S_1, s_{1_0}, F_1, T_1 \rangle$ and $A_2 = \langle \Sigma_2, S_2, s_{2_0}, F_2, T_2 \rangle$ is defined as $R_{A_1,A_2} \subseteq S_1 \times S_2$ such that the following conditions hold for every pair $(s_1, s_2) \in R_{A_1,A_2}$:

1. For every transition $(s_1 \xrightarrow{\alpha} s_1') \in T_1$ there must exist a transition $(s_2 \xrightarrow{\alpha} s_2') \in T_2$ such that $(s_1', s_2') \in R_{A_1,A_2}$.
2. If $s_1 \in F_1$ then $s_2 \in F_2$.

From the previous notion, a simulation relation for state machines can be derived: $A_1 \preceq A_2$ if, and only if, there exists $R_{A_1,A_2}$ such that $(s_{1_0}, s_{2_0}) \in R_{A_1,A_2}$, where $s_{1_0}$ and $s_{2_0}$ are the initial states of $A_1$ and $A_2$, respectively. It is now trivial to define an adaptor that complies with a given adaptation contract.

Definition D.4.2 (Adaptor) An adaptor $A$ that complies with a contract $c$ is a deterministic state machine such that $A \preceq A_c$. Determinism must be understood as the restriction that all the transitions leaving the same state must have different labels and none of them may be an internal transition (τ):

$$\forall \{s_0 \xrightarrow{l_1} s_1,\ s_0 \xrightarrow{l_2} s_2\} \subseteq T \ \text{ then } \ \tau \neq l_1 \neq l_2 \neq \tau$$

D.4.1 Synchronisation between services and adaptors

The transitions of an adaptor (Definition D.4.2) and the transitions of a service interface (Definition D.2.2) carry different kinds of labels. The difference is that the arguments of the labels in an adaptor are represented by symbolic parameters, whereas the arguments of a service interface are represented by their corresponding types. We consider that each symbolic parameter has an associated type. Therefore, we can obtain the service interface corresponding to an adaptor by replacing every occurrence of a symbolic parameter in the adaptor's alphabet ($\Sigma_i$) with its corresponding type. Again, the vertical bars that denote the side of the communication can be omitted if the actions are renamed according to their interfaces.
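The replacement of symbolic parameters by types can be sketched in a few lines of Python. The label encoding as a triple, the function name and the parameter and type names below are all hypothetical, chosen only to illustrate the idea.

```python
from typing import Dict, Tuple

# Hypothetical label encoding: (operation, direction "!" or "?", arguments)
Label = Tuple[str, str, Tuple[str, ...]]


def to_interface_label(label: Label, types: Dict[str, str]) -> Label:
    """Replace each symbolic parameter in a label by its associated type."""
    operation, direction, arguments = label
    return (operation, direction, tuple(types[arg] for arg in arguments))


# Illustrative parameter-to-type association (names are assumptions).
parameter_types = {"X": "int", "Y": "str"}
adaptor_alphabet = [("send", "!", ("X", "Y")), ("ack", "?", ())]
print([to_interface_label(label, parameter_types) for label in adaptor_alphabet])
```

Applying the function to every label of the adaptor's alphabet yields the alphabet of the corresponding service interface.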

Example D.4.1 For our example, the contract (Fig. D.2) requires the following substitution to obtain the adaptor interface compatible with the services shown in Fig. D.3.

$$\theta = \{\mathit{usr}/U,\ \mathit{psw}/P,\ \mathit{temp}/T,\ \mathit{humid}/T,\ \mathit{time}/D\}$$

We can now model the synchronisation between service interfaces and adaptors by means of parallel composition ($\otimes$).

$$\frac{s \xrightarrow{\alpha} s'}{s \otimes s_1 \xrightarrow{\alpha} s' \otimes s_1} \qquad \frac{s_1 \xrightarrow{\alpha} s_1' \quad s_2 \xrightarrow{\bar{\alpha}} s_2'}{s_1 \otimes s_2 \xrightarrow{\tau} s_1' \otimes s_2'}$$

Given the initial state of an adaptor interface ($s_0^a$) and the initial state of a service interface ($s_0^s$), we know that

$$\{(s_i^a, s_i^s) \mid s_0^a \otimes s_0^s \xrightarrow{\tau}{}^{*} s_i^a \otimes s_i^s\}$$

are all the states reachable during the synchronisation between the services and the adaptor. It is worth noting that, by the associativity of $\otimes$, the parallel composition of multiple services can be represented in the same way, since $s_0^s$ can be the parallel composition of those services, i.e.:

$$s_0^s = (s_0^1 \otimes \ldots \otimes s_0^n)$$

Operations can be renamed to force synchronisation to take place only through the adaptor. This parallel composition is revisited in Chapter 4 to support the dynamic adaptation of services and again in Chapter 8 to model the communication of cryptographic data.
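The set of reachable composed states can be computed with a breadth-first search over $\tau$ steps. The Python sketch below is a minimal illustration under several assumptions: actions are strings ending in `!` or `?`, complementary actions differ only in that suffix, internal moves are labelled `"tau"`, and either component may take an internal move on its own (a symmetric reading of the interleaving rule). None of these names come from the thesis.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

TAU = "tau"
Transitions = Dict[str, List[Tuple[str, str]]]  # state -> [(label, target)]


def complement(action: str) -> Optional[str]:
    """Map "a!" to "a?" and vice versa; internal actions have no complement."""
    if action.endswith("!"):
        return action[:-1] + "?"
    if action.endswith("?"):
        return action[:-1] + "!"
    return None


def tau_successors(state: Tuple[str, str], adaptor: Transitions,
                   service: Transitions) -> Set[Tuple[str, str]]:
    """Composed states reachable in one tau step of the parallel composition."""
    sa, ss = state
    result = set()
    for label, target in adaptor.get(sa, []):
        if label == TAU:
            result.add((target, ss))  # internal move of the adaptor side
            continue
        for label2, target2 in service.get(ss, []):
            if label2 == complement(label):
                result.add((target, target2))  # synchronisation yields tau
    for label2, target2 in service.get(ss, []):
        if label2 == TAU:
            result.add((sa, target2))  # internal move of the service side
    return result


def reachable(initial: Tuple[str, str], adaptor: Transitions,
              service: Transitions) -> Set[Tuple[str, str]]:
    """All composed states reachable from the initial pair through tau steps."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        current = queue.popleft()
        for nxt in tau_successors(current, adaptor, service):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


service = {"s0": [("a!", "s1")], "s1": [("b?", "s2")]}
adaptor = {"q0": [("a?", "q1")], "q1": [("b!", "q2")]}
print(sorted(reachable(("q0", "s0"), adaptor, service)))
```

Several services can be handled by first composing them into a single transition table, which mirrors the use of associativity in the text.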

D.4.2 Deadlocks

As we saw in Section D.3, it is not enough for an adaptor to comply with an adaptation contract since, without further analysis, deadlocks may arise from the incompatibilities between the services. Depending on their internal decisions, the services may drive the orchestration into a deadlock. The adaptor must therefore be aware of these incompatibilities and avoid them through its behaviour.

Definition D.4.3 (Deadlock) An interface presents a deadlock if it reaches a state from which no final state can be reached. Formally, there exists a deadlock state $s$ such that

$$s_0 \xrightarrow{\tau}{}^{*} s \ \text{ and } \ \nexists s' \in F \ \text{ s.t. } \ s \xrightarrow{\tau}{}^{*} s'$$

Example D.4.2 Consider the behaviour of the services of the example shown in Fig. D.3. Assuming that service b internally decides to connect (the right-hand $\tau$ branch), the intensional semantics of the contract (in Fig. D.2) allows the following sequence of vectors: $v_u \cdot v_l \cdot v_c \cdot v_p \cdot v_q$. This sequence corresponds, among others, to the trace |?u · |?s · !l| · ?c| · |?p · ?q|, where actions are represented by their underlined characters and ‘·’ is the append operator. This sequence would lead the system to a deadlock because, at this point, service b cannot take part in any of the other vectors needed to reach a final state (i.e., $v_d$ and $v_a$, at least).
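Definition D.4.3 translates directly into two reachability computations, sketched here in Python. The successor table, the state names and the function names are hypothetical; the table is assumed to list the $\tau$ successors of each composed state.

```python
from typing import Dict, Iterable, List, Set


def reach(start: str, successors: Dict[str, List[str]]) -> Set[str]:
    """States reachable from start in zero or more steps."""
    seen = {start}
    stack = [start]
    while stack:
        state = stack.pop()
        for nxt in successors.get(state, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def deadlock_states(initial: str, finals: Iterable[str],
                    successors: Dict[str, List[str]]) -> Set[str]:
    """Reachable states from which no final state can be reached."""
    final_set = set(finals)
    return {state for state in reach(initial, successors)
            if not (reach(state, successors) & final_set)}


# Illustrative system: the branch through "s2" can never reach the final state "f".
successors = {"s0": ["s1", "s2"], "s1": ["f"], "s2": ["s3"]}
print(sorted(deadlock_states("s0", {"f"}, successors)))
```

Because reachability is reflexive, a final state is never reported as a deadlock, which matches the use of the reflexive-transitive closure in the definition.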

D.4.3 Adaptor synthesis

The definition of adaptors (Definition D.4.2) depends only on a given adaptation contract. In contrast, to obtain adaptors that avoid deadlocks among the services to be adapted, they must be synthesised taking into account the order in which the services exchange their messages. We therefore call these deadlock-free adaptors service adaptors, because they depend precisely on the behaviour of the services. In short, the ultimate goal of service adaptors is to control the behaviour of the services so as to guide them to successful/final/stable states while avoiding possible deadlocks. Service adaptors are a further refinement of the intensional semantics of adaptation contracts. This refinement is a key concept in traditional work on adaptor synthesis [AINT07, BP06, KNB+09, MPS08]. These related works focus on design time and need to know the behaviour of the services in advance. We, in contrast, show how such adaptors are: a) generated dynamically without knowing the behaviour of the services in advance, in Chapter 4; or b) synthesised, verified and refined in Chapter 8 if the services have incompatible security policies.

Example D.4.3 The most general adaptor that complies with the contract of Fig. D.2 and the services shown in Fig. D.3 is given in Fig. D.4. Actions have been reduced to their corresponding underlined characters in the contract and prefixed with the identifier of their corresponding service.

Figure D.4: The most general static adaptor that complies with the contract and the services shown in Fig. D.2 and Fig. D.3, respectively. Actions have been replaced by their underlined characters and prefixed with the corresponding service identifier.

D.5 Conclusions

The Internet is teeming with services, cloud computing has provided the infrastructure, and personal devices (smartphones, tablets and laptops) grant ubiquitous access. The only thing missing is an easy way to integrate all this capability cohesively, so that we can enjoy applications composed of several of these services, platforms and devices without being locked into the solutions of a single vendor. However, the situation becomes more complicated when we leave the homogeneous and coherent vision of a single provider and enter the jungle of services implemented on multiple platforms, with different interfaces, protocols and semantics. Although service providers offer external interfaces, services are not always designed to ease cooperation with other services (possibly offered by competitors), so incompatibilities, security flaws and other undesirable situations tend to arise. In this thesis we have proposed the use of adaptors to ease the development of systems composed of several initially incompatible services. Throughout this document we have addressed several phases of the adaptor-based development process (discovery, design, synthesis, verification, refinement and dynamic adaptation) and applied them to different scenarios (ubiquitous computing, Web service registries and security).

We began with the discovery of Web services (Chapter 5). If we promote the use of adaptors that solve incompatibilities, then traditional discovery approaches, which look for perfectly matching services, are too restrictive. Instead, we have developed a search tree that can efficiently discover services whose signature and behaviour can be adapted to interoperate despite their incompatibilities.

Once adaptable services have been discovered, we have to design an adaptation contract. These contracts are a matching between the operations and arguments of the services, together with a finite state machine that (optionally) specifies how these matchings can be applied. Adaptation contracts can be designed in an assisted way using the ACIDE tool, part of ITACA [CMS+09a], or generated automatically using Dinapter (Chapter 6). Once designed, contracts can be used to simulate and verify the model of the system before proceeding to the next phase.

We then proceed to synthesise the behaviour of the adaptor that complies with the adaptation contract and the given services. The synthesis process guarantees that all the services and the contract always reach a final state, avoiding deadlocks, and thereby fulfil the goal of adaptation.

In addition, we have seen how to encode the security policies of services in security adaptation contracts (Chapter 7). The synthesis of security adaptors (Chapter 8) requires an additional step, refinement. Once an adaptor that complies with a security adaptation contract has been synthesised, it must be analysed to discover possible weak points where an attacker could obtain sensitive information. If this happens, the behaviour of the adaptor is refined by removing the interactions that enabled those attacks, while preserving deadlock freedom.

Both contract generation and adaptor synthesis are intended for design time. However, if the services can change their behaviour drastically (perhaps part of a service is disabled due to a hardware failure or a low battery), then the adaptor has to be re-synthesised. If we want to avoid this step (since adaptor synthesis is a costly process), or if it is not possible to deploy new versions of the adaptors (e.g., this might involve physically deploying the adaptors on every mote of a sensor network), then we can resort to learning adaptors (Chapter 4), which are able to react dynamically to these changes.

D.5.1 Epilogue

Thus concludes my thesis after a little over four years. Whether it is called orchestrator design, service choreography, composition or coordination of Web services, adaptation is a necessary tool if you do not want to develop systems from scratch. Throughout this thesis I have presented a formal methodology to automate the development of correct adaptors, and I have tried to convince you of its many advantages. I hope I have succeeded. If so, please take advantage of my work and, if it helped you, cite it. And, if it has solved some important problem for you, you might even consider offering me a postdoc or, fingers crossed, a permanent position. If you disagree with what is presented here, if you find some parts confusing, or if you have found an error in the document, please do not hesitate to contact me (“José Antonio Martín”). Constructive criticism is more than welcome. If we disagree on some interesting topic, we might write an article together or, at least, I might get a citation out of it. Sincerely, I hope you have found this document interesting and not too hard to read. Thank you.


Official acknowledgements

This thesis has been supported by an FPU fellowship and project TIN2008-05932 (ReSCUE), both funded by the Ministerio de Ciencia e Innovación and the FEDER funds, and by the EU-funded network of excellence NESSoS (FP7-256980).


Bibliography

[A+05] T. Andrews et al. Business Process Execution Language for Web Services (WSBPEL). BEA Systems, IBM, Microsoft, SAP AG, and Siebel Systems, February 2005.
[AB05] Martín Abadi and Bruno Blanchet. Analyzing Security Protocols with Secrecy Types and Logic Programs. Journal of the ACM, 52(1):102–146, January 2005.
[ADMR05] D. Aumueller, H. H. Do, S. Massmann, and E. Rahm. Schema and Ontology Matching with COMA++. In Proc. of SIGMOD’05, pages 906–908. ACM Press, 2005.
[AG97] Martín Abadi and Andrew D. Gordon. A calculus for cryptographic protocols: the spi calculus. In Proc. of CCS’97, pages 36–47. ACM, 1997.
[AINT07] M. Autili, P. Inverardi, A. Navarra, and M. Tivoli. SYNTHESIS: A Tool for Automatically Assembling Correct and Distributed Component-Based Systems. In Proc. of ICSE’07, pages 784–787. IEEE, 2007.
[AL08] M. Alia and M. Lacoste. A QoS and security adaptation model for autonomic pervasive systems. In Proc. of COMPSAC’08, pages 943–948. IEEE, 2008.
[And95] H. R. Andersen. Partial model checking (extended abstract). In Proc. of LICS’95, pages 398–407. IEEE Computer, 1995.
[And11] Kurt Andersen. The Protester - Person of the Year 2011. The Time, 178(25), 26 December 2011.
[Arn94] A. Arnold. Finite Transition Systems. International Series in Computer Science. Prentice-Hall, 1994.
[BBC05] A. Bracciali, A. Brogi, and C. Canal. A Formal Approach to Component Adaptation. Journal of Systems and Software, 74(1):45–54, 2005.
[BBC07] Fabrizio Benigni, Antonio Brogi, and Sara Corfini. Discovering Service Compositions That Feature a Desired Behaviour. In Proc. of ICSOC’07, pages 56–68. SV, 2007.
[BBDN+06] M. Boreale, R. Bruni, R. De Nicola, I. Lanese, M. Loreti, F. Martins, U. Montanari, A. Ravara, D. Sangiorgi, V. Vasconcelos, and G. Zavattaro. SCC: A Service Centered Calculus, volume 4184 of LNCS, pages 38–57. SV, 2006.
[BBG+06] S. Becker, A. Brogi, I. Gorton, S. Overhage, A. Romanovsky, and M. Tivoli. Towards an Engineering Approach to Component Adaptation. In Architecting Systems with Trustworthy Components, volume 3938 of LNCS, pages 193–215. Springer, 2006.
[BCP04] A. Brogi, C. Canal, and E. Pimentel. Measuring Component Adaptation. In Proc. of COORDINATION’04, volume 2949 of LNCS, pages 71–86. Springer, 2004.
[BCP06] A. Brogi, C. Canal, and E. Pimentel. On the semantics of software adaptation. Science of Computer Programming, 61(2):136–151, 2006.

[BH05] Christian Bettstetter and Christian Hartmann. Connectivity of wireless multihop networks in a shadow fading environment. Wireless Networks, 11:571–579, Sept 2005.
[Bla09] Bruno Blanchet. Automatic verification of correspondences for security protocols. Computer Security, 17(4):363–434, 2009.
[BP06] A. Brogi and R. Popescu. Automated Generation of BPEL Adapters. In Proc. of ICSOC’06, volume 4294 of LNCS. Springer, 2006.
[CCP10] J. Cubo, C. Canal, and Ernesto Pimentel. Context-Aware Composition and Adaptation Based on Model Transformation. Journal of Universal Computer Science, 2010. In press.
[CCS09] Javier Cámara, Carlos Canal, and Gwen Salaün. Behavioural Self-Adaptation of Services in Ubiquitous Computing Environments. In Proc. of SEAMS’09, pages 28–37, 2009.
[CMP06] C. Canal, J.M. Murillo, and P. Poizat. Software Adaptation. L’Objet, 12(1):9–31, 2006. Special Issue on Coordination and Adaptation Techniques for Software Entities.
[CMR08] Yannick Chevalier, Mohammed Anis Mekki, and Michaël Rusinowitch. Automatic composition of services with security policies. In Proceedings of the 2008 IEEE Congress on Services - Part I, SERVICES ’08, pages 529–537, Washington, DC, USA, 2008. IEEE Computer Society. doi:10.1109/SERVICES-1.2008.13.
[CMS+09a] J. Cámara, J.A. Martín, G. Salaün, J. Cubo, M. Ouederni, C. Canal, and E. Pimentel. ITACA: An Integrated Toolbox for the Automatic Composition and Adaptation of Web Services. In Proc. of ICSE’09, pages 627–630. IEEE, 2009.
[CMS+09b] Javier Cámara, J. A. Martín, Gwen Salaün, Carlos Canal, and Ernesto Pimentel. On Behavioural Interfaces and Contracts for Software Adaptation. In Proc. of the 3rd Workshop on Formal Language and Analysis of Contract-Oriented Software, pages 3–8, 24–25 September 2009.
[CMS+10a] J. Cámara, J.A. Martín, G. Salaün, C. Canal, and E. Pimentel. A Case Study in Model-Based Adaptation of Web Services. In Tiziana Margaria and Bernhard Steffen, editors, Proc. of ISoLA’10, volume 6416 of LNCS, pages 112–126. Springer, 2010.
[CMS+10b] J. Cámara, J.A. Martín, G. Salaün, C. Canal, and E. Pimentel. Semi-Automatic Specification of Behavioural Service Adaptation Contracts. ENTCS, 264(1):19–34, 2010. In proc. of FESCA’10. doi:10.1016/j.entcs.2010.07.003.
[CNMF11] J. Cubo, G. Nadia, J.A. Martín, and L. Fuentes. Contract-Based Discovery in Sensor Web. In Proc. of FLACOS’11, pages 87–96, 2011.
[CNP09] Luca Cavallaro, Elisabetta Nitto, and Matteo Pradella. An Automatic Approach to Enable Replacement of Conversational Services. In Proc. of ICSOC’09, pages 159–174. Springer, 2009.
[COR] CORBA [online]. Available from: http://www.omg.org/spec/CORBA/ [cited 10 November 2011].

[CPS06] C. Canal, P. Poizat, and G. Salaün. Synchronizing Behavioural Mismatch in Software Composition. In Proc. of FMOODS’06, volume 4037 of LNCS. Springer, 2006.
[CPS08] C. Canal, P. Poizat, and G. Salaün. Model-Based Adaptation of Behavioural Mismatching Components. IEEE Transactions on Software Engineering, 34(4):546–563, 2008.
[CSC+07] J. Cubo, G. Salaün, C. Canal, E. Pimentel, and P. Poizat. A Model-Based Approach to the Verification and Adaptation of WF/.NET Components. In Proc. of FACS’07, volume 215 of ENTCS, pages 39–55. Elsevier, 2007.
[CSCO09] Javier Cámara, Gwen Salaün, Carlos Canal, and Meriem Ouederni. Interactive Specification and Verification of Behavioural Adaptation Contracts. In Proc. of QSIC’09, pages 65–75. IEEE, 2009.
[DOS09] F. Duran, M. Ouederni, and G. Salaün. Checking Protocol Compatibility using Maude. In Proc. of FOCLASA’09, volume 255 of ENTCS, pages 65–81. Elsevier, 2009.
[DP01] Pierpaolo Degano and Corrado Priami. Enhanced operational semantics: a tool for describing and analyzing concurrent systems. ACM Comput. Surv., 33:135–176, 2001.
[DSW06] M. Dumas, M. Spork, and K. Wang. Adapt or Perish: Algebra and Visual Notation for Service Interface Adaptation. In Proc. of BPM’06, volume 4102 of LNCS, pages 65–80. Springer, 2006.
[FBS04] X. Fu, T. Bultan, and J. Su. Analysis of Interacting BPEL Web Services. In Proc. of WWW’04, pages 621–630. ACM Press, 2004.
[FH03] E. Friedman-Hill. Jess in Action: Java Rule-Based Systems. Manning Publications Co., 2003.
[FUK06] H. Foster, S. Uchitel, and J. Kramer. LTSA-WS: A Tool for Model-based Verification of Web Service Compositions and Choreography. In Proc. of ICSE’06, pages 771–774. ACM Press, 2006.
[HCDB99] H. Hinton, C. Cowan, L. Delcambre, and S. Bowers. SAM: Security adaptation manager. In Proc. of ACSAC’99, pages 361–370. IEEE, 1999.
[HD07] J. Harney and P. Doshi. Speeding up Adaptation of Web Service Compositions Using Expiration Times. In Proc. of WWW’07, pages 1023–1032. ACM, 2007.
[HKB99] Wendi Rabiner Heinzelman, Joanna Kulik, and Hari Balakrishnan. Adaptive protocols for information dissemination in wireless sensor networks. In Proc. of MobiCom’99, pages 174–185. ACM, 1999.
[HKK06] Jun Han, R. Kowalczyk, and K.M. Khan. Security-oriented service composition and evolution. In Software Engineering Conference, 2006. APSEC 2006. 13th Asia Pacific, pages 71–78, Dec 2006. doi:10.1109/APSEC.2006.51.
[HL95] M. Hennessy and H. Lin. Symbolic Bisimulations. Theor. Comput. Sci., 138(2):353–389, 1995.
[Hol10] Casandra Holotescu. Controlling the Unknown. Technical Report 13, KIT, 28–30 June 2010. In Proc. of FoVeOOS’10.
[HSI+01] John Heidemann, Fabio Silva, Chalermek Intanagonwiwat, Ramesh Govindan, Deborah Estrin, and Deepak Ganesan. Building efficient wireless sensor networks with low-level naming. In Proc. of symposium on Operating systems principles, SOSP’01, pages 146–159. ACM, 2001.
[ISO89] ISO/IEC. LOTOS - A Formal Description Technique Based on the Temporal Ordering of Observational Behaviour. International Standard 8807, ISO, 1989.
[IT99] ITU-TS. ITU-TS Recommendation Z.120: Message Sequence Chart (MSC). Technical report, International Telecommunication Union-TS, 1999.
[ITY07] Feyza Merve Isik, Bulent Tastan, and Pinar Yolum. Automatic Adaptation of BPEL Processes Using Semantic Rules: Design and Development of a Loan Approval System. In Proc. of ICDEW’07, pages 944–951, Washington, DC, USA, 2007. IEEE Computer Society.
[KMNC05] Andreas Klenk, Marcus Masekowsky, Heiko Niedermayer, and Georg Carle. ESAF - an extensible security adaptation framework. In Proc. of NordSec’05, 2005.
[KNB+09] Woralak Kongdenfha, Hamid R. Motahari Nezhad, Boualem Benatallah, Fabio Casati, and Régis Saint-Paul. Mismatch Patterns and Adaptation Aspects: A Foundation for Rapid Development of Web Service Adapters. IEEE TSC, 2(2):94–107, 2009.
[KqL99] Young Yong Kim and San qi Li. Capturing important statistics of a fading/shadowing channel for network performance analysis. Selected Areas in Communications, 17(5):888–901, May 1999. doi:10.1109/49.768203.
[KSZ+07] A. Kozlenkov, G. Spanoudakis, A. Zisman, V. Fasoulas, and F. Sanchez. Architecture-Driven Service Discovery for Service Centric Systems. International Journal of Web Services Research, 4(2):82–113, 2007.
[LYR02] Jun Li, Mark Yarvis, and Peter Reiher. Securing distributed adaptation. Computer Networks, 38(3):347–371, 2002.
[Mar03] Fabio Martinelli. Analysis of security protocols as open systems. TCS, 290(1):1057–1106, 2003.
[MBP11] J.A. Martín, A. Brogi, and E. Pimentel. Learning from Failures: a Lightweight Approach to Run-Time Behavioural Adaptation. In Proc. of FACS’11, page ??, 2011.
[MGO+03] Manamohan Mysore, Moshe Golan, Eric Osterweil, Deborah Estrin, and Mohammad Rahimi. TinyDiffusion in the extensible sensing system [online]. 12 August 2003. Available from: http://www.cens.ucla.edu/~mmysore/Design/OPP/ [cited 29 April 2011].
[Mil89] R. Milner. Communication and Concurrency. Prentice-Hall, 1989.

[Mil99] Robin Milner. Communicating and mobile systems: the π-calculus. Cambridge University Press, New York, NY, USA, 1999.
[MM10] Fabio Martinelli and Ilaria Matteucci. A framework for automatic generation of security controller. STVR, 2010.
[MMP12] J.A. Martín, F. Martinelli, and E. Pimentel. Synthesis of Secure Adaptors. JLAP, 81(2):99–126, 2012. doi:10.1016/j.jlap.2011.08.001.
[MNBM+07] H. R. Motahari Nezhad, B. Benatallah, A. Martens, F. Curbera, and F. Casati. Semi-Automated Adaptation of Service Interactions. In Proc. of WWW’07, pages 993–1002. ACM Press, 2007.
[MP09a] J. A. Martín and E. Pimentel. Automatic Generation of Adaptation Contracts. In Proc. of FOCLASA’08, volume 229 of ENTCS, pages 115–131. Elsevier, 21 July 2009.
[MP09b] J. A. Martín and E. Pimentel. Dinapter: Automatic Adapter Specification for Software Composition. Electronic Notes in Theoretical Computer Science, 248:161–171, 5 August 2009.
[MP10a] J.A. Martín and E. Pimentel. Behavioural Adaptation for Scalable Service Discovery. Presentation at FOCLASA’10, 2010.
[MP10b] J.A. Martín and E. Pimentel. Feature-Based Discovery of Services with Adaptable Behaviour. In Proc. of the 8th European Conference on Web Services (ECOWS), pages 91–98. IEEE, 2010.
[MP10c] J.A. Martín and E. Pimentel. Scalable Discovery of Behavioural Services through Software Adaptation. Work in progress at PROLE’10, 2010.
[MP10d] J.A. Martín and E. Pimentel. Synthesis and Analysis of Adaptors through Security Contracts. In Proc. of FLACOS’10, 2010.
[MP11a] J.A. Martín and E. Pimentel. Contracts for Security Adaptation. JLAP, 80(3-5):154–179, 2011. doi:10.1016/j.jlap.2010.07.001.
[MP11b] J.A. Martín and E. Pimentel. Security Adaptation Contracts. In Proc. of PROLE’11, 2011.
[MPS05] Oded Maler, Amir Pnueli, and Joseph Sifakis. On the synthesis of discrete controllers for timed systems. In Proc. of STACS’95, volume 900 of LNCS, pages 229–242. Springer, 2005.
[MPS08] R. Mateescu, P. Poizat, and G. Salaün. Adaptation of Service Protocols using Process Algebra and On-the-Fly Reduction Techniques. In Proc. of ICSOC’08, volume 5364 of LNCS, pages 84–99. Springer, 2008.
[MPV02] Fabio Martinelli, Marinella Petrocchi, and Anna Vaccarelli. Automated analysis of some security mechanisms of scep. In Proc. of ISC’02, pages 414–427, 2002.
[MRD08] O. Moser, F. Rosenberg, and S. Dustdar. Non-Intrusive Monitoring and Adaptation for WS-BPEL. In Proc. of WWW’08, 2008. To appear.
[Mur89] T. Murata. Petri Nets: Properties, Analysis and Applications. Proceedings of the IEEE, 77(4):541–580, 1989. doi:10.1109/5.24143.
[NBM+07] H.R.M. Nezhad, B. Benatallah, A. Martens, F. Curbera, and F. Casati. Semi-automated adaptation of service interactions. In Proceedings of the 16th International Conference on the World Wide Web (WWW ’07), pages 993–1002. ACM, 2007.
[NL98] George C. Necula and Peter Lee. Safe, Untrusted Agents Using Proof-Carrying Code. In Mobile Agents and Security, pages 61–91, 1998.
[NPKR07] N. C. Narendra, K. Ponnalagu, J. Krishnamurthy, and R. Ramakumar. Run-Time Adaptation of Non-Functional Properties of Composite Web Services Using Aspect-Oriented Programming. In Proc. of ICSOC’07, volume 4749 of LNCS, pages 1023–1032. Elsevier, 2007.
[NRXB10] M. Nezhad, H. Reza, G.Y. Xu, and B. Benatallah. Protocol-aware matching of web service interfaces for adapter development. In Proc. of WWW’10, pages 731–740, New York, NY, USA, 2010. ACM. doi:10.1145/1772690.1772765.
[OSP10] Meriem Ouederni, Gwen Salaün, and Ernesto Pimentel. Quantifying service compatibility: A step beyond the boolean approaches. In Proc. of ICSOC’10, volume 6470 of LNCS, pages 619–626. Springer, 2010. doi:10.1007/978-3-642-17358-5_47.
[OWL04] Owl-s: Semantic markup for web services [online]. 23 November 2004. Available from: http://www.w3.org/Submission/OWL-S/ [cited 10 November 2011].
[Pad09] Luca Padovani. Contract-Based Discovery and Adaptation of Web Services. In Formal Methods for Web Services, volume 5569 of LNCS, pages 213–260. Springer, 2009.
[Plo81] G. D. Plotkin. A Structural Approach to Operational Semantics. Lecture notes, Aarhus University, 1981.
[PMBT05] Marco Pistore, Annapaola Marconi, Piergiorgio Bertoli, and Paolo Traverso. Automated Composition of Web Services by Planning at the Knowledge Level. In IJCAI, pages 1252–1259, 2005.
[PPM04] T. Pedersen, S. Patwardhan, and J. Michelizzi. WordNet::Similarity - Measuring the relatedness of concepts. In Proc. of Intelligent Systems Demonstrations, held in conjunction with AAAI, pages 267–270. AAAI Press, 2004.
[PS07] P. Poizat and G. Salaün. Adaptation of Open Component-Based Systems. In Marcello M. Bonsangue and Einar Broch Johnsen, editors, Proc of FMOODS, volume 4468 of LNCS, pages 141–156. Springer, 2007.
[PTDL07] Michael P. Papazoglou, Paolo Traverso, Schahram Dustdar, and Frank Leymann. Service-Oriented Computing: State of the Art and Research Challenges. Computer, 40(11):38–45, 2007.
[RB01] E. Rahm and P.A. Bernstein. A survey of approaches to automatic schema matching. The VLDB Journal, 10:334–350, December 2001. doi:10.1007/s007780100057.
[RCHP00] D. J. Ragsdale, C. A. Carver, J. W. Humphries, and U. W. Pooch. Adaptation techniques for intrusion detection and intrusion response systems. In Proc. of SAM’00, volume 4, pages 2344–2349. IEEE, 2000.
[RN95] S. Russel and P. Norvig. Artificial Intelligence: a Modern Approach. Prentice-Hall, 1995.
[Sal08] G. Salaün. Generation of Service Wrapper Protocols from Choreography Specifications. In Proc. of SEFM’08, pages 313–322. IEEE Computer Society, 2008.
[SBS06] G. Salaün, L. Bordeaux, and M. Schaerf. Describing and Reasoning on Web Services using Process Algebra. IJBPIM, 1(2):116–128, 2006.
[SCMS+09] Francisco Sánchez-Cid, Antonio Maña, George Spanoudakis, Christos Kloukinas, Daniel Serrano, and Antonio Muñoz. Security and Dependability for Ambient Intelligence, chapter 5, pages 69–95. 2009.
[Scr07] K. Scribner. Microsoft Windows Workflow Foundation: Step by Step. Microsoft Press, 2007.
[Sip96] Michael Sipser. Introduction to the Theory of Computation. International Thomson Publishing, 1996.
[SR02] H. W. Schmidt and R. H. Reussner. Generating adapters for concurrent component protocol synchronisation. In Proc. of FMOODS’02, pages 213–229. Kluwer, B.V., 2002.
[SSL+09] A. Souza, B. Silva, F. Lins, J. Damasceno, N. Rosa, P. Maciel, R. Medeiros, B. Stephenson, H. Motahari-Nezhad, J. Li, and C. Northfleet. Incorporating security requirements into service composition: From modelling to execution. In Proc. of ICSOC’09, volume 5900 of LNCS, pages 373–388. Springer, 2009. doi:10.1007/978-3-642-10383-4_27.
[SZK05] G. Spanoudakis, A. Zisman, and A. Kozlenkov. A Service Discovery Framework for Service Centric Systems. In Services Computing, 2005 IEEE International Conference on, volume 1, pages 251–259, July 2005.
[vdAtH05] W. M. P. van der Aalst and A. H. M. ter Hofstede. Yawl: yet another workflow language. Information Systems, 30(4):245–275, 2005. doi:10.1016/j.is.2004.02.002.
[Vig06] Luca Viganò. Automated security protocol analysis with the avispa tool. ENTCS, 155:69–86, 2006.
[WDOV08] Kenneth W. Wang, Marlon Dumas, Chun Ouyang, and Julien Vayssiere. The service adaptation machine. In Proc. of ECOWS’08, 2008.
[WHW07] D. W. Williams, J. Huan, and W. Wang. Graph Database Indexing Using Structured Graph Decomposition. In Proc. of ICDE’07, pages 976–985. IEEE Computer Society Press, 2007.
[WSD01] Web service definition language (WSDL) [online]. 14 March 2001. Available from: http://www.w3.org/TR/wsdl [cited 10 November 2011].
[WSM05] Web service modeling ontology (wsmo) [online]. 13 June 2005. Available from: http://www.w3.org/Submission/WSMO/ [cited 10 November 2011].
[YS97] D. M. Yellin and R. E. Strom. Protocol Specifications and Component Adaptors. ACM Trans. Program. Lang. Syst., 19(2):292–333, 1997.
[YYH05] X. Yan, P. S. Yu, and J. Han. Graph Indexing Based on Discriminative Frequent Structure Analysis. ACM Trans. Database Syst., 30(4):960–993, 2005.
[ZSD08] A. Zisman, G. Spanoudakis, and J. Dooley. A Framework for Dynamic Service Discovery. In Proc. of ASE’08, pages 158–167. IEEE Computer Society, 2008.
[ZZ04] W. Zhi and G. Zhogwen. A dynamic security adaptation mechanism for mobile agents. In Proc. of ICC’09, pages 334–339, 2004.

Glossary

A* An informed search algorithm which is optimal in the number of explored nodes if the heuristic function is optimistic. See [RN95].

ACIDE is an ITACA tool for the assisted design, simulation and verification of adaptation contracts.

AD tree Adaptation-Dependency tree is a tree with OR nodes, AND nodes and END nodes where edges are labelled with a (possibly empty) set of features. Sets of features are prefixed by ‘?’ or ‘!’ depending on whether these features are required or provided, respectively. See Definition 5.2.1.

adaptation contract (Definition 2.1.1) is a finite state machine $\langle \Sigma^c, S^c, s_0^c, F^c, T^c \rangle$ where $\Sigma^c$ is a set of vectors, $S^c$ is a set of states, $s_0^c \in S^c$ is the initial state, $F^c \subseteq S^c$ is the set of final states, and $T^c \subseteq (S^c \times \Sigma^c \times S^c)$ is a set of labelled transitions.

contract is short for adaptation contract.

SAC a Security Adaptation Contract is an adaptation contract whose adaptation vectors ($\Sigma^c$) contain the cryptographic primitives. These cryptographic primitives are applied whenever a matching message is received (to verify security properties such as integrity, confidentiality and authenticity) or sent (to enforce those properties and adapt the information to different security policies). See Definition 7.3.7.

adaptation vector a vector between the adaptor and one (e.g., a ♦ or ♦ a) or two services (e.g., a ♦ b).

vector is short for adaptation vector.

adaptor a service orchestrator, usually compliant with an adaptation contract, which is able to overcome signature and behavioural incompatibilities and successfully avoid lock situations.

dynamic adaptor an adaptor which evolves at run time and is able to react to unexpected changes in the environment.

learning adaptor a dynamic adaptor which is not synthesised; instead, it communicates directly with the services and learns from successful and failed sessions how to adapt subsequent interactions.

security adaptor is an adaptor able to adapt QoS security. In particular, it is able to perform cryptographic operations over the messages to verify and recompose them while preserving secrecy properties over sensitive data.

behaviour (of a service) the order in which operations are required/called and offered/received by the service. A service behaviour may include loops, internal choices (e.g., ifs) and external choices (e.g., picks).

contract term is the argument expression at either side of the vectors of a SAC. Contract terms are typed messages with symbolic parameters. See Section 7.3.1.

Crypto-CCS is a process algebra similar to CCS but extended with cryptographic operations and security verification algorithms.

DAMASCo a framework for discovery, adaptation and monitoring of context-aware services and components.

Dinapter is the tool we developed for the automatic generation of adaptation contracts, see Chapter 6.

eavesdrop means to listen secretly to the private conversation of others.

FD rule Feature-Dependency rules belong to $F \times \{F\} \times \{F\}$ and are denoted by $f \leftarrow \{f_1^r, \ldots, f_n^r\} \mid \{f_1^e, \ldots, f_m^e\}$ where $f_i^r, f_j^e \in F$, $i \in [1, n]$, $j \in [1, m]$ and $F$ is the set of features. See Definition 5.2.2.

FD tree Feature-Dependency Tree is a search tree for discovering adaptable services according to their corresponding FD rules.

feature is an annotation on a service action which describes either an argument, a non-functional property or a side effect of the action.

Glossary Features are classified into provided and required features. The former are offered by the input action they annotate whereas the latter are needed/requested by an annotated output action. 73–76, 78, 80–88 hash a kind of mathematical function which i) maps a large domain into a smaller image, ii) it is virtually impossible to inverse, and iii) tries to avoid collisions. The value returned by a hash function is usually called digest. 67, 113, 114, 127–129, 135, 136, 138, 146, 154, 157, 162 digest is the name given to the results of a hash function. 113, 119, 135, 137, 142 MAC is a keyed (cryptographic) hash function. It serves to protect the both the integrity and authenticity of a message since the MAC function is, in essence, a hash function which accepts as input a shared secret key. 129, 154, 155 mashup is a Web page or application that uses and combines data, presentation or functionality from two or more sources to create new services. The term implies easy, fast integration, frequently using open APIs and data sources to produce enriched results that were not necessarily the original reason for producing the raw source data. 12 nonce is a number, usually randomly generated, which is used only once throughout the life of the protocol for authenticity and integrity reasons. 113, 114, 119–121, 129, 130, 136, 138, 146, 162 signature (of a service) the name and arguments, including types, of the service operations. 14–16, 25, 89, 114, 125, 147, 184, 188 STS-XML is the XML-based language used to describe service interfaces and adaptation contracts in the ITACA toolbox, see Section B.1. 203, 205, 208, 209 WS-Policy is an XML-based language to describe the security policies (e.g., supported encryption algorithms) offered and required by an entity. 122, 124, 125 WS-SecureConversation is an extension of WS-Security which defines Security Context Token to support sessions composed of several WS-Security messages. 
114, 124 WS-Security is a SOAP extension which includes message integrity, confidentiality and single-message authentication, see Section 7.2.1. 113, 115, 122–126, 128, 135, 137, 147 251

Glossary WS-Trust is an specification which describes how to issue, renew and validate security tokes to establish, assess and broker trust relationships among entities. 114, 123–126
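The hash, digest, MAC and nonce entries above can be illustrated with a minimal sketch. It assumes SHA-256 and HMAC-SHA256 from the Python standard library as concrete primitives; the thesis does not prescribe these algorithms, and the function names (`digest`, `mac`, `verify_mac`, `fresh_nonce`) are hypothetical names chosen for this illustration only.

```python
import hashlib
import hmac
import secrets


def digest(message: bytes) -> str:
    """Hash a message; the returned value is its digest (SHA-256 is an assumption)."""
    return hashlib.sha256(message).hexdigest()


def mac(key: bytes, message: bytes) -> bytes:
    """Keyed hash (HMAC-SHA256) of the message under a shared secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_mac(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the MAC and compare it with the received tag in constant time."""
    return hmac.compare_digest(mac(key, message), tag)


def fresh_nonce(size: int = 16) -> bytes:
    """Random value intended to be used only once in a protocol run."""
    return secrets.token_bytes(size)


if __name__ == "__main__":
    key = secrets.token_bytes(32)           # shared secret key (hypothetical)
    message = b"login?U,P" + fresh_nonce()  # payload made unique by a nonce
    tag = mac(key, message)
    print(digest(message))
    print(verify_mac(key, message, tag))
```

A receiver holding the same key accepts the message only if `verify_mac` succeeds, which covers both integrity (the message was not altered) and authenticity (the sender knows the key). The nonce makes each message distinct, so a replayed message can be detected by a receiver that remembers the nonces it has already seen.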


Acronyms

AI Artificial Intelligence. 10

API Application Programming Interface. 11–14, 38, 74, 79, 216

ATOM Atom Syndication Format. 12

BPEL Business Process Execution Language is used to describe the possible sequences of operations which make up a business process, usually exposed through a Web service. See [A+05]. 23, 29, 38, 39, 42, 43, 49, 71, 75, 79, 80, 82, 90–93, 105, 119, 122, 123, 127, 152, 155, 182–184, 216, 223

BPMN Business Process Modeling Notation. 43

CADP Construction and Analysis of Distributed Processes. 184

CASE Computer-Aided Software Engineering. 23

CCS Calculus of Communicating Systems [Mil89]. 43, 130, 131, 156

CORBA Common Object Request Broker Architecture. 37, 89

CRM Customer Relationship Management. 11

FSM Finite State Machine. 24, 28, 32, 40, 50, 56, 91–93, 119, 120, 122, 144, 183, 188, 205, 218, 222

hMSC high-level Message Sequence Chart. See [IT99]. 38, 39

HTML HyperText Markup Language. 12, 13

HTTP HyperText Transfer Protocol. 15

HTTPS HyperText Transfer Protocol Secure. 13

ICSE International Conference on Software Engineering. 19

IDL Interface Description Language. 37, 38, 89

ITACA Integrated Toolbox for Automatic Composition and Adaptation [CMS+09a]. More information at http://itaca.gisum.uma.es/. 19, 23, 64, 92, 105–107, 123, 164, 186–188, 203, 208, 217, 230

JCR Journal Citation Reports. 17

JSON JavaScript Object Notation. 12–14

LOTOS Language Of Temporal Ordering Specification [ISO89]. 42

LTL Linear Temporal Logic. 88

LTS Labelled Transition System [Plo81]. 39, 41

OAuth Open Authorization. 13

OWL-S Web Ontology Language for Services. See [OWL04]. 38

PHP PHP: Hypertext Preprocessor. 13

PROLE Jornadas de PRogramación y LEnguajes. 19

QoS Quality of Service. 15, 38, 71, 149, 151, 178, 185

REST REpresentational State Transfer. 13, 38

RESTful adjective relating to REST. 12, 15

RSS Really Simple Syndication. 12

SCC Service-Centered Calculus. See [BBDN+06]. 109

SM State Machine. 30, 31

SME Small-Medium Enterprise. 10

SMS Short Message Service. 12

SOA Service Oriented Architecture. 23, 216

SOAP Simple Object Access Protocol. 12, 13, 49, 74, 113, 123–125

SPIN Sensor Protocols for Information via Negotiation. 52, 53, 55, 64, 203

SSH Secure SHell. 119

STG Symbolic Transition Graph. 42

STS Symbolic Transition System. 42, 106

URL Uniform/Universal Resource Locator. 12, 182, 183

WF Windows Workflow (Foundation). See [Scr07]. 23, 38, 71, 90, 91, 106, 122, 152, 216

WS Web service. 10, 12, 23, 37, 71, 75, 89–91, 113, 115, 117, 122–124, 147, 155, 181, 184–186, 188, 216, 217

WSAN Wireless Sensor Network. 52, 188

WSDL Web Service Description Language is used to describe the operations offered by a Web service. 12, 37, 38, 49, 89, 182–184

WSMO Web Services Modeling Ontology. See [WSM05]. 38

XML Extensible Markup Language. 40, 124, 182, 183, 203, 209

XML-RPC XML remote procedure call. 12

YAWL Yet Another Workflow Language. See [vdAtH05]. 39, 90


List of symbols

≤ is a partial order between two FD rules. Intuitively, it holds when an FD rule requires more features, or requires them earlier, than another FD rule. See Definition 5.2.3. 82, 83

∈ is the operator which tells when a given trace t is one of the inhibited traces I, i.e., I ∈ t. 57, 58, 61–63, 68, 194

· is the append operator. 34, 57–59, 63, 68, 228

:: is the concatenation operator. 57, 59, 62, 63

. is the prefix operator. 57

a simulation relation between FSMs. See Definition 2.3.1. 31, 32, 225, 226

⊢ is the operator which specifies whether a contract term T matches an interface action, i.e., T ⊢ T. See Theorem 7.3.1. 140, 142, 195

$\xrightarrow{l}_c$ represents a transition in the (possibly non-deterministic) intensional semantics of an adaptation contract c. See Section 2.2. 30, 31, 95, 138, 144, 145, 165, 224, 225

$\overset{l}{\hookrightarrow}_c$ represents a transition in the deterministic intensional semantics of an adaptation contract c. See L. 31, 32, 57, 58, 138, 156, 225, 226

$\xrightarrow{l}$ standard symbol to represent a regular (possibly labelled) transition. 25, 32–34, 38, 57, 58, 60, 68, 132, 135, 138, 193, 195, 213, 214, 219, 226–228

♦ bidirectional separator between the two sides of an adaptation vector, see Definition 2.1.1. 24, 25, 27, 30–32, 53, 54, 68, 94, 95, 97–102, 107, 114, 121, 138, 143–145, 165, 205, 214, 218, 221, 224–226

[T]θ is the function which, given a substitution θ from symbolic parameters to Types, transforms the actions of an adaptor (∈ Σa) into their corresponding interface actions (∈ Σi). See Definition 7.3.2. 139, 140, 143, 156, 157, 159, 160, 162, 195, 196, 214

Ac deterministic SM representing the intensional semantics of an adaptation contract, such that Ac0 and Ac are related by the simulation relation (see Definition 2.3.1). 31–33, 164, 166–168, 174–176, 198, 201, 225–227

Ac0 (possibly non-deterministic) SM representing the intensional semantics of an adaptation contract, see Section 2.2. 30, 31, 224, 225

Act is the set of typed actions. 127, 138

BM is the set of basic messages. 127

BT is the set of basic types. 127, 136, 138, 142

Chan is the set of communication channels. 127, 131, 138


CTerm is the set of contract terms. 136, 138, 139, 141

D is the function which returns the set of typed messages that can be inferred from the given knowledge through the rules in IS, see Table 7.2. 171–173, 200

F is the set of cryptographic constructors. 127

Fc successful states of an adaptation contract. 24, 58, 218

IS is an Inference System to process cryptographic messages, see Table 7.2. 127, 130, 131, 134, 144, 157, 159–162, 169

κ is a substitution which replaces symbolic parameters (∈ Param) with run-time values. 121, 138, 139, 145, 146, 159, 161–163, 165, 196

Msgs is the set of typed messages. 127, 138

O[M] denotes the set of possible transition traces starting from the initial state of the FSM M. 25, 167, 197

Param is the set of symbolic parameters. 136, 138–141, 165

Sc is the set of contract states. 24, 218

sc0 the initial state of an adaptation contract. 24, 218

Σa is the set of actions of an adaptation contract, e.g., login?U, P. 24, 30, 31, 57, 59, 61, 62, 138, 139, 156, 157, 218

Σc ⊆ (Σa × Σa) ∪ Σa is the set of adaptation vectors, e.g., pass!P ♦ login?U, P. 24, 27, 28, 30, 31, 54, 67, 95, 121, 138, 144, 145, 157, 165, 205, 218, 221, 225

Σi interface actions (see Definition 7.2.1) are denoted by their channel (∈ Chan), an exclamation mark or a question mark for output and input actions, respectively, followed by a (possibly structured) type of the message (IAct = (Chan × {!, ?} × Types) ∪ {τ}). 28, 114, 127, 132, 138, 156, 203, 222, 227

Sort is the function which returns the set of channels occurring in the given Crypto-CCS process. 171–173, 176, 198, 200, 201

$\tilde{t}$ denotes a trace of transitions. Other alternative notations are $\tilde{t} = t_1 \cdot \ldots \cdot t_n = s_0 \xrightarrow{l_0} \ldots \xrightarrow{l_n} s_{n+1}$. 25

Tc the set of labelled transitions of an adaptation contract. 24, 25, 32, 121, 144, 205, 218, 219, 226

TER Transition Error Rate represents the probability of any given synchronisation to forcibly fail due to sporadic errors. 66


θ is a substitution which replaces symbolic parameters (∈ Param) with Types. 121, 138–140, 145, 146, 156, 159–163, 165, 195

θT,T denotes the only minimal substitution θ able to match T and 𝒯 or, more formally, T ⊢ T. See Definition 7.3.3. 140, 142

Types is the set of structured types including cryptographic constructors. 127, 135–140
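As a reading aid for the contract symbols listed above (Σc, Sc, sc0, Fc, Tc and O[M]), the following is a minimal sketch of an adaptation contract as a plain data structure, together with a bounded enumeration of the traces that start at its initial state. It is an illustration under stated assumptions, not the representation used by ITACA or Dinapter: vectors are kept as opaque strings, the ASCII token `<>` stands in for ♦, and the names `Contract`, `is_well_formed`, `traces` and `EXAMPLE` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    vectors: frozenset      # Σc: adaptation vectors, kept as plain strings here
    states: frozenset       # Sc: contract states
    initial: str            # sc0: initial state
    finals: frozenset       # Fc: final (successful) states
    transitions: frozenset  # Tc: triples (source, vector, target)

    def is_well_formed(self) -> bool:
        """Check sc0 ∈ Sc, Fc ⊆ Sc and Tc ⊆ Sc × Σc × Sc."""
        return (
            self.initial in self.states
            and self.finals <= self.states
            and all(
                src in self.states and vec in self.vectors and dst in self.states
                for (src, vec, dst) in self.transitions
            )
        )


def traces(contract: Contract, max_len: int) -> set:
    """Label sequences of at most max_len transitions from the initial state.

    This is a length-bounded approximation of O[M]; the bound is an
    assumption added so that contracts with loops still terminate.
    """
    found = set()
    frontier = [(contract.initial, ())]
    while frontier:
        state, labels = frontier.pop()
        found.add(labels)
        if len(labels) < max_len:
            for (src, vec, dst) in contract.transitions:
                if src == state:
                    frontier.append((dst, labels + (vec,)))
    return found


# Hypothetical two-step contract: collect the user name, then forward
# user name and password together as a single login request.
EXAMPLE = Contract(
    vectors=frozenset({"user!U <>", "pass!P <> login?U,P"}),
    states=frozenset({"s0", "s1", "s2"}),
    initial="s0",
    finals=frozenset({"s2"}),
    transitions=frozenset({
        ("s0", "user!U <>", "s1"),
        ("s1", "pass!P <> login?U,P", "s2"),
    }),
)

if __name__ == "__main__":
    print(EXAMPLE.is_well_formed())
    for trace in sorted(traces(EXAMPLE, 2)):
        print(trace)
```

The second vector mirrors the example pass!P ♦ login?U, P given for Σc; the first vector and the three states are invented for the illustration.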

