12th IFIP/IEEE International Symposium on Integrated Network Management 2011

A Data Confidentiality Architecture for Developing Management Mashups

Carlos R. P. dos Santos, Rafael S. Bezerra, Lisandro Z. Granville and Leandro M. Bertholdo
Institute of Informatics - UFRGS - Porto Alegre, RS, Brazil
Email: {crpsantos, rsbezerra, berthold, granville}@inf.ufrgs.br

Winnie Cheng and Nikos Anerousis
IBM Watson Research
Hawthorne, NY 10532, USA
Email: {wcheng, nikos}@us.ibm.com

978-1-4244-9221-3/11/$26.00 ©2011 IEEE

Abstract-Mashups are powerful applications created by accessing and composing multiple distributed information sources. Their ease of use and modularity allow users at any skill level to construct, share and integrate their own applications. However, data security concerns remain a hindering factor in their widespread adoption, in particular for network management. In this paper, we propose a novel development methodology and system architecture called Maestro that allows developers to express their data privacy concerns and enforce policies during mashup executions. We evaluated Maestro by building two mashup applications for managing live networks and by running performance tests that show that our runtime has negligible overhead.

I. INTRODUCTION

In recent years we have observed a rapid proliferation of a new class of Internet applications better known as "mashups" or "situational applications". A mashup can be characterized as the product of combining two or more complementary web-based applications and/or data sources. The increasing availability of APIs for popular web applications and services now makes it possible for even the most inexperienced users to compose mashups that solve very specific problems (hence the term "situational"). It is very common nowadays to encounter applications that overlay data objects on top of a mapping service, thereby providing an enhanced view of the data. It was not long before systems and network management practitioners took advantage of the mashup paradigm to enable a new class of management applications on top of existing repositories of management information.

In our previous work [1], we conducted a set of experiments and analyses of the mashup approach and concluded that mashups are a promising and appropriate tool for monitoring, diagnosing and managing networks. They provide an excellent visualization medium for composed information, collecting and aggregating data from multiple network monitoring and performance tools. Furthermore, the modular nature of mashups encourages re-use and sharing of modules between different service providers and network administration staff, which in turn allows composing even more sophisticated applications.

As the spectrum of possible applications widens, a number of challenges intrinsic to mashups become apparent. Data security in particular proves to be extremely challenging in an open mashup environment, and is the main topic of study in this paper. Especially in network management applications, mashups may be required to handle and display sensitive data that require well-defined disclosure policies. For example, in a common application that shares traffic information between Internet service providers, a certain provider may engage in proprietary peering relationships with competing providers, and the inadvertent publishing of such detailed information through a mashup may compromise its market position. On the other hand, the disclosure of aggregate traffic can be very useful to all parties for monitoring quality of service levels, as long as the details are not disclosed. Such examples motivated us to study further the requirements for data protection in shared mashup environments, particularly what mechanisms must be put in place to satisfy the confidentiality concerns of different parties.

Protecting sensitive data in a highly collaborative and dynamic mashup environment poses several non-trivial questions. Our research indicates that the solution should be centered on an enhanced composition programming model that specifically captures and enforces disclosure policies. In this way, data protection mechanisms are built directly into these applications. The model should be easy to understand, sufficiently expressive, and able to represent broad categories of security concerns among a wide range of mashup applications.

To satisfy the above-mentioned requirements we adopt the concept of Information Flow Control (IFC) in our architecture. IFC is a security model that tracks the flow of sensitive data, enables the specification of disclosure policies and enforces them. IFC is a promising direction for enhancing the security of mashups, particularly in contexts where data is interchanged with other mashup modules. This paper introduces a new mashup development system, named Maestro, that extends a state-of-the-art IFC programming model, Aeolus [2], with a highly modular mashup composition architecture. The key contributions of our research work are to:

1) Propose a novel secure mashup development environment;

2) Show a proof-of-concept prototype with negligible overhead and little compromise on usability;

3) Demonstrate realistic scenarios for privacy-preserving network management.

We designed and implemented a secure mashup development and deployment platform that enables mashup developers to respect various data confidentiality concerns during the composition process. We created two management mashups for administrating Autonomous Systems (AS): one focuses on network quality management and the other on network traffic monitoring. These scenarios are derived from interviewing real-world experts, particularly network administrators, and from our understanding of their concerns. The network quality management mashup aggregates information such as network traffic, number of announced routes, input and output queue sizes, number of discarded packets, and CRC errors from BGP-enabled routers connecting AS peers. Such management capability is important because it allows network administrators in each AS to verify whether their service providers have violated Service Level Agreements (SLAs), requiring the enforcement of financial provisions in the contract. The second mashup is designed for network traffic monitoring, providing different views of the backbone network traffic with different access levels to the information for each group.

The remainder of this paper is organized as follows. In Section 2, we provide the background of our research. In Section 3, we introduce our architecture, Maestro, which supports strong confidentiality mechanisms for mashup development. In Section 4, we describe two network management scenarios where the enforcement of such mechanisms is necessary. In Section 5, we study two implementations of Maestro for these scenarios by creating the corresponding mashup applications and evaluating their implementation. Section 6 discusses related work. The paper concludes in Section 7.

II. BACKGROUND

A. Web 2.0 and Mashups

In recent years, web applications have attracted renewed interest in both industry and academia under the label Web 2.0. These trends propose web applications that focus on end-users and, more importantly, encourage them to be organizers of content in addition to being consumers of information. As a result, there is an increasing demand for principles that support re-use of content, collaborative sharing and compilation of content among users [3]. Mashups have emerged as a class of web application that composes content from a variety of web resources such as web pages, RSS feeds, web services and online APIs [4].

Fig. 1. Mashup Development Methodology (stages: Ingestion, Augmentation, Presentation)

Typically, mashups are built following the methodology proposed by Jhingran et al. [5], presented in Figure 1. There are three main steps: ingestion, augmentation, and presentation. The ingestion step gathers data from disparate sources, such as RSS feeds, web services, and online APIs. Since these sources tend to be heterogeneous, wrappers are often used to standardize and retrieve the data of interest. During the augmentation step, the user composes the data by defining modules that operate over the data, transform it, and generate more meaningful information from the aggregate. The result of this process is presented to the mashup creator and viewers in the presentation step. The presentation can range from a simple text file containing the results of the composition to a complex and highly interactive map display.

Mashups can also be built in an ad-hoc manner, using standard programming tools and integrated development environments (IDEs) to develop one specific composition. This approach relies heavily on the skill of web developers. Mashup systems make the creation of mashups accessible to a larger community, with no programming expertise needed to create custom mashup applications. Such systems are environments for mashup creation, similar to an IDE for traditional development, adhering to the development methodology described earlier. Mashup systems are also runtime platforms for executing and storing created compositions.

In the context of network management, we conducted a set of experiments and analyses in our previous work [1] that showed mashups to be a promising and appropriate tool for monitoring, diagnosing and managing networks. That work also provides in-depth details of the mashup system used and extended in this paper. Compared to current network management solutions, easier and faster development are the main advantages of solutions based on mashups, allowing the creation of larger applications and reducing development costs and time.

B. Data Protection

Mashups enable new capabilities in data visualization. However, the ability to protect such data is essential. While data protection techniques have been proposed since the 1970s [6] [7], the majority of the work in this area has centered on providing data protection for static data such as databases and files [8]. Mashups present a very different operating environment, in which data may be generated dynamically as the output of one mashup module and fed through a variety of intermediate ones before the final result is displayed to the end-user.

Conventional techniques for ensuring the confidentiality of data can be broadly classified into two bodies of work: access control and cryptographic key management. Access control provides frameworks for identifying the users of a system and seeks to restrict the access different users have to the data. For example, role-based access control [9] allows the specification of a task-based role such as "Accounting Department Personnel", the types of data (e.g., revenue and expense report files) that are accessible to users of this role, and the operations (e.g., read and/or write) that can be performed on the data. Access control protects data in storage rather than in flight; the latter is common in mashups. It deals poorly with data that are the output of computations and also with data that have been transformed, in many cases iteratively, by several mashup modules. Cryptographic key management


[10] uses encryption algorithms to hide data and relies on the distribution of the encryption/decryption keys to restricted users of the system to protect the confidentiality of the data. It can hide in-flight data better than access control techniques, at the expense of significantly complicating the key generation and distribution schemes, as it has to cover all possible data transfer points between mashups. Encryption protocols must be agreed upon between two interacting mashups and new keys must be generated whenever sharing relationships change. The key strength of encryption is in ensuring confidentiality of data content during transfers rather than in expressing confidentiality policies among a myriad of network endpoints.

An area of security research that has recently regained interest is information flow control [11]. Information flow control focuses on controlling the propagation of sensitive data. It does this by labeling the data and by providing rules on how labeled data can travel through a system. Different information flow control techniques have been proposed and applied to a variety of settings: at the programming language level, in distributed application programming models, and for monitoring operating system processes. In this work, we extend the Aeolus programming model in [2] for mashup development. Comparisons with other IFC works were also done in [2]. Compared with other IFC models, Aeolus is easy to use, with a simple set of rules, yet retains expressiveness of confidentiality policies. Compared with other IFC implementations, Aeolus is particularly suited to describing functions and outputs of computations.

The secrecy rule says that the destination's secrecy label (B.Ls) must be a superset of the source's secrecy label (A.Ls), that is, A.Ls ⊆ B.Ls, to ensure that the sensitivity of all sources contributing to the destination is captured by the destination's secrecy label.

To support our secrecy mechanism, we designed the components highlighted in Figure 2. These components are responsible for tracking the propagation of data and preventing accidental leakage of sensitive data. They do the former by enforcing the secrecy rule and the latter by disallowing any data with a non-empty secrecy label from leaving the system. For example, a text field that has the Financial tag in its secrecy label cannot be published on a public web portal. The Secrecy component provides labeling support and propagation of data in accordance with the secrecy rule. The User Authentication component in Figure 2, besides providing user login validation, uses the user's credentials to map the user to a principal. The runtime engine in the Composition Runner runs the mashup as a particular principal that has very specific authority over what sensitive data it can disclose.

[Figure 2: component labels recovered from the diagram: Publisher, User Authentication, Composition Runner]
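The secrecy rule and the export restriction described above can be illustrated with a minimal Python sketch. It is not the Maestro or Aeolus API: the names `Labeled`, `can_flow`, `combine` and `publish` are hypothetical, labels are modeled as plain sets of tag names, and giving a computed value the union of its input labels is our assumption (it is the smallest label that satisfies the secrecy rule for every input). Principals and their authority to disclose data are left out.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Labeled:
    """A value paired with its secrecy label (a set of tag names)."""
    value: object
    secrecy: frozenset = field(default_factory=frozenset)


def can_flow(source_label: frozenset, dest_label: frozenset) -> bool:
    """Secrecy rule: the destination label must be a superset of the source label."""
    return source_label <= dest_label


def combine(func, *inputs: Labeled) -> Labeled:
    """Apply func to the input values; label the result with the union of input labels.

    Assumption: the union is used because it is the smallest label for which
    can_flow(input.secrecy, result.secrecy) holds for every input.
    """
    label = frozenset().union(*(item.secrecy for item in inputs))
    return Labeled(func(*(item.value for item in inputs)), label)


def publish(item: Labeled):
    """Release a value outside the system only if its secrecy label is empty."""
    if item.secrecy:
        raise PermissionError(
            f"blocked: secrecy label {sorted(item.secrecy)} is not empty"
        )
    return item.value


# Hypothetical usage: a revenue figure tagged Financial taints the derived report.
revenue = Labeled(1200, frozenset({"Financial"}))
traffic = Labeled(35)
report = combine(lambda a, b: a + b, revenue, traffic)
```

In this sketch `publish(traffic)` returns the unlabeled value, while `publish(report)` raises `PermissionError` because the Financial tag propagated into the result, which mirrors the text-field example above.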