12th IFIP/IEEE International Symposium on Integrated Network Management 2011
978-1-4244-9221-3/11/$26.00 ©2011 IEEE

A Data Confidentiality Architecture for Developing Management Mashups

Carlos R. P. dos Santos, Rafael S. Bezerra, Lisandro Z. Granville and Leandro M. Bertholdo
Institute of Informatics - UFRGS - Porto Alegre, RS, Brazil
Email: {crpsantos, rsbezerra, berthold, granville}@inf.ufrgs.br

Winnie Cheng and Nikos Anerousis
IBM Watson Research
Hawthorne, NY 10532, USA
Email: {wcheng, nikos}@us.ibm.com

Abstract - Mashups are powerful applications created by accessing and composing multiple and distributed information sources. Their ease-of-use and modularity allow users at any skill level to construct, share and integrate their own applications. However, data security concerns remain a hindering factor in their widespread adoption, in particular for network management. In this paper, we propose a novel development methodology and system architecture called Maestro that allows developers to express their data privacy concerns and enforce policies during mashup executions. We evaluated Maestro by building two mashup applications for managing live networks and by running performance tests that show that our runtime has negligible overhead.

I. INTRODUCTION

In recent years we have observed a quick proliferation of a new class of Internet applications better known as "mashups" or "situational applications". A mashup can be characterized as the product of combining two or more complementary web-based applications and/or data sources. The increasing availability of APIs for popular web applications and services now makes it possible for even the most inexperienced users to compose mashups to solve some very specific problems (hence the term "situational"). It is very common nowadays to encounter applications that overlay data objects on top of a mapping service and thereby provide an enhanced view of the data. It was not long before systems and network management practitioners took advantage of the mashup paradigm to enable a new class of management applications on top of existing repositories of management information.

In our previous work [1], we conducted a set of experiments and analysis of the mashups approach and we concluded that mashups are a promising and appropriate tool for monitoring, diagnosing and managing networks. They provide an excellent visualization medium for composed information, collecting and aggregating data from multiple network monitoring and performance tools. Furthermore, the modular nature of mashups encourages re-use and sharing of modules between different service providers and network administration staff, which, in turn, allows composing even more sophisticated applications.

As the spectrum of possible applications widens, a number of challenges intrinsic to mashups become apparent. Data security in particular proves to be extremely challenging in an open mashup environment, and is the main topic of study in this paper. Especially in network management applications, mashups may be required to handle and display sensitive data that require well-defined disclosure policies. For example, in a common application that shares traffic information between Internet service providers, a certain provider may engage in proprietary peering relationships with competing providers, and the inadvertent publishing of such detailed information through a mashup may compromise their market position. On the other hand, the disclosure of aggregate traffic can be very useful to all parties to monitor quality of service levels, as long as the details are not being disclosed. Such examples motivated us to study further the requirements for data protection in shared mashup environments, particularly what mechanisms must be put in place to satisfy the confidentiality concerns of different parties.

Protecting sensitive data in a highly collaborative and dynamic mashup environment poses several non-trivial questions. Our research indicates that the solution should be centered on an enhanced composition programming model that specifically captures and enforces disclosure policies. In this way, data protection mechanisms are built directly into these applications. The model should be easy to understand, sufficiently expressive, and able to represent broad categories of security concerns among a wide range of mashup applications.

To satisfy the above-mentioned requirements we adopt the concept of Information Flow Control (IFC) in our architecture. IFC is a security model that tracks the flow of sensitive data, enables the specification of disclosure policies and enforces them. IFC is a promising direction in enhancing the security of mashups, particularly in the context where data is interchanged with other mashup modules. This paper introduces a new mashup development system, named Maestro, that extends a state-of-the-art IFC programming model, Aeolus [2], with a highly modular mashup composition architecture. The key contributions of our research work are to:
1) Propose a novel secure mashup development environment;
2) Show a proof-of-concept prototype with negligible overhead and little compromise on usability;
3) Demonstrate realistic scenarios for privacy-preserving network management.

We designed and implemented a secure mashup development and deployment platform that enables mashup developers
to respect various data confidentiality concerns during the composition process. We created two management mashups for administrating Autonomous Systems (AS): one focuses on network quality management and the other on network traffic monitoring. These scenarios are derived from interviewing real-world experts, particularly network administrators, and from our understanding of their concerns. The network quality management mashup aggregates information such as network traffic, number of announced routes, input and output queue sizes, number of discarded packets, and CRC errors from BGP-enabled routers connecting AS peers. Such management capability is important because it allows network administrators in each AS to verify whether their service providers have violated Service Level Agreements (SLAs), requiring the enforcement of financial provisions in the contract. The second mashup is designed for network traffic monitoring, providing different views of the backbone network traffic with different access levels of the information for each group.

The remainder of this paper is organized as follows. In Section 2, we provide a background of our research. In Section 3, we introduce our architecture, Maestro, which supports strong confidentiality mechanisms for mashup development. In Section 4, we describe two network management scenarios where the enforcement of such mechanisms is necessary. In Section 5, we study two implementations of Maestro for these scenarios by creating the corresponding mashup applications and evaluating their implementation. Section 6 discusses related work. The paper concludes in Section 7.

II. BACKGROUND

A. Web 2.0 and Mashups

In recent years, web applications have taken on renewed interest both in industry and academia, termed Web 2.0. These trends propose web applications that focus on the end users and, more importantly, encourage them to be organizers of content in addition to being consumers of information. As a result, there is an increasing demand for principles that support re-use of content, collaborative sharing and compilation of content among users [3]. Mashups have emerged as a class of web application that composes content from a variety of web resources such as web pages, RSS feeds, web services and online APIs [4].

[Figure 1 diagram not reproduced. Stage labels: Ingestion, Augmentation, Presentation. Other labels recovered from the figure: Retrieval, Standardization, Composition, Planning, Viewing, Provisioning.]
Fig. 1. Mashup Development Methodology

Typically, mashups are built following the methodology proposed by Jhingran et al. [5], presented in Figure 1. There are three main steps: ingestion, augmentation, and presentation. The ingestion step gathers data from disparate sources, such as RSS Feeds, Web Services, and online APIs. Since these sources tend to be heterogeneous, wrappers are often used to retrieve and standardize the data of interest. During the augmentation step, the user composes the data by defining modules that operate over the data, transform it, and generate more meaningful information from the aggregate. The result of this process is presented to the mashup creator and viewers in the presentation step. The presentation can range from a simple text file containing the results of the composition to a complex and highly interactive map display.

Mashups can also be built in an ad-hoc manner, using standard programming tools and integrated development environments (IDE) to develop one specific composition. This approach relies heavily on the skill of web developers. Mashup systems make the creation of mashups accessible to a larger community, with no programming expertise needed to create custom mashup applications. Such systems are environments for mashup creation, similar to an IDE for traditional development, adhering to the development methodology described earlier. Mashup systems are also runtime platforms for executing and storing created compositions.

In the context of the network management area, we conducted a set of experiments and analysis in our previous work [1] that has shown mashups to be a promising and appropriate tool for monitoring, diagnosing and managing networks. This work also provides in-depth details of the mashup system used and extended in this paper. Compared to the current network management solutions, easier and faster development are the main advantages of solutions based on mashups, allowing the creation of larger applications and reducing development costs and time.

B. Data Protection

Mashups enable new capabilities in data visualization. However, the ability to protect such data is essential. While data protection techniques have been proposed since the 1970s [6] [7], the majority of the work in this area centered on providing data protection for static data such as databases and files [8]. Mashups present a very different operating environment, in which data may be generated dynamically as the output of one mashup module and fed through a variety of intermediate ones before the final result is displayed to the end-user.

Conventional techniques for ensuring the confidentiality of data can be broadly classified into two bodies of work: access control and cryptographic key management. Access control provides frameworks for identifying the users of a system and seeks to restrict the access different users have on the data. For example, role-based access control [9] allows the specification of a task-based role such as "Accounting Department Personnel", the types of data (e.g., revenue and expense report files) that are accessible to users of this role, and the operations (e.g., read and/or write) that can be done on the data. Access control protects data in storage rather than in flight; the latter is common to mashups. It deals poorly with data that are the output of computations and also with data that have been transformed, in many cases iteratively, by several mashup modules. Cryptographic key management
[10] uses encryption algorithms to hide data and relies on the distribution of the encryption/decryption keys to restricted users of the system to protect the confidentiality of the data. It can better hide in-flight data than access control techniques, at the expense of significantly complicating the key generation and distribution schemes, as it has to cover all possible data transfer points between mashups. Encryption protocols must be agreed upon between two interacting mashups and new keys must be generated whenever sharing relationships change. The key strength of encryption is in ensuring confidentiality of data content during transfers rather than in expressing confidentiality policies among a myriad of network endpoints.

An area of security research that has recently regained interest is information flow control [11]. Information flow control focuses on controlling the propagation of sensitive data. It does this by labeling the data and by providing rules on how labeled data can travel through a system. Different information flow control techniques have been proposed and applied to a variety of settings: at the programming language level, in distributed application programming models, and for operating system process monitoring. In this work, we extend the Aeolus programming model in [2] for mashup development. Comparisons with other IFC works were also done in [2]. Compared with other IFC models, Aeolus is easy to use, with a simple set of rules, yet retains expressiveness of confidentiality policies. Compared with other IFC implementations, Aeolus is particularly suited to describing functions and outputs of computations.

The secrecy rule says that the destination's secrecy label (B.Ls) must be a superset of the source's secrecy label (A.Ls) to ensure that the sensitivity of all sources contributing to the destination is captured by the destination's secrecy label.

To support our secrecy mechanism, we designed the components highlighted in Figure 2. These components are responsible for tracking the propagation of data and preventing accidental sensitive data leakage. They do the former by enforcing the secrecy rule and the latter by preventing any data with a non-empty secrecy label from leaving the system. For example, a text field that has the Financial tag in its secrecy label cannot be published on a public web portal. The Secrecy component provides labeling support and propagation of data in accordance with the secrecy rule. The User Authentication component in Figure 2, besides providing user login validation, uses the user's credentials to map the user to a principal. The runtime engine in the Composition Runner runs the mashup as a particular principal that has very specific authority on what sensitive data it can disclose.

[Figure 2 diagram not reproduced. Component labels recovered from the figure: Publisher, User Authentication, Composition Runner.]
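
As a minimal sketch of the secrecy rule and the release check described above, the following Python fragment models a secrecy label as a set of tags. It is illustrative only and is not Maestro or Aeolus code. The names Labeled, can_flow, combine, declassify and publish are hypothetical. Two modeling choices are assumptions: the output of a computation takes the union of its input labels (the smallest label that is a superset of every source label), and a principal's authority is represented as a set of tags it is allowed to disclose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Labeled:
    """A value paired with its secrecy label, modeled as a set of tags."""
    value: object
    secrecy: frozenset = frozenset()


def can_flow(source_label, dest_label):
    """Secrecy rule: B.Ls must be a superset of A.Ls for data to flow from A to B."""
    return frozenset(dest_label) >= frozenset(source_label)


def combine(func, *inputs):
    """Compute func over the input values; the output label is the union of the
    input labels (assumption: the smallest label allowed by the secrecy rule)."""
    label = frozenset().union(*(item.secrecy for item in inputs))
    return Labeled(func(*(item.value for item in inputs)), label)


def declassify(item, tag, authority):
    """Remove a tag, but only if the running principal has authority over it
    (assumption: authority is modeled as a set of tags)."""
    if tag not in authority:
        raise PermissionError(f"no authority to disclose {tag!r}")
    return Labeled(item.value, item.secrecy - {tag})


def publish(item):
    """Let a value leave the system only if its secrecy label is empty."""
    if item.secrecy:
        raise PermissionError(f"blocked, label not empty: {sorted(item.secrecy)}")
    return item.value


# Hypothetical data: one value tagged Financial, one untagged.
revenue = Labeled(1200, frozenset({"Financial"}))
traffic = Labeled(300)
total = combine(lambda a, b: a + b, revenue, traffic)
```

In this sketch, total inherits the Financial tag from revenue, so publish(total) raises PermissionError, matching the example of a Financial text field that cannot appear on a public portal. publish(traffic) succeeds because its label is empty, and total becomes publishable only after declassify is called with an authority set that contains Financial.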