A Component-based Framework for the Internet Content Adaptation Domain

Marcos Forte
Centro Universitário Fundação Santo André
Av Principe de Gales, 821 – 09060-650
Santo André – SP – Brazil
55 11 4979 3300
[email protected]

Renato A. T. Claudino
DBA Engenharia de Sistemas Ltda.
Av. Presidente Vargas, 3131 – 3o andar
20210-030 – Rio de Janeiro – RJ - Brazil
55 21 2515 3222
[email protected]

Wanderley Lopes de Souza, Antonio Francisco do Prado, Luiz H. Z. Santana
Federal University of São Carlos
Po Box 676 - 13565-905
São Carlos – SP - Brazil
55 11 3351 8233

{desouza,prado,luiz_santana}@dc.ufscar.br

ABSTRACT

Small mobile devices for accessing the Internet through wireless access networks have become increasingly common in recent years. In this new context, a major challenge is the adaptation of content to these devices, satisfying their capabilities and user preferences and optimizing the use of wireless access networks. This paper therefore presents a framework based on component reuse for the development of applications for Internet content adaptation.

Categories and Subject Descriptors

D.2.13 [Software Engineering]: Reusable Software - Reusable libraries

General Terms

Design, Languages, Experimentation.

Keywords

Software reuse, component-based development, framework, content adaptation, adaptation policy.

1. INTRODUCTION

Content adaptation involves modifying the representation of Internet content in order to produce versions that meet diverse user requirements and the distinct characteristics of devices and access networks [7]. Interest in this subject has grown considerably in recent years due to the quality and variety of available content and the ever-expanding access to the Internet.

The dissemination of personal computers has given millions of users access to Internet content. These users display highly heterogeneous profiles in terms of, for instance, economic and social background, occupation, age, geographic location, language and personal preferences. In this extremely heterogeneous environment, adaptations are aimed, to the fullest possible extent, at ensuring user satisfaction in terms of personal interests and needs.

The emergence of new Internet access network technologies has helped to drive this area. Some of these technologies involve broadband (e.g., cable connections, Asymmetric Digital Subscriber Lines, ADSL) while others involve narrow band (e.g., 56 kbps modems) and yet others variable band (e.g., Wireless Local Area Networks, WLANs). On the other hand, the development of small mobile devices equipped with Internet access but with limited displays, batteries, memory and processing capacity has reinforced the need for adapting the various types of content distributed on the Internet.

Service networks, which constitute a new Internet layer and allow adaptation services to be carried out, were developed in response to this problem. These networks describe mechanisms for intercepting the delivery of content and adapting it. They are based mainly on the Open Pluggable Edge Services (OPES) model [20].

An essential requirement for carrying out these services is the definition of adaptation policies, which define what adaptation is to be done on a given content, when, and who should do it. If several adaptations are required, the policy must also specify the sequence in which they should be carried out. To be effective, the policy must take into account information on users, devices, the access network, content, and the service agreement.

This paper therefore presents a component-based framework, which provides a flexible infrastructure for the development of content adaptation applications. Based on the reuse of software components, this framework defines a general adaptation policy that can be used by the various applications of the Internet content adaptation domain.

Section 2 of this paper addresses the theme of Internet content adaptation, highlighting its main applications and the adaptation policy. Section 3 presents the proposed framework. Section 4 describes an experience of reuse of this framework, considering the adaptation policy and the protocols employed. Section 5 discusses correlated works and, lastly, section 6 presents our conclusions and recommendations for future studies.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SAC’07, March 11-15, 2007, Seoul, Korea. Copyright 2007 ACM 1-59593-480-4/07/0003…$5.00.


2. CONTENT ADAPTATION

The late 20th century saw an amazing growth in the number of Internet users, leading to an exponential increase in the flow of data through this network and resulting in congested traffic and bottlenecks [1]. This problem was minimized through the development of content distribution networks, which bring the content to the user through proxies, thus preventing network bottlenecks between users and content servers [21].

However, the wide diversity of contents, users, networks and access devices led to the need for on-the-fly adaptations of Internet contents to meet user expectations, respect device constraints, and improve the use of network resources. In response to these needs, service networks were created whose content adaptation function is distributed among remote servers called content service modules. Service networks offer a mechanism for directing content to these modules, which in turn modify the content according to the required adaptations.

2.1. Content Adaptation Applications

Among the Internet content adaptation applications [3, 12] that determined the requirements of the framework proposed here, the following stand out:

· Virus Scan: searches for viruses before delivering the content to the user. This service reduces the user’s chances of receiving infected contents when accessing the Internet.

· Ad Insertion: inserts advertisements into a content (e.g., Web pages) based on user interests and/or location. This adaptation allows, for instance, original ads on Web pages to be replaced by regional ones or by ads directed at the user’s profile.

· Markup Language Translation: allows devices that do not support Hypertext Markup Language (HTML) pages but support other markup languages (e.g., Wireless Markup Language) to receive the content of such pages. This service, besides offering a larger number of contents to limited devices, reduces the Web designer’s work and the Web server’s burden, by obviating the need for the former to develop and for the latter to store several versions of the same page.

· Data Compression: allows the origin server to send its content in compressed form so that the edge device can extract it, thereby reducing the bandwidth used in this communication.

· Content Filtering: redirects an unauthorized content request or blocks a response containing unsuitable content. This service enables parents to control their children’s access to inappropriate contents.

· Image Transcoding: processes image files in order, for example, to transform their format, reduce their size and/or resolution, or modify their color range.

· Language Translation: translates a Web page from one language to another (e.g., from Portuguese to Chinese). This is useful when the origin server does not contain all the necessary translations to deliver the same content in several languages.

2.2. Adaptation Policy

Another important framework requirement relates to the adaptation policy, which defines the adaptation services to be executed, the local or remote adapters that will execute these adaptations, when the adaptations will be requested, and the sequence of their execution. To make these decisions efficiently, information is required about the environment: user preferences and personal data; the characteristics and capabilities of their devices; the properties of the requested contents; the conditions of the access network; and the service agreement between the user and the service provider [19].

This information may be static or dynamic. Static information (e.g., users, devices, service agreement) can be made available through profiles and is added when the user connects to the network, since it does not usually require frequent updating. Dynamic information (e.g., content, access network) is collected from the environment in each adaptation process, since it changes frequently.

The user profile contains personal information and adaptation preferences, and the adaptation process is directed at meeting those preferences. Since users may require different services for the same content, adaptations not based on their profiles may be inadequate or undesirable.

The device profile contains the characteristics and capabilities of hardware (e.g., display size, memory) and software (e.g., operating system, browser). Knowledge of these capabilities is essential to the adaptation process, which considers the limitations of each device.

The service agreement profile (Service Level Agreement, SLA) contains the services offered by the provider for access by the user. The service provider offers several service plans, seeking to meet the user’s needs. A service can be denied if it is not contained in this profile.

Content information includes image type, format and dimensions, the language of a Web page, the size in bytes and the source of the content. This information can be found in the message header of the protocol or in the content’s metadata.

Access network information comprises characteristics such as bandwidth, latency and traffic. This information is analyzed to optimize the use of the network’s resources, avoiding, for instance, high error rates in video transmissions.

Since not all requested content should undergo the adaptations offered by the server, a mechanism is needed to invoke the adaptation only when certain conditions are satisfied. To guide this decision process using the profiles and dynamic information, the adaptation policy should also consider a set of adaptation rules.

These rules indicate the conditions to be met for a given adaptation to be carried out. Such conditions must be coherent with the user, device, agreement, content and access network profiles. The OPES working group has developed two languages to specify content adaptation rules: Intermediary Rule Markup Language (IRML) and P.

Based on eXtensible Markup Language (XML), IRML was designed to express simple and efficient service execution policies and to reflect the interests of the two end points of a content transaction: the origin server and the user [10]. The P language, based on Smalltalk and C++, is interpreted and presents the following qualitative aspects: exactness, flexibility, efficiency, simplicity, security and robustness [5].
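As an illustration of the adaptation policy described in Section 2.2, the following minimal Python sketch evaluates a set of adaptation rules against static profiles (user, device, service agreement) and dynamic information (content, access network), and returns the selected services in execution order. It is only a sketch under stated assumptions: the profile fields, thresholds, service names and the `Rule` structure are hypothetical and are not taken from IRML, P, or the OPES documents.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# The environment groups the five information sources named in Section 2.2.
# All field names below are hypothetical, chosen only for illustration.
Environment = Dict[str, Dict[str, object]]


@dataclass
class Rule:
    service: str                              # adaptation service to invoke
    priority: int                             # lower value runs first
    condition: Callable[[Environment], bool]  # when the rule applies


def select_adaptations(rules: List[Rule], env: Environment) -> List[str]:
    """Return the services to run, in execution order.

    A service is selected only if its condition holds and the service
    agreement (SLA) profile lists it; otherwise it is denied.
    """
    allowed = env["sla"]["services"]
    matched = [r for r in rules if r.service in allowed and r.condition(env)]
    return [r.service for r in sorted(matched, key=lambda r: r.priority)]


RULES = [
    Rule("virus_scan", 1, lambda e: True),
    Rule("content_filter", 2, lambda e: bool(e["user"]["parental_control"])),
    Rule("image_transcoding", 3,
         lambda e: str(e["content"]["type"]).startswith("image/")
         and (e["device"]["display_width"] < e["content"]["width"]
              or e["network"]["bandwidth_kbps"] < 128)),
]

env = {
    # Static information, loaded from profiles when the user connects.
    "user": {"parental_control": False},
    "device": {"display_width": 176},
    "sla": {"services": {"virus_scan", "image_transcoding"}},
    # Dynamic information, collected for each adaptation process.
    "content": {"type": "image/jpeg", "width": 1024},
    "network": {"bandwidth_kbps": 56},
}

print(select_adaptations(RULES, env))  # ['virus_scan', 'image_transcoding']
```

Keeping the conditions as data separate from the selection function mirrors the idea that rules can be updated without changing the decision mechanism itself.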

As demonstrated in this section, several requirements must be considered when offering adaptation services. For example, to avoid overloading the proxy in a service network, one important aspect is the distribution of adaptation services among dedicated servers. The framework that is the main object of this article was constructed based on these requirements.

3. FRAMEWORK

Figure 1 gives a general view of the ICAF (Internet Content Adaptation Framework) developed here. The ICAF is based on the ideas of service networks and comprises two main packages: Adaptation Proxy, whose role is that of proxy, and Adaptation Server, which plays the role of adaptation service module.

Figure 1. ICAF general view

The actors, i.e., User and Origin Server, interact with the Adaptation Proxy through content requests and responses. The Adaptation Proxy analyzes these interactions based on an adaptation policy and uses the services available at the Adaptation Server, which in turn adapts the transmitted content.

3.1. Adaptation Proxy

Figure 2 illustrates the components of the Adaptation Proxy package. The Content Transfer Protocol component is responsible for communications with the user and the content origin server, and is generic in order to support the application protocols used for content transmission on the Internet. The main protocols include: Hypertext Transfer Protocol (HTTP), Real-time Transport Protocol (RTP), Simple Object Access Protocol (SOAP) and File Transfer Protocol (FTP).

Figure 2. Components of the Adaptation Proxy package

Since the Adaptation Proxy requests adaptations from an Adaptation Server using protocols that communicate between the proxy and the service module, the Callout Protocol Client component was defined, which makes the remote adaptation call. This component supports different protocols, including those created specifically for this communication: Internet Content Adaptation Protocol (ICAP) [8] and OPES Callout Protocol (OCP) [17].

The Cache is a component built to improve the ICAF’s performance. It temporarily stores contents requested from the Internet, such as Web pages, videos and images. Before requesting content from the origin server, the Adaptation Proxy checks whether it is already in the Cache. If it is, a request to the origin server is avoided and the content is delivered more rapidly to the user.

Although it uses adaptation service modules, the ICAF also provides the Local Adapter component, which handles content adaptations locally. However, since this component uses the resources of the proxy itself, its wide-scale use may affect the performance of the Adaptation Proxy, delaying the delivery of content to the user.

The Proxy Manager handles the flow of data in the Adaptation Proxy. Through the Content Transfer Protocol, this component receives user requests and responses from the origin server. With this information, the Proxy Manager requests an adaptation analysis from the Adaptation Decision component. If any adaptation is needed, the Proxy Manager invokes this service remotely using the Callout Protocol Client component, or locally through the Local Adapter component. Before requesting a content, the Proxy Manager checks if the content is present in the Cache, thus avoiding unnecessary requests.

An important adaptation requirement is the definition of an adaptation policy. The ICAF’s policy is based on the adaptation rules, user information, device characteristics, access network conditions, content, and the service agreement, and is implemented through the Adaptation Decision, Profile Loader, Adaptation Rules Updater and Network Data Collector components.

The Profile Loader component fetches the user, device and agreement profiles stored in the ProfilesDB database. These profiles enable the framework to meet the user’s interests and preferences and the device capabilities and constraints, and to determine the services the user is allowed to access. To insert the user and device profiles into the ProfilesDB database, an Internet interface must be made available to the user. This interface can be a Web page containing a form in which the user fills out pre-established fields indicating the profiles. The service agreement profile is inserted by the system administrator, also via a Web interface.

The ICAF policy is controlled by the Adaptation Decision component. This component receives the changes observed in the network from the Network Data Collector, which monitors the parameters of the access network. It also receives the profiles stored in the ProfilesDB from the Profile Loader component, and the content to be analyzed from the Proxy Manager component. Based on this information, the Adaptation Decision component uses the adaptation rules to decide what adaptations will be done, which servers (local and/or remote) will execute them, and the order in which they will be executed.


To ensure that outdated or invalid rules are not processed, the Adaptation Rules Updater component inserts, removes and updates the adaptation rules implemented in the Adaptation Decision component. The Rules Author actor carries out these updates using an interface provided by the Adaptation Rules Updater.
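The interaction among the Adaptation Proxy components described in Section 3.1 can be sketched in a few lines of Python. This is a minimal illustration, not the ICAF implementation: the class, its constructor parameters and the stub services are hypothetical, the collaborators are plain callables standing in for the real components, and the choice to cache the unadapted origin response is an assumption made only to keep the example short.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: an adapter maps content bytes to adapted bytes, and a
# plan is an ordered list of (service name, "local" | "remote") decisions.
Adapter = Callable[[bytes], bytes]
Plan = List[Tuple[str, str]]


class ProxyManager:
    def __init__(self,
                 fetch: Callable[[str], bytes],
                 decide: Callable[[str, bytes], Plan],
                 local: Dict[str, Adapter],
                 remote: Dict[str, Adapter]) -> None:
        self.fetch = fetch      # stands in for the Content Transfer Protocol
        self.decide = decide    # stands in for the Adaptation Decision
        self.local = local      # services offered by the Local Adapter
        self.remote = remote    # services reached via the Callout Protocol Client
        self.cache: Dict[str, bytes] = {}

    def handle(self, url: str) -> bytes:
        # Check the Cache before contacting the origin server.
        content = self.cache.get(url)
        if content is None:
            content = self.fetch(url)
            self.cache[url] = content
        # Ask for an adaptation analysis, then apply the plan in order.
        for service, where in self.decide(url, content):
            adapters = self.local if where == "local" else self.remote
            content = adapters[service](content)
        return content


calls: List[str] = []


def fetch(url: str) -> bytes:
    calls.append(url)  # record each request that reaches the origin server
    return b"<html>Hello</html>"


proxy = ProxyManager(
    fetch=fetch,
    decide=lambda url, content: [("strip_tags", "local"), ("upper", "remote")],
    local={"strip_tags": lambda c: c.replace(b"<html>", b"").replace(b"</html>", b"")},
    remote={"upper": lambda c: c.upper()},
)

print(proxy.handle("http://example.org/"))  # b'HELLO'
print(proxy.handle("http://example.org/"))  # b'HELLO', served from the cache
print(len(calls))                           # 1
```

Passing the collaborators in as callables keeps the manager independent of any particular transfer or callout protocol, in the spirit of the generic components described above.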

3.2. Adaptation Server

This package is responsible for the content adaptations requested by the Adaptation Proxy. Considering that the adaptation servers can carry out different types of services, each Adaptation Server can have a different structure. Thus, based on the concepts of component-based development, a generic structure was defined for the various Adaptation Servers, comprising the components illustrated in Figure 3. The Callout Protocol Server component handles communications with the Adaptation Proxy and supports various protocols (e.g., ICAP, HTTP). It receives and analyzes adaptation requests from the Callout Protocol Client, defining the action to be performed and its parameters. Based on this information, sent in the header of the request protocol, the Callout Protocol Server invokes the requested adaptation from the Remote Adapter.

Figure 3. Components of the Adaptation Server package

The Remote Adapter component is responsible for executing content adaptations. The internal structure of this component supports variations according to the type of adaptation offered by the Adaptation Server. An example of the design of this component is available in [9]. The Remote Adapter carries out the same functions as the Local Adapter, but does not use the processing resources of the Adaptation Proxy. The decision to adapt content locally or remotely must take into account the ratio between the execution time of the Callout Protocol and the execution time of the content adaptation. The lower this ratio, the more the decision will favor remote adaptation.

3.3. The ICAF Operation

The ICAF was built to support two operating modes [8]: in reqmod, the request sent by the user to the server is adapted; in respmod, the response delivered by the server is adapted.

Figure 4 shows four points for processing the adaptation rules [2]. At point 1, the user’s request is received, but the cache has not yet been checked to verify if the requested content is available. At point 2, the content is not available in the cache and the request is directed to the origin server. At point 3, the origin server’s response is accepted but not yet stored in the cache. At point 4, the response of the cache or origin server is ready to be sent to the user.

Figure 4. Execution points of the adaptation policy

The definition of the processing points to be used depends on the type of adaptation service to be executed. For example, a virus checking service should be processed at point 3, thus preventing an infected content from being stored in the cache.

4. ICAF REUSE

The ICAF was developed with the purpose of offering a basic structure for creating content adaptation applications on the Internet through the reuse of its components. This section presents a reuse experiment of the proposed framework, which is represented in the component model shown in Figure 5. This model contains the components reused through direct instantiation from ICAF: Local Adapter, Remote Adapter, Proxy Manager, Cache, Adaptation Rules Updater, Network Data Collector and Profile Loader.

Figure 5. ICAF Reuse

Since the main difficulty in adaptation systems is the implementation of the adaptation policy, the ICAF considers that this policy should take into account the adaptation rules and the information of the adaptation environment. However, there are various ways of realizing this decision process, including the implementation of an algorithm in a procedural language. To meet this specific application requirement, several components were added and others modified, as follows.

4.1. Inserting an Inference Mechanism

The decision of ICAF’s adaptation policy is the responsibility of the Adaptation Decision component. In this experiment, this component was adapted to work integrated with an inference mechanism that, based on the adaptation rules and the environment’s information, defines the adaptation services to be executed. The Adaptation Decision component itself defines only the sequence of adaptations, if multiple adaptations are required, and the adapter (local and/or remote) that will perform them.

In this ICAF reuse experiment, the inference mechanism was introduced through a Prolog Knowledge Base. This mechanism improves the flexibility of the framework’s adaptation policy and confers some intelligence on the decision process.
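As an illustration of how the Adaptation Decision component might choose between the Local Adapter and a Remote Adapter, the sketch below applies the criterion mentioned in Section 3.2: the ratio between the execution time of the callout protocol and the execution time of the adaptation. It is a hedged example only. The `max_ratio` threshold, the service names and the timing figures are hypothetical illustrative values, not measurements or parameters from the ICAF.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class ServiceEstimate:
    name: str
    callout_ms: float     # estimated time spent in the callout protocol
    adaptation_ms: float  # estimated time spent adapting the content


def choose_adapter(est: ServiceEstimate, max_ratio: float = 0.5) -> str:
    """Return 'remote' when the callout overhead is small relative to the
    adaptation itself, and 'local' otherwise.

    `max_ratio` is a hypothetical tuning knob, not a value from the paper.
    """
    if est.adaptation_ms <= 0:
        return "local"  # nothing worth offloading
    ratio = est.callout_ms / est.adaptation_ms
    return "remote" if ratio <= max_ratio else "local"


def plan(estimates: Iterable[ServiceEstimate],
         order: List[str]) -> List[Tuple[str, str]]:
    """Return (service, adapter) pairs following the given execution order."""
    by_name = {e.name: e for e in estimates}
    return [(name, choose_adapter(by_name[name])) for name in order]


# Illustrative numbers only.
estimates = [
    ServiceEstimate("image_adapter", callout_ms=6.0, adaptation_ms=3.0),
    ServiceEstimate("content_filter", callout_ms=6.0, adaptation_ms=60.0),
]

print(plan(estimates, ["content_filter", "image_adapter"]))
# [('content_filter', 'remote'), ('image_adapter', 'local')]
```

The lower the ratio, the more the decision favors remote adaptation, so an expensive adaptation is offloaded while a cheap one stays on the proxy.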


In addition to the Knowledge Base (KB), two new components were added: KBManager and SQL2Prolog. The KBManager handles the work of the KB, executing the following functions: it receives the updates of the adaptation rules from the Adaptation Rules Updater, translates them into the Prolog language and stores them in the KB, and carries out queries retrieving information on users, devices and the access network when requested by the Adaptation Decision component. The KB answers these queries based on user, device and agreement profiles, as well as information about the access network and the content. The fact that the KB is implemented in Prolog and the ProfilesDB in the SQL language makes these bases incompatible. The SQL2Prolog component, which allows information to be exchanged between these bases through an interface, was developed to solve this problem, enabling the KB to receive and analyze the profiles stored in the ProfilesDB.
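To make the role of a bridge such as SQL2Prolog more concrete, the following Python sketch shows one possible way of exporting profile rows from a relational database as Prolog facts. It is a sketch under stated assumptions: the table layout, column names and fact names are hypothetical, since the ProfilesDB schema and the fact format used by the Knowledge Base are not specified here, and an in-memory SQLite database stands in for the ProfilesDB.

```python
import sqlite3
from typing import List


def prolog_term(value: object) -> str:
    """Render a Python value as a Prolog term (number or quoted atom)."""
    if isinstance(value, (int, float)):
        return str(value)
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def rows_to_facts(conn: sqlite3.Connection, table: str, functor: str) -> List[str]:
    """Export every row of `table` as a fact `functor(col1, col2, ...).`"""
    # `table` is interpolated directly, so it must come from trusted code.
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY rowid")
    return [
        f"{functor}({', '.join(prolog_term(v) for v in row)})."
        for row in cursor
    ]


# Hypothetical ProfilesDB content, held in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE device (user_id TEXT, display_width INTEGER, browser TEXT)")
conn.execute("INSERT INTO device VALUES ('alice', 176, 'wap')")
conn.execute("CREATE TABLE sla (user_id TEXT, service TEXT)")
conn.executemany("INSERT INTO sla VALUES (?, ?)",
                 [("alice", "virus_scan"), ("alice", "image_adapter")])

facts = rows_to_facts(conn, "device", "device") + rows_to_facts(conn, "sla", "sla")
print("\n".join(facts))
# device('alice', 176, 'wap').
# sla('alice', 'virus_scan').
# sla('alice', 'image_adapter').
```

Once the profiles are available as facts, an adaptation rule could be written over them in Prolog, for example a hypothetical clause such as `adapt(U, image_adapter) :- sla(U, image_adapter), device(U, W, _), W < 320.`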

In this ICAF reuse experiment, three components were adapted with specific implementations of their interfaces, thus characterizing reuse through specialization. The Content Transfer Protocol component was specialized to transmit the requests and responses of content through HTTP. This protocol is widely used on the Internet to transfer Web pages, images, video files, dynamic pages and applications. An HTTP message consists of two parts: the header and the payload (i.e., content). The content is modified by an HTTP parser, which identifies the semantic actions of this protocol.

The Callout Protocol Client and Callout Protocol Server components were specialized to allow for communication, through ICAP, between the proxy and the adaptation server. This is a lightweight protocol with request and response headers very similar to those of HTTP. An ICAP message encapsulates the content and the HTTP messages exchanged between the user and the origin server. Upon receiving a service request, the Callout Protocol Client encapsulates the actions, parameters and content in an ICAP request message and sends it to the Callout Protocol Server, which makes a semantic analysis of this message, retrieving the necessary information, and invokes the requested service from the Remote Adapter. After the content is adapted, the Callout Protocol Server encapsulates it in an ICAP response message and returns it to the Adaptation Proxy.

4.2. Performance Analysis

Since adaptation services tend to increase the content delivery time, thus reducing the user’s level of satisfaction, the ICAF was developed based on two main objectives: to offer adaptation services without degrading the performance of the Internet’s infrastructure, and to define an adaptation policy with short processing time.

To measure the time spent on content adaptation and by the adaptation policy, the performance of the reuse experiment was evaluated on a network comprising five computers: 1 Adaptation Proxy, 3 Adaptation Servers and 1 User. The implementation of the Adaptation Proxy was based on [15]; that of the Virus Scan (VS) and Image Adapter (IA) adaptation servers was based on [11], and that of the Content Filter (CF) server was based on [Forte 2004]. The origin servers were accessed directly from Internet content servers.

To evaluate the performance of the components from the model described in [14], the temporal collection points t0 to t7 depicted in Figure 6 were defined, and the following times were calculated:

· T(Origin Server), the origin server response time;

· T(Analysis), the time spent on processing the adaptation policy;

· T(Adaptation), the time spent by the ICAP protocol and by the content adaptations;

· T(Delivery), the time spent to deliver the content to the user after executing all the adaptations.

Figure 6. Times measured in the evaluation of ICAF reuse

The overheads caused by the adaptation policy and content adaptations are given by T(Analysis) and T(Adaptation), respectively. For this evaluation, 1000 requests for the page www.folha.com.br were executed. Through the implemented Adaptation Servers, this Web page went through five different adaptation tests: no adaptation (NA); using the three Adaptation Servers (VS, IA and CF) independently; and using the three servers in combination (VS+IA+CF). Figure 7 shows the average times of T(Origin Server), T(Analysis), T(Adaptation) and T(Delivery) obtained. The tests indicated that the longest delay occurred in waiting for the response from the origin server and that the time spent on the policy was similar in all the adaptations.

Figure 7. Framework overheads

In the tests performed with no adaptation, the average time of all the processes was 121 ms, of which 103 ms corresponded to the response from the origin server, 17 ms to the adaptation policy and only 1 ms to delivery of the content to the user. Therefore, the delay introduced by the inference mechanism (17 ms) is relatively short and is considered satisfactory for this application.

Using the VS and IA servers, the average content adaptation times were 5.3 ms and 2.5 ms, respectively. The servers are therefore effective, since they caused a negligible delay in the content delivery time. However, the CF server introduced a 40 ms delay, which can be considered relevant.

5. CORRELATED WORKS

In the mid-90s, content adaptation work concentrated on adaptations in the proxy itself [18, 6]. The disadvantage of this technique lies in the accumulation of adaptations of the same content, which can overload the proxy. After the creation of the OPES network model, which distributes adaptations among dedicated servers, it became feasible to build a single architecture offering several types of adaptation. However, to prevent this architecture from carrying out all the adaptations available, an adaptation policy is required [2]. [4] presented an architecture for executing content adaptations, which contains a decision-making mechanism based on a set of conditions. These conditions and related actions constitute the adaptation rules, which were specified through IRML. However, these conditions were limited because they did not use information about the adaptation environment. [16] proposed a framework to manage the personalization of services, whose adaptation policy is based on a combination of user preferences, device constraints, and content characteristics.

[13] presented an adaptation architecture designed specifically for mobile devices, which offers image, audio, and text compression adaptations. That work also involved the use of access network information for decisions about the best adaptation.

Unlike the aforementioned works, this paper presents an extremely flexible framework thanks to the reuse of its components. Moreover, the framework contains an adaptation policy based on user, device and service agreement profiles and on information about the access network and content.

6. CONCLUSIONS AND FUTURE WORKS

This article presented a component-based framework for the Internet content adaptation domain, and described an experiment reusing this framework. In this experiment, the framework was instantiated using the ICAP and HTTP protocols and a Prolog-based inference mechanism was included in the implementation of the adaptation policy.

This mechanism facilitates the maintenance of the adaptation rules, making the framework’s decision process highly flexible. With regard to its performance, we demonstrated that the insertion of this mechanism satisfies the user’s expectations in terms of content delivery time.

This framework will be used in future works, focusing on a wide diversity of content adaptations. Our goal is to make it possible to gradually refine its components in order to serve an increasing number of applications. We believe that, as the framework becomes increasingly generalized, it will be possible to define software standards for the Internet content adaptation domain.

7. REFERENCES

[1] Akamai Technologies. Internet Bottlenecks: the case for Edge Delivery Services, 2000.

[2] Barbir, A. et al. Policy, Authorization, and Enforcement Requirements of the OPES. IETF Request for Comments 3838, 2004.

[3] Beck, A.; Hofmann, M. Example Services for Network Edge Proxies. Internet Draft, 2000.

[4] Beck, A.; Hofmann, M. Enabling the Internet to deliver content-oriented services. Sixth Int. Workshop on Web Caching and Content Distribution, USA, 2001.

[5] Beck, A.; Rousskov, A. P: Message Processing Language. OPES WG, IETF Internet Draft, 2003.

[6] Bharadvaj, H.; Joshi, A.; Auephanwiriyakul, S. An Active Transcoding Proxy to Support Mobile Web Access. In Proceedings of the 17th Symposium on Reliable Distributed Systems, IEEE, pp. 118-123, 1998.

[7] Buchholz, S.; Schill, A. Adaptation-aware web caching: Caching in the future pervasive web. In Proceedings of the 13th GI/ITG Conference Kommunikation in verteilten Systemen (KiVS), Leipzig, Germany, 2003.

[8] Elson, J.; Cerpa, A. Internet Content Adaptation Protocol. IETF Request for Comments 3507, 2003.

[9] Forte, M.; Souza, W.; Prado, A. A content classification and filtering server for the Internet. In Proceedings of the 21st Annual ACM Symposium on Applied Computing, Vol. 2, pp. 1166-1171, Dijon, France, 2006.

[10] Hofmann, M.; Beck, A. IRML: A Rule Specification Language for Intermediary Services. IETF Internet Draft, 2001.

[11] ICAP Server by Network Appliance. http://www.i-cap.org/docs/.

[12] McHenry, S. et al. OPES Use Cases and Deployment Scenarios. IETF Internet Draft, 2001.

[13] Marques, M.; Loureiro, A. Adaptation in mobile computing. In Proceedings of the 21st Brazilian Symposium on Computer Networks (SBRC), 2004.

[14] Mastoli, V.; Desai, V.; Shi, W. SEE: A Service Execution Environment for Edge Services. In Proceedings of the Third IEEE Workshop on Internet Applications (WIAPP'03), San Jose, CA, 2003.

[15] Proxy Shweby. http://shweby.sourceforge.net/.

[16] Ravindran, G.; Jaseemudin, M.; Rayhan, A. A management framework for service personalization. In Proceedings of the 5th IFIP/IEEE International Conference on Management of Multimedia Networks and Services, USA, pp. 276-288, 2002.

[17] Rousskov, A. OPES Callout Protocol (OCP) Core. IETF Request for Comments 4037, 2005.

[18] Smith, J.; Mohan, R.; Li, C. Content-based Transcoding of Images in the Internet. In Proceedings of the IEEE International Conference on Image Processing, Chicago, USA, pp. 7-11, 1998.

[19] Souza, W. et al. Adaptação de conteúdo baseada em perfis de dispositivo, conteúdo, usuário e serviço de rede. In Proceedings of the 20th Brazilian Symposium on Computer Networks (SBRC), Vol. 2, pp. 554-568, 2002.

[20] Tomlinson, G. et al. A Model for Open Pluggable Edge Services. IETF Internet Draft, Work In Progress, 2001.

[21] Vakali, A.; Pallis, G. Content Delivery Networks: Status and Trends. IEEE Internet Computing, Vol. 7, No. 6, pp. 68-74, 2003.
