Profile based Caching to Enhance Data Availability in Push/Pull Mobile Environments

Ravindra Kambalakatta, Mohan Kumar and Sajal K. Das
Center for Research in Wireless Mobility and Networking (CReWMaN)
Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, TX 76019, USA
{kambala, kumar, das}@cse.uta.edu

Abstract
Caching techniques have been successfully employed to overcome some of the problems posed by disconnection and/or limited bandwidth in mobile environments. Demand-driven and prefetching techniques for maintaining optimal caches have had limited success in such environments. In this paper, we develop profile based methods to maintain and enhance data availability in mobile clients' caches. Profiling techniques determine the data items to be prefetched and cached depending on the user group and context. We have developed a prototype system that demonstrates the applicability of the proposed scheme in a crisis management situation. Results of implementing the above technique in experimental mobile environments show that profile based caching can significantly enhance data availability in mobile and ubiquitous environments.
1. Introduction
In communication constrained situations, traditional demand driven approaches alone are found to be inadequate for ensuring data availability [6]. Various caching techniques have been successfully employed to enhance data delivery in mobile distributed systems prone to disconnection and limited bandwidth. Caches maintained at mobile devices can mitigate the effects of disconnection or scarcity of bandwidth. Profiling, currently used in many applications to deliver user specific information, can be extended to manage caches on mobile devices, as demonstrated in [2]. In this paper, we employ user profiles as hints for prefetching data of interest to the user(s). User profiles may be used in many scenarios to provide user(s) with the most relevant data. For example, consider a fire accident scenario where emergency personnel such as firefighters and paramedics are working together to minimize the effects of a crisis. Critical information such as entry and exit paths,
availability of oxygen masks for firefighters, vacancies in hospitals, and traffic information can be vital to the smooth functioning of these emergency personnel. It is desirable to have a proactive system that can provide critical information just in time. Using the notion of user profiles in the fire accident scenario results in providing the most relevant critical information to emergency personnel. Also, the passive prefetching scheme [5] can be employed in mobile devices to prefetch data that are likely to be accessed in the future, to reduce access latency and fully utilize the scarce wireless bandwidth. Passive prefetching employs a data admission policy for client caching based on relative utility. It is extremely important to decide what data to store locally and what data is of more value on mobile devices, as irrelevant information could degrade performance. Traditional caching techniques such as LRU and MRU do not reflect user priorities and do not take into consideration a user's changing requirements. In addition to an efficient push based caching technique, it is also important to consider the dynamicity of the situation. Data pushing mechanisms [7] do not take into account user(s) needs and the availability of resources at the user's terminal. There is a need for an efficient context aware mechanism at the user's terminal to assess the importance of the pushed data to the user. Efficient caching of pushed data based on profiles helps hide access latency. A combination of pushing data based on user(s) profiles, passive prefetching, and efficient caching of pushed data using a greedy replacement algorithm at the client's device can effectively reduce the bandwidth load. In this paper, we propose a profile based caching mechanism to enhance data availability at mobile devices. Various issues associated with profile descriptions, formulations and processing are addressed.
In our scheme, caching data items at the client’s device is different from traditional caching techniques due to the incorporation of profiles of groups of user(s) into cache management schemes. We employ extensible markup language (XML) [9] to
Proceedings of the First Annual International Conference on Mobile and Ubiquitous Systems: Networking and Services (MobiQuitous’04) 0-7695-2208-4/04 $20.00 © 2004 IEEE
express profile semantics and a document type definition (DTD) to validate that the XML document is well formed. User(s) profiles are formulated using user or system (or both) feedback, depending on user interest and system capabilities. Users provide feedback to update profiles. The system also infers the interests of the users by running a simple algorithm on the log file to extract relevant information. Algorithms for updating profiles and for greedy replacement of cached data at the client's device are developed. Broadcast data is cached only if it is important to the user. A prototype depicting a crisis management scenario has been implemented and tested in real time. The results obtained show that our scheme performs better than schemes such as LRU with and without profiling. Our scheme improves efficiency, enabling high data availability at the client's device with a high hit ratio and low average turn around time. The rest of the paper is structured as follows. An overview of profiling and caching in mobile devices is presented in Section 2. Following this, we discuss our system architecture in Section 3. In Section 4 we present implementation details, while in Section 5 experimental results are discussed. Finally, future work and conclusions are presented in Section 6.
2. Background work
2.1 Push/Pull architecture
A mobile environment consists of a downlink and an uplink communication channel. The battery power consumption during uplink communication is higher than that during downlink communication [20]. In typical client-server models there are two basic modes of data dissemination [7]: 1. Pushing Mode (Push): data is multicast periodically on the downlink channel; access to multicast data does not require uplink transmission. 2. On-Demand Mode (Pull): the client requests a data item through the uplink channel and the server responds by sending the data to the client on the downlink channel. Broadcasting the most frequently accessed data items (hot data) saves bandwidth, as clients do not have to make requests over the uplink channel. However, the average turn around time of the push operation is high, especially when large data files are broadcast, because the client has to wait on average half a broadcast cycle to get the required data. On the other hand, in on-demand mode the server delivers data immediately to the client. The on-demand mode of information dissemination is very popular in mobile environments. In addition, caching
frequently accessed data items on the client side is an important technique to reduce the number of uplink messages and the number of downloads for a data item. The effectiveness of a caching technique is measured in terms of average turn around time and hit ratio.
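As a rough illustration of the two metrics named above, a client-side cache wrapper can track the hit ratio and average turn-around time per request. This is a minimal sketch under our own naming (the class, the fixed lookup cost, and the fetch-cost parameter are illustrative assumptions, not part of the paper's system):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical client-side cache wrapper tracking the two cache
// effectiveness metrics: hit ratio and average turn-around time.
public class CacheMetrics {
    private final Map<String, byte[]> cache = new HashMap<>();
    private int hits = 0, requests = 0;
    private long totalTurnAroundMs = 0;

    // Record one request: a cache hit costs only the local lookup;
    // a miss additionally pays the (simulated) uplink fetch cost.
    public byte[] request(String id, long fetchCostMs) {
        requests++;
        if (cache.containsKey(id)) {
            hits++;
            totalTurnAroundMs += 1;           // local lookup cost (assumed 1 ms)
            return cache.get(id);
        }
        totalTurnAroundMs += 1 + fetchCostMs; // lookup + fetch from the server
        byte[] data = new byte[0];            // placeholder for the fetched item
        cache.put(id, data);
        return data;
    }

    public double hitRatio() {
        return requests == 0 ? 0 : (double) hits / requests;
    }

    public double avgTurnAroundMs() {
        return requests == 0 ? 0 : (double) totalTurnAroundMs / requests;
    }
}
```

A caching technique that raises the hit ratio lowers the average turn-around time, since misses pay the additional fetch cost.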
2.2 Profiling issues
Profiling has been employed to improve the performance of data dissemination mechanisms in distributed systems, including the Internet. There are several types of profiles: user profiles, application profiles, device profiles, etc. In this paper, we focus our attention on user profiles. In mobile and ubiquitous environments, user profiles can be effectively used to provide information about users' location and situation. A profile can be used to specify objects of interest to the user(s), personal details of the user(s), and perhaps the user(s) device details. Profiling can also be used to specify the utility (importance) of data items to a group of users distributed geographically or temporally. Device profiles can include such information as processor/battery power, memory capacity, and display capabilities, so that the system can push data based on the device details of the users. User profiles for text-based data have been extensively investigated in the context of information filtering and selective dissemination of information research [4]. Information retrieval (IR) techniques are used for filtering unstructured text-based documents [11]. In general, IR profile systems use either a Boolean model or a similarity-based model. In the Boolean model, a user profile is constructed by combining keywords with Boolean operators (e.g., AND, OR, NOT), and "exact match" semantics are used to determine whether a document satisfies the predicate or not. In the similarity-based model, a document whose similarity to a profile is above a certain threshold is said to match that profile. The language with which a profile is expressed is a key component in defining profiles; the expressive power of that language determines to what extent the profiles will be useful [1]. The work in [4] does not consider the rich context of the user(s) in the profile. Typically, there is a trade off between the expressive power of a profile language and the ability to process it efficiently.
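The Boolean model described above can be sketched in a few lines: keywords are combined with AND/OR operators and matched against a document with exact-match semantics. The class and method names below are illustrative, not from any cited system:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative Boolean-model profile matcher: keywords combined
// with AND (matchesAll) or OR (matchesAny), exact-match semantics.
public class BooleanProfile {
    private static Set<String> words(String document) {
        return new HashSet<>(Arrays.asList(document.toLowerCase().split("\\W+")));
    }

    // A document satisfies (kw1 AND kw2 AND ...) if it contains every keyword.
    public static boolean matchesAll(String document, Set<String> keywords) {
        return words(document).containsAll(keywords);
    }

    // A document satisfies (kw1 OR kw2 OR ...) if it contains at least one.
    public static boolean matchesAny(String document, Set<String> keywords) {
        Set<String> w = words(document);
        for (String kw : keywords) if (w.contains(kw)) return true;
        return false;
    }
}
```

A similarity-based matcher would instead score the document against the profile (e.g., by weighted term overlap) and compare the score to a threshold.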
2.2.1 Expressing profiles
Any language for specifying profiles must allow data objects to be specified declaratively. The profile language must allow membership of a data object to depend on content and metadata [1]. Metadata-based criteria might examine a data item's format (e.g., school schedules should be text files), structure (as in the case of XML data, where structure might be specified with a document type definition), and source (e.g., the entry and exit paths of a building are taken from the official website of the construction company that built it). The profile language must also allow data objects to be determined dynamically, allowing a profile to: i) expand, if the profile manager finds new objects of interest to the user(s); or ii) contract, if a data object becomes stale, uninteresting, or undesired by the user. This may happen when a client's context (e.g., spatial) changes or if the user provides feedback to remove the data.
2.2.2 Device profiles
Device context deals with delivering information that the device is capable of rendering, as shown in Fig 1. For example, if a particular device is incapable of displaying HTML pages and can render only plain text, the server transcodes the information in the HTML page into plain text/WML format before delivering it to the client. This helps optimize bandwidth usage; the server thus saves time as well as network resources by being aware of the device capabilities. The device profile of the user/group is taken into account before pushing data to the users. For example, if the client's device supports only the GIF format for images, then all images are transcoded before being pushed. Also, if the client terminal has little memory, images are compressed, with a relative loss of quality, and then pushed to the client.
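The device-profile decisions described above (transcode on unsupported formats, compress on low memory, strip HTML for text-only devices) can be sketched as follows. The field names and the memory cut-off are our own illustrative assumptions:

```java
// Hypothetical device profile used to decide whether content must be
// adapted before being pushed, along the lines described above.
public class DeviceProfile {
    public final String supportedImageFormat; // e.g. "GIF"
    public final boolean supportsHtml;
    public final int memoryKb;

    public DeviceProfile(String imageFormat, boolean supportsHtml, int memoryKb) {
        this.supportedImageFormat = imageFormat;
        this.supportsHtml = supportsHtml;
        this.memoryKb = memoryKb;
    }

    // Transcode when the item's format is not one the device renders.
    public boolean needsTranscoding(String imageFormat) {
        return !supportedImageFormat.equalsIgnoreCase(imageFormat);
    }

    // Compress (with some quality loss) when the device is memory-poor;
    // the 512 KB cut-off is an assumption for illustration only.
    public boolean needsCompression() {
        return memoryKb < 512;
    }

    // HTML pages are reduced to plain text for limited devices
    // (a crude tag-stripping stand-in for real transcoding to text/WML).
    public String render(String htmlPage) {
        return supportsHtml ? htmlPage : htmlPage.replaceAll("<[^>]*>", "");
    }
}
```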
Fig.1 Device Profile for the group
2.2.3 XML to express profiles
With the advent of XML, filtering of web documents based on structure as well as content has become more feasible [11]. XML is the most popular and widely used language to represent data on the web. Data restructuring and integration is made simpler by XML's use of semantically meaningful data tags. The XFilter system [13] is a recent example of a filtering system; unlike existing profile models, however, XFilter does not support variable utility. A profile system must be capable of handling a very large number of users in the presence of large amounts of data. Among the issues involved are handling context aware data and evaluating the interests and disinterests of the users. In handling context aware data, a user's (group's) context can be quite rich, consisting of attributes such as physical location, personal history, temporal data, etc. Reacting to such data can be tricky at times, as the system needs to draw a line between proactivity and transparency. Processing context data can be difficult, particularly when location is involved, as it depends on the infrastructure support.
2.2.4 Profile formulation
Profiles can be created by users or by the system. In the former case, the user(s) specifies his/her area(s) of interest in the form of a list of (possibly weighted) terms that are used to guide the filtering process. Profiles can also be created by the system by extracting information from users' activity logs. A set of data items which have already been identified by the user as relevant is analyzed in order to identify the most frequent and relevant interests of the users. The interests identified are then weighted according to the frequency of their appearance, and these constitute the user/group profile. A combination of the above two methods can be employed when the user interacts with the system to create a profile. First, an initial profile is given by the user(s), which is regarded as the default profile. The system then infers the interests of the user(s) based on activities and updates the interests accordingly. Users may provide feedback [19] stating their interests/disinterests to update the profile (by adding or deleting terms, and changing their weights).
Chng et al. [3] present a greedy resource allocation algorithm that allocates resources based on availability. The paper does not reveal details of how the profiles are built interactively with the users, or how the system dynamically updates the interests of users. Cherniack et al. [1] discuss data recharging based on profiles, but do not discuss in detail how the profiles are formulated.
2.3 Caching issues
Profile information has also been used in a limited way to direct the management of caches in mobile environments. An example of early work is the "quasi-caching" system of Alonso et al. [17], in which
Proceedings of the First Annual International Conference on Mobile and Ubiquitous Systems: Networking and Services (MobiQuitous’04) 0-7695-2208-4/04 $20.00 © 2004 IEEE
specification of user requirements in terms of data quality is used to reduce the amount of data sent from a server to update client caches. More recently, a cache maintenance technique that exploits user specifications of preferences, in terms of the tradeoff between latency and recency, is presented in [18]. Cache replacement policy is widely studied in web proxy caching, for which deterministic and random replacement algorithms have been proposed in the literature [16]. In the random replacement algorithm, data objects are chosen randomly for eviction from the cache. An improved randomized algorithm draws N sample documents from the cache and evicts the least useful document among the samples. The randomized algorithm cannot be expected to perform well in mobile devices, as the cache in a mobile device is very small compared to a web proxy cache. In traditional systems, the calculation of utility values for data items used in a replacement algorithm considers only the recency, frequency, size, and cost of fetching a document. In recent years, however, providing user specific information has become very important and critical, so the use of profiles in cache management leads to efficient caching of data items relevant to the user(s).
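The improved randomized algorithm mentioned above (draw N samples, evict the least useful among them) can be sketched as follows; the naming is ours, not the cited implementation's:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Sketch of the sampled randomized replacement policy: draw N sample
// entries from the cache and pick the least useful one as the victim.
public class RandomizedEviction {
    // cacheUtilities maps each cached item id to its utility value.
    public static String pickVictim(Map<String, Double> cacheUtilities, int n, Random rng) {
        List<String> keys = new ArrayList<>(cacheUtilities.keySet());
        String victim = null;
        double worst = Double.MAX_VALUE;
        for (int i = 0; i < n; i++) {
            String candidate = keys.get(rng.nextInt(keys.size()));
            double u = cacheUtilities.get(candidate);
            if (u < worst) { worst = u; victim = candidate; }
        }
        return victim; // least useful item among the N samples
    }
}
```

With N small relative to the cache, this avoids scanning the whole cache; on a mobile device the cache is so small that an exact scan (as in the greedy scheme proposed in this paper) is affordable anyway.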
3. The proposed scheme
Profiles expressed in XML are processed for purposes of querying and/or updating. XML documents can be processed using either the Simple API for XML (SAX) [9] or the Document Object Model (DOM) [12]; in our system we use DOM. The DOM platform is a language-neutral interface that provides a standard model of how the objects in an XML document are put together, and a standard interface for accessing and manipulating these objects. DOM allows the creation and modification of documents in memory, as well as reading a document from an XML source file. DOM is the better choice for modifying an XML document and saving the changed document to memory. Also, if random access to information is crucial, it is better to use DOM to create a tree structure of the data in memory, which leads to efficient processing. Using the above technique, profiles with rich context can be processed easily. For profile formulation, both user and system feedback are considered; profile formulation involves users in the profile creation and updating process. For maintaining the cache, a greedy replacement algorithm at the client side is employed. Utility values for each data item are calculated and assigned. The parameters considered for computing the utility value are: the size of the data item, the access count (number of times the item is accessed) and the
profile utility. Each parameter in the calculation is given a weight. Every new data item's utility value is calculated first, and the item is admitted into the cache if its relative utility value is higher than that of some data item already in the cache.
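The DOM-based profile processing described above might be sketched as follows. The `interest` element name is illustrative, not the paper's actual schema, and the parser classes are the JDK's standard `javax.xml.parsers` API:

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Sketch of DOM-based profile processing: the profile XML is parsed
// into an in-memory tree, which can then be queried (and updated).
public class ProfileDom {
    public static List<String> readInterests(String profileXml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(profileXml.getBytes(StandardCharsets.UTF_8)));
            // Illustrative tag name; a real profile would follow the DTD.
            NodeList nodes = doc.getElementsByTagName("interest");
            List<String> interests = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                interests.add(((Element) nodes.item(i)).getTextContent().trim());
            }
            return interests;
        } catch (Exception e) {
            throw new RuntimeException("profile parse failed", e);
        }
    }
}
```

Because the whole tree lives in memory, the profile manager can modify nodes in place and serialize the document back out, which is why DOM suits profile updating better than the event-driven SAX.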
Fig.2 System Architecture

Algorithms for updating the profiles, handling feedback from the user, extracting information from the user's log for system feedback, and cache replacement at the client's handheld are discussed in Section 4.
3.1 System Architecture
This section presents the proposed system architecture, consisting of a main server and a remote server, as shown in Fig.2. Data is distributed among the main and remote servers. The main server contacts the remote server if requested data is not in its local cache. The main server is responsible for handling requests from users of different groups. The system manager in the main server is responsible for managing the other modules in the main server; it is also responsible for communicating with the clients (users), serving their requests, and pushing data to the group. The diagrammatic representation of the main server is shown in Fig.3.
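The two-tier request flow just described (serve from the main server's local store when possible, fall back to the remote server otherwise) might be sketched as follows; class and method names are our own:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the main/remote server data distribution: the main server
// answers from its local store and falls back to the remote server.
public class MainServer {
    private final Map<String, String> localCache = new HashMap<>();
    private final Map<String, String> remoteServer; // stands in for the remote store

    public MainServer(Map<String, String> remoteStore) {
        this.remoteServer = remoteStore;
    }

    public void storeLocal(String id, String data) {
        localCache.put(id, data);
    }

    // Handle a client pull: local cache first, then the remote server.
    public String handleRequest(String id) {
        String data = localCache.get(id);
        if (data == null) {
            data = remoteServer.get(id);                 // contact remote server
            if (data != null) localCache.put(id, data);  // keep for later requests
        }
        return data; // null if neither server holds the item
    }
}
```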
Fig.3 System Architecture of Main Server
The Profile Manager (PM) is mainly responsible for managing the different group profiles by dynamically updating them. The PM is also responsible for pushing data to users based on context and passive prefetching. The Context Manager (CM) is responsible for providing context data to the system. The system uses time for all activities managed by the context manager; the time can be incremented by an offset to cover all time intervals of the users' activities. The CM can also provide location information for spatial context awareness. The Feedback Manager (FM) module is responsible for handling feedback from the users in the group; users can enter their interests with context information. The Group Interest Extractor extracts data from the users' log to infer the corresponding group interests/disinterests; it uses a simple algorithm to infer user interests and to update the group profile. The Cache Manager at the client's side is responsible for managing the client's cache; it uses a greedy replacement algorithm for admitting new data items.
4. Implementation details
Java is used for both the client and server side implementations. A new thread is forked at the server to handle every new client. Our implementation is driven by group profiles comprising the necessary details, push and pull combinations, passive prefetching, and greedy caching at the clients' side. The profile manager, context manager, feedback manager and group interest extractor are implemented as singletons in the system. For pushing data to different groups, the publisher/subscriber design pattern is used: each client subscribes to the group the user is interested in. Our system supports both push and pull. Initially, profiles for the different groups are built by the system using the information provided by users of each group. Profiles are expressed in XML, giving semantic meaning to the data in the profile, and validated to be well-formed by specifying a DTD. The profiles are formulated using both user(s) and system feedback. Clients pull (request) data items from the server; the requests generated by clients are skewed towards the profile using a biasing algorithm. The server push can be one of two types: context aware data push, where relevant data is pushed to the users (group) based on temporal context, and passive prefetching. Users in a group can dynamically update the profile using the feedback interface, which allows them to enter both interests and disinterests. Users can also receive updated information every n time units (e.g. traffic information every 5 minutes). Every request activity of the user(s) is captured in a log file, which is then used to infer group interests that are then incorporated into the corresponding profile. The group profile consists of group details (e.g., for a paramedic group, the details of the company they work for), interests for the group with priority, context information, disinterests for the group, and device details of users in the group. The group profile is expressed as shown in Fig 4.
In our system, device details consist of image, display (text/wml) and memory details. Based on the memory details, a decision is taken whether or not to compress images. Based on image displaying capabilities, images are transcoded to suit client device capabilities.
Fig. 4 Profile of the Paramedic group
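Fig. 4 itself is not reproduced here; a hypothetical profile along the lines described in the text might look like the following fragment. All tag and attribute names are illustrative, not the paper's actual DTD:

```xml
<profile group="paramedic">
  <details company="CityEMS"/>
  <interests>
    <interest priority="high" frequency="10min">hospital vacancies</interest>
    <interest priority="medium">traffic information</interest>
  </interests>
  <disinterests>
    <item>sports news</item>
  </disinterests>
  <device image="GIF" display="wml" memoryKB="256"/>
</profile>
```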
4.1 Profile manager
The profile manager is mainly responsible for managing the various group profiles of the system. A separate XML document expressing the profile is maintained for each group, thus reducing document query time and hence the processing time. Group profile information can be updated either through feedback from the clients or through the group interest extractor output, which is the system's inference of group interests. Context awareness of the group's profile information is also managed by querying the profiles. The following example illustrates the profile manager's function. Imagine a rescue team in a fire accident scenario. Different groups of emergency personnel, such as paramedics, firefighters, and police, may be involved in handling the accident scene. Each group of users desires different sets of information/data and has different types of context information requirements. For the paramedics group, the information required may be about medical equipment or the number of vacancies in nearby hospitals, say every 10 minutes. For the firefighters group, information about the number of available oxygen masks and entry/exit routes, say every 5 minutes, may be of interest. In other words, the group profile for paramedics differs from that for firefighters.
Fig.5 Interests of Paramedic Group

Fig.6 Profile Manager

4.2 Context manager
This module is responsible for providing context information to the system. The context information can be spatial (location), temporal (time), etc. The module maintains a clock (temporal parameter) which is used by all the other modules in the system; the profile manager uses this clock to check for temporal contexts. The time can be incremented by an offset to cover all time intervals of the users' activities. The context manager also has a placeholder to handle spatial context. Context aware data for the different groups are checked continuously in order to provide information at the right time. For example, in a fire accident scenario, paramedics may need traffic information, say every 10 minutes, to transport patients to the nearby hospital. Graduate students at a university may be interested in the topics to be covered in class that day and may need that information, say at 15:15 every Thursday; so the temporal context is checked between 15:00-15:30 and again during 15:30-16:00.

4.3 Group interest extractor
This module is responsible for extracting information from the users' log to infer the group interests/disinterests. A simple algorithm is used to infer users' interests and update the corresponding group profile. Details such as the interest, timestamp, and day of the week are captured in the log file. The interest data extracted is processed and updated into the profile of the group. In the interest extraction process, two types of threshold values are used: i) the extraction threshold, a measure of how often the extraction process is undertaken. For example, in a fire accident scenario, the log file is processed frequently (say every hour) to infer the interests of the different groups involved; in a school scenario, where the groups identified are under-graduate and graduate students, faculty, etc., the extraction threshold can be very high (once a day); ii) the extraction interest threshold, a measure of the system's recognition of the interests of various groups. Evaluating a value for this threshold can be application-specific. The algorithm is shown in Fig. 9.
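The counting step of the extraction process can be sketched as follows: tally how often each item appears in the group's request log since the last extraction run, and promote items whose count crosses the extraction interest threshold. The names are ours, and the threshold semantics (a simple count) are an assumption for illustration:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Sketch of log-based group interest extraction: count item requests
// and keep those at or above the extraction interest threshold.
public class GroupInterestExtractor {
    // logEntries: data items requested by group members since the last
    // extraction run (the run itself is scheduled by the extraction threshold).
    public static Set<String> extractInterests(List<String> logEntries,
                                               int interestThreshold) {
        Map<String, Integer> counts = new HashMap<>();
        for (String item : logEntries) counts.merge(item, 1, Integer::sum);
        Set<String> interests = new TreeSet<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= interestThreshold) interests.add(e.getKey());
        }
        return interests;
    }
}
```

The resulting set would then pass through the Fig. 9 logic, which checks each candidate against the existing interest and disinterest lists before updating the profile.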
4.4 Feedback manager
Feedback from a client is handled through the user interface. To enter an interest, a user of the group provides: the interest itself, the priority of the interest (utility value of the data item), the day of the week, single/multiple context data (multiple, e.g., push weather/news info every 10 minutes), the time (or time-to-last, in the multiple case), and finally the frequency of data push (in the multiple case). For disinterest data, only the keyword/data is entered. The threshold for disinterest is given by Disinterest Feedback Threshold = No_of_Users * Weight_of_Disinterest (say 40%). The feedback interface at the client's GUI is shown in Fig. 7, and Fig. 8 shows how feedback from users is handled.
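The disinterest rule quoted above can be made concrete in a few lines: an item becomes a group-level disinterest only once the number of users flagging it exceeds No_of_Users times the disinterest weight (40% in the paper's example). The class name is illustrative:

```java
// Sketch of the disinterest feedback threshold:
// Disinterest Feedback Threshold = No_of_Users * Weight_of_Disinterest.
public class FeedbackThreshold {
    public static double disinterestThreshold(int numUsers, double weightOfDisinterest) {
        return numUsers * weightOfDisinterest;
    }

    // The item is treated as a group disinterest once the number of
    // users flagging it exceeds the threshold.
    public static boolean isGroupDisinterest(int flagCount, int numUsers, double weight) {
        return flagCount > disinterestThreshold(numUsers, weight);
    }
}
```

With 10 users and a 40% weight, the threshold is 4, so a fifth flag tips the item into the group's disinterest list.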
4.5 Cache manager at client terminal One of the key responsibilities of a client cache manager is to determine which data items should be retained in the cache, given limited cache space. Such
decisions are made using a greedy cache replacement policy: each item is given a utility value, and when space must be made available in the cache, the item(s) with the least value are chosen as replacement victims. The value function for cache items in our system is based on access count, size, and profile utility. The cache is implemented as a hash table with the data item id as the hash key and the object as the hash value; the size, utility value, and image flag are wrapped in the object. The utility value is calculated as:

Utility = Weight_OF_Profile * Profile_Utility + Weight_OF_AccessCount * AccessCount + Weight_OF_Size * (1/Size)

where AccessCount is the number of times the item is accessed, Profile_Utility is the utility value of the data item from the server (BS), Size is the size of the data item, and Weight_OF_Profile + Weight_OF_AccessCount + Weight_OF_Size = 1. Each weight indicates the importance of the corresponding parameter. For example, in the fire scenario, an image file showing the entry and exit maps of the building can be very useful to the firefighters, so Weight_OF_Size should be given more importance than Weight_OF_AccessCount. The greedy replacement algorithm at the client's cache is given in Fig.10.
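The utility function and the greedy admission test can be sketched directly from the formula above. The weight values in the usage below are illustrative; the paper only requires that they sum to 1:

```java
import java.util.Map;

// Sketch of the utility function and greedy admission test: a new
// item is admitted only if its utility exceeds that of the least
// useful item already in the (full) cache.
public class GreedyCache {
    // Utility = Wp*Profile_Utility + Wa*AccessCount + Ws*(1/Size)
    public static double utility(double wProfile, double profileUtility,
                                 double wAccess, int accessCount,
                                 double wSize, double sizeKb) {
        return wProfile * profileUtility
             + wAccess * accessCount
             + wSize * (1.0 / sizeKb);
    }

    // Admit the candidate only if it beats the least useful cached
    // item (which would then be the replacement victim).
    public static boolean admit(Map<String, Double> cacheUtilities,
                                double candidateUtility) {
        double min = Double.MAX_VALUE;
        for (double u : cacheUtilities.values()) min = Math.min(min, u);
        return candidateUtility > min;
    }
}
```

Note the 1/Size term: smaller items score higher, so raising Weight_OF_Size (as suggested for the firefighters' map image) only helps large items if their profile utility is weighted strongly too.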
4.6 Mobility
When the user(s) moves to a new location, the user should still be supplied with relevant data based on the existing user profile, so how the profile is updated or moved on such a move is important. The remote server currently serving the user(s) can contact the main server for the profile. The profile at the remote server is checked for access recency, and if the profile has been updated, the old copy is replaced with the new one. Imagine a graduate student moving from one building to another to attend a class. As he enters the new building, he is interested in knowing the topics to be covered in class that day, so the profile has to be updated frequently in order to provide the student with the most relevant data. When the student enters the new building, the server at the new location contacts the home server, compares its local copy of the profile with the profile document at the home server, and updates its copy if necessary. Results for disconnection experiments are given in Section 5.

Fig. 7 Feedback Interface at client's GUI

DisinterestThreshold: the threshold value above which disinterest data is updated into the profile information.
Interest data items:
  For each interest Data_Item from Feedback
    If (Data_Item not in Profile && Data_Item not in Disinterest List)
      Add Data_Item to the Profile
Disinterest data items:
  For each disinterest Data_Item from Feedback
    Increment DisinterestCount[Data_Item]
    If (DisinterestCount[Data_Item] > DisinterestThreshold)
      If (Data_Item in Profile || Data_Item in Interest List)
        Remove Data_Item and add it to the Disinterest List

Fig 8. Update profile by feedback manager

New_List = Group_Interests(Log file)
For each item in New_List (interested_data, disinterested_data):
  If (interested_data found in Interested_List)
    Increase utility_value of interested_data in the Interested_List
  If (interested_data not in Disinterested_List) && (count_of(interested_data in New_List) > Threshold)
    Insert into Interested_List with a utility value
  If (interested_data found in Disinterested_List)
    Remove it from the Disinterested_List and insert into the Interested_List
  If (disinterested_data found in Interested_List OR New_List)
    Remove it from the Interested_List and insert into the Disinterested_List

Fig.9 Group Interest Extractor Algorithm
Greedy Replacement (Data_item, Mode)
Begin
  If (Cache not full)
    CalculateUtilValue(Data_item)
    Cache Data_item
  Else If (Cache full && Mode = Request)
    Update utility values of items in Cache
    Decide whether or not to cache Data_item
  Else If (Cache full && Mode = Broadcast)
    If (Data_item from Profile)
      If (Data_item in Cache)
        Increase the utility value of Data_item
      Else  // Data_item not in Cache: definite cache
        Utility(Data_item) = (MaxUtil + MinUtil) Sum(Utility(all items in Cache)) + Min(Util in Cache)
      Return true
  Return false
End
Fig.10 Greedy Replacement at Client Cache

5. Experimental results
Experiments were carried out to evaluate the performance of our implemented system in terms of the following parameters:
• Average turn-around time (request time + time to check the cache + time to fetch the data item if it is not in the cache), and
• Hit ratio

5.1 System parameter settings
All tests were run for 125 requests each by all the 6 users in the real-time environment and the average value was determined. The distribution of sizes for data items is uniform. In our experiments we use 6 mobile clients and 250 data objects. The data object size varies from 650 bytes to 16.15 Kbytes, and the total database size is 2.5 Mbytes. The number of requests made is 125. The average size of a data item is 10 Kbytes. The metadata sent with each data item from the server is 150 bytes.
In order to simulate the profiles of users, a profile range was identified for different groups. The requests generated were biased to simulate group interests. Experiments conducted also included variation of the bias towards the profile. In order to cover all the activities of the users, the system maintained its own clock. In our experiments, Windows based laptops are used as client terminals.

5.2 Analysis of experimental results
Several tests were run with different biases towards the profiles. All results are compared against the LRU cache replacement policy: LRU without profiling and LRU with profiling are used as baselines for our scheme.

5.2.1 Experiments
The hit ratio and the average turn-around time were measured with cache sizes ranging from 62.5K to 250K. The requests by the users are biased towards the profile; our experiments were run with 40-60% bias.

Fig.11 Hit Ratio vs. Cache Size (Bias 40%)

Fig.12 Hit Ratio vs. Cache Size (Bias 60%)
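The biased request stream used in the experiments above could be reproduced with a sketch like the following. The 250-object database and 125 requests per user come from Section 5.1; the size of the profile range and the random-selection details are assumptions.

```python
# Sketch of biased request generation and hit-ratio measurement.
# PROFILE_RANGE and the drawing scheme are assumptions for illustration.
import random

NUM_OBJECTS = 250                # database size from Section 5.1
PROFILE_RANGE = range(0, 50)     # assumed: 50 objects match the group profile
BIAS = 0.40                      # 40% of requests fall inside the profile

def generate_requests(n, rng):
    """Return n object ids; a BIAS fraction is drawn from the profile range."""
    reqs = []
    for _ in range(n):
        if rng.random() < BIAS:
            reqs.append(rng.choice(list(PROFILE_RANGE)))
        else:
            reqs.append(rng.randrange(NUM_OBJECTS))
    return reqs

def hit_ratio(requests, cached_ids):
    """Fraction of requests served from the set of cached object ids."""
    hits = sum(1 for r in requests if r in cached_ids)
    return hits / len(requests)

rng = random.Random(42)
reqs = generate_requests(125, rng)   # 125 requests per user, as in Section 5.1
print(hit_ratio(reqs, set(PROFILE_RANGE)))
```

Note that uniform (non-profile) draws can also land inside the profile range, so a cache holding exactly the profile objects yields a hit ratio somewhat above the nominal bias.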
Fig.13 Average Turn-around Time (ms) vs. Cache Size (Bias 40%)

Fig.14 Average Turn-around Time (ms) vs. Cache Size (Bias 60%)

Our experimental results show that the proposed profile based scheme achieves significant improvement over the LRU schemes (with and without profiling) in terms of both hit ratio and average turn-around time. The LRU scheme with profiling performs marginally better than LRU without profiling. In general, the performance of all schemes improves with cache size.

5.2.2 Disconnection experiments
We evaluate the performance of the proposed caching mechanism during disconnection. For the disconnection experiment, the prototype was run for 120 requests; a disconnection was then simulated and results recorded for 120 requests during disconnection. After reconnection, results were recorded for another 120 requests. The proposed mechanism is compared with LRU with profiling.

Fig.15 Hit Ratio vs. Time (Requests) with disconnection (C = 62.5K, Bias = 60%)

Fig.16 Hit Ratio vs. Time (Requests) with disconnection (C = 125K, Bias = 60%)
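The three-phase disconnection run described in the experiment above could be scripted as follows. This is a schematic harness, not the prototype: the cache is modeled as a simple set of object ids, and the 120-request phase lengths come from the text.

```python
# Sketch of the three-phase disconnection experiment (Section 5.2.2).
# The cache is a plain set of item ids; phase lengths follow the paper.

def run_disconnection_experiment(cache, requests):
    """Replay 360 requests in three 120-request phases (connected,
    disconnected, reconnected) and record the hit ratio per phase."""
    phases = [("connected", requests[:120]),
              ("disconnected", requests[120:240]),
              ("reconnected", requests[240:360])]
    results = {}
    for phase, reqs in phases:
        hits = 0
        for item_id in reqs:
            if item_id in cache:
                hits += 1              # served from the local cache
            elif phase != "disconnected":
                cache.add(item_id)     # fetched from the server, then cached
            # during disconnection a miss cannot be served at all
        results[phase] = hits / len(reqs)
    return results
```

During the disconnected phase only items already cached (e.g. prefetched via the profile) can be served, which is why profile based caching is expected to hold up better than LRU in that interval.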
The profile based scheme performs better than LRU with profiling during disconnection, due to the caching of data items relevant to the user(s).

5.2.3 Best case scenario
This experiment measures the response time in the best case scenario. The cache size is assumed to be equal to the total size of the data items on the server, and the bias is set at 100%. The simulation was run for 250 requests on six different mobile clients and the average turn-around time recorded.

Scheme                   Turn-around time (ms)
Profile based caching    32.7
LRU with profiling       68.5

Fig.17 Average turn-around time (Bias = 100%)
From the above experiment, it can be seen that our profile based scheme performs better than LRU with profiling in the best case scenario.
6. Conclusion and future work
In this paper we presented the design, implementation and experimental validation of a profile based caching mechanism for mobile environments. A novel feature of our implementation is the creation of profiles for groups of users; these profiles are used at servers to determine broadcast data items. We also developed algorithms for system inference of interests, for handling feedback from users, and a greedy replacement algorithm at the client's terminal. A working experimental system was implemented to carry out performance evaluation. Our results show that the proposed profile based mechanism outperforms LRU based schemes in terms of hit ratio and average turn-around time. We also extended the mechanism to mobility situations. Our future work includes extending the system to cellular networks. The system will also be evaluated for scalability in the presence of a large number of users and increased data sizes. It is also envisaged to enhance the system for multimedia caching.

Acknowledgements: The work carried out in this paper was funded by TXARP Grant Number 14771032 and National Science Foundation Grant Number 0129682.
7. References
[1] M. Cherniack, M. J. Franklin, and S. Zdonik, "Expressing User Profiles for Data Recharging," IEEE Personal Communications, Aug. 2001, pp. 32-37.
[2] M. Cherniack, E. F. Galvez, M. J. Franklin, and S. Zdonik, "Profile Driven Cache Management," Intl. Conf. on Data Engineering (ICDE), Bangalore, India, 2003.
[3] B. Chng, D. Poo, and J.-M. Goh, "A Hybrid Approach for User Profiling," 36th Annual Hawaii International Conference on System Sciences (HICSS'03), Big Island, Hawaii, Jan. 2003.
[4] T. Kuflik and P. Shoval, "Generation of User Profiles for Information Filtering: Research Agenda," ACM SIGIR 2000.
[5] H. Shen, M. Kumar, S. K. Das, and Z. Wang, "Energy Efficient Caching and Prefetching with Data Consistency in Mobile Distributed Systems," IEEE IPDPS, Santa Fe, NM, USA, Apr. 2004.
[6] M. Franklin and S. Zdonik, "'Data in Your Face': Push Technology in Perspective," ACM SIGMOD Intl. Conf. on Management of Data, Seattle, Washington, 1998, pp. 183-194.
[7] Q. Hu and D. L. Lee, "Cache Algorithms Based on Adaptive Invalidation Reports for Mobile Environments," Cluster Computing, 1(1): 39-50, 1998.
[9] http://www.w3.org/XML/
[10] A. Prabhu, G. Jolly, and S. Bhatkar, "Semantic Data Caching in Mobile Computing," Project Report, UMBC, May 2002.
[11] M. Altinel and M. J. Franklin, "Efficient Filtering of XML Documents for Selective Dissemination of Information," Proceedings of the 26th International Conference on Very Large Data Bases (VLDB 2000), Cairo, Egypt, Sept. 2000, pp. 53-64.
[12] Understanding DOM in XML, http://java.sun.com/xml/jaxp/dist/1.1/docs/tutorial/TOC.html
[13] N. J. Belkin and W. B. Croft, "Information Filtering and Information Retrieval: Two Sides of the Same Coin?" Communications of the ACM, 1992; T. W. Yan and H. Garcia-Molina, "The SIFT Information Dissemination System," TODS, 24(4): 529-565, 1999.
[14] D. Carney, S. Lee, and S. Zdonik, "Scalable Application Aware Data Freshening," Proceedings of the 18th International Conference on Data Engineering (ICDE), 2002.
[15] M. Satyanarayanan, "Pervasive Computing: Vision and Challenges," IEEE Personal Communications, vol. 8, no. 4, Aug. 2001, pp. 10-17.
[16] K. Psounis and B. Prabhakar, "Efficient Randomized Web-Cache Replacement Schemes Using Samples from Past Eviction Times," IEEE/ACM Transactions on Networking, 10(4): 441-454, Aug. 2002.
[17] R. Alonso, D. Barbara, and H. Garcia-Molina, "Data Caching Issues in an Information Retrieval System," TODS, 15(3): 359-384, 1990.
[18] L. Bright and L. Raschid, "Using Latency-Recency Profiles for Data Delivery on the Web," Proceedings of the 28th International Conference on Very Large Databases (VLDB 2002), Hong Kong, China, Aug. 2002.
[19] L. Kelly and J. Dunnion, "INVAID: An Intelligent Navigational Aid for the World Wide Web," IEE Two-day Seminar on Searching for Information: Artificial Intelligence and Information, 1999, pp. 14-17.
[20] D. Barbara and T. Imielinski, "Sleepers and Workaholics: Caching Strategies for Mobile Environments (extended version)," MOBIDATA: An Interactive Journal of Mobile Computing, 1(1), 1994.