A Test Data Generation Tool for Testing RFID Middleware

Haipeng Zhang, Wooseok Ryu, Bonghee Hong, Chungkyu Park
Department of Computer Engineering, Pusan National University
Busan 609-735, Republic of Korea
{jsjzhp, wsryu, bhhong, allan}@pusan.ac.kr

Abstract-This paper presents a novel RFID data generation tool that enables efficient, low-cost testing of RFID middleware. RFID systems are widely used in business areas such as supply-chain management and yard management. As a key component of an RFID system, the RFID middleware should be carefully evaluated under various environments. However, running a variety of tests on RFID middleware is a complex and costly process because it requires constructing test environments, which involves deploying RFID devices, tagging product items, and so on. For instance, stress testing of RFID middleware requires deploying millions of tags, which costs a great deal of money, time, and human resources. To solve these problems, we design virtual readers and tags that replace real devices in a virtual environment. Using these virtual devices, we implement a test data generation tool that generates tag stream data for testing RFID middleware. To simulate a real RFID environment, we first define several parameters that represent real environments. By configuring these parameters, the tool can generate test data sets that emulate various business scenarios. We also define several measurement metrics, related to four factors (group, redundancy, noise, and route), to verify the correctness of the generated data set. The experimental results show that our testing tool is capable of facilitating RFID middleware testing.

Keywords-RFID middleware; Test data generation; Software test
I. INTRODUCTION
RFID (Radio Frequency Identification) technology has established a new era of business optimization. With its inherent advantages based on smart identification, various enterprise systems, such as warehouse management, supply-chain management, and yard management, have been evolving into RFID-enabled systems [1][2]. Many businesses have begun to consider adopting RFID technology for higher business value. Adopting an RFID system requires various components, such as RFID tags, readers, and RFID middleware. Among them, the RFID middleware plays a very important role because it must maintain connections to many readers, collect a huge stream of identified tags from the readers, and filter the RFID streaming data. For example, one project of Hanmi Information Technology [3] uses 2 million tags for tracing pharmaceutical products, which means that the RFID middleware must be able to process a huge number of tags. The middleware is also responsible for receiving queries from applications and returning the results. As the performance of an RFID system mostly depends on the RFID middleware, testing the middleware is necessary before adopting RFID technology in a business environment.

This work was supported by the Grant of the Korean Ministry of Education, Science, and Technology (The Regional Core Research Program / Institute of Logistics Information Technology).

To test RFID middleware, a test environment must be built, which involves deploying RFID readers at business locations, tagging business items, and moving the items past the readers. This process is very complex and costly. For example, stress testing of RFID middleware requires millions of tags, and it is impractical to purchase and deploy millions of tags just for stress testing. Testing RFID middleware raises the following problems. First, choosing and purchasing physical devices takes a lot of money and time. Second, the process consumes a lot of time and human resources. Third, manual testing is limited with respect to system load, performance and reliability tests, and the generation of huge test data sets [4]. In short, the problems of testing RFID middleware are mainly caused by deploying real devices and setting up various test environments.

Constructing virtual environments is a good way to test RFID middleware. Our approach is to virtualize the real business environment, which reduces the cost of testing the RFID middleware. In this paper, we design two kinds of virtual devices, virtual readers and virtual tags, for constructing test environments, and we implement a test data generation tool for testing RFID middleware. The basic idea is to virtualize real environments using virtual devices and, by analyzing those environments, to define several parameters that represent them. Based on these parameters, the test data generation tool automatically generates tag identification data for a test environment through various parameter configurations. We also define several measurement metrics for verifying the correctness of the generated data set.
The metrics mainly cover four factors: group, redundancy, noise, and route. The experimental results show that our testing tool can automatically generate various test data for RFID middleware testing without deploying any real devices.

The remainder of this paper is organized as follows. Section 2 discusses previous work on testing RFID middleware and generating test data. Section 3 introduces our test tool and defines parameters for representing RFID environments. In Section 4, we present the design of the test data set. Section 5 defines measurement metrics for verifying the generated test data set and presents experimental results. We conclude our work in Section 6.

II. RELATED WORKS
In this section, we discuss related work on RFID middleware evaluation and RFID test data generation. Much research has been done in this field, and work is ongoing. Various factors of RFID middleware evaluation are discussed in [5]. Based on ISO/IEC 9126, that work defines testing factors that can be used to evaluate the quality of RFID middleware: functionality, efficiency, reliability, usability, and portability. The analytic hierarchy process is used to determine the importance of each factor. However, the paper does not discuss how to evaluate these factors, nor does it consider the test environment or the test data.

Performance testing of RFID middleware is studied in [6]. That work defines test parameters for performance testing and implements a tool for testing an RFID middleware that is implemented according to Application Level Events (ALE) [3]. Test data can be generated automatically after the parameters are set. Although this work can test the performance of RFID middleware, it cannot test the other factors mentioned in [5]. An RFID middleware evaluation toolkit based on a virtual reader emulator is introduced in [7]. That work focuses on building virtual test environments using the virtual reader emulator, but it does not consider how to set the parameters for generating test data or how to evaluate the generated test data.

There is a lot of research on test data generation [8], [9]. These works focus on how to automatically generate data for testing software, but the generated data is used only for testing software functionality; other quality factors are not considered.

To generate test data, we need to know the features of the software's input data. An RFID middleware can be treated as a DSMS (Data Stream Management System) [10]. RFID tag data shares the common characteristics of real-time streaming data, but it also has its own data features [11]. Virtual data simulation in the DSMS field is discussed in [12], where the concept of Semantic Valid Data is used to generate streaming tollgate data that has semantic meaning. However, that approach is limited: it cannot generate a variety of semantically meaningful test data, only data that represents the specific case. In this paper, we also define Semantic Valid Data for generating test data that can represent test environments. In addition, we define Semantic Invalid Data, which does not carry any semantic meaning.

III. TEST DATA GENERATION TOOL
In this section, we discuss the concepts behind the design of the testing tool and the parameter settings for generating various test data.
A. Virtualization of RFID environment

When testing RFID middleware, the first requirement is to construct a test environment that follows the enterprise's business model. To construct an RFID environment, RFID readers must be deployed at business locations and RFID tags must be attached to products. One deployment strategy corresponds to one environment; if the test environment changes, the deployment of devices must change as well. Because RFID middleware testing requires various test environments, device deployment makes testing complex. To reduce the burden of deploying RFID devices, and thereby address the problems of testing RFID middleware, we design virtual devices that simulate real RFID readers and tags. Various virtual test environments can be constructed easily, with little effort, by deploying the virtual devices in different ways. Figure 1 shows the use of a virtual environment in place of a real environment for testing RFID middleware. In this way, RFID middleware can be tested in the virtual environment without building a real test environment, so the test cost can be reduced significantly.

Figure 1. RFID environment virtualization (panels: Real Environment, with Readers connected to the RFID Middleware; Virtual Environment, with Virtual Readers, a Virtual Controller, and Virtual Tags connected to the RFID Middleware)
As in a real environment, virtual readers and virtual tags are the most important components of the virtual environment. Various environments can be constructed by deploying virtual readers and virtual tags in different ways. The test data generated by these virtual devices should represent the characteristics of real environments, so the design of virtual readers and virtual tags is central to the development of the testing tool. The virtual devices should have the same functionality as the real physical devices. By analyzing real readers and tags, we obtain the following requirements for designing the virtual devices:

• Virtual Reader: it plays the same role as a real physical RFID reader. It is responsible for reading tag data when tags move into its interrogation zone, generating tag events for the middleware, executing tag operations received from the middleware, and reporting the results.
• Virtual Tag: it plays the same role as a real physical RFID tag. It is responsible for generating a unique identity code (EPC code), as well as user memory data if required, and for executing and responding to the reader's operations. A virtual tag has memory to store all of the generated data.
• Virtual Controller: it controls the movements of tags and readers according to the test environment. It is also responsible for sending the reader's operations to tags and receiving the responses from tags.
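The three roles listed above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class layout, the method names (`respond`, `tag_enters`, `tag_leaves`, `move`, `relay`), the operation strings, and the in-memory `events` list that stands in for the middleware connection are all assumptions made for illustration. Only the event names `evObserved` and `evPurged` are taken from the paper's sample output.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualTag:
    """Holds an identity code and optional user memory (hypothetical layout)."""
    epc: str
    user_memory: dict = field(default_factory=dict)

    def respond(self, operation: str, key: str = "", value: str = "") -> str:
        # Execute a reader operation and return the response.
        if operation == "read_epc":
            return self.epc
        if operation == "write":
            self.user_memory[key] = value
            return "ok"
        if operation == "read":
            return self.user_memory.get(key, "")
        return "unsupported"


@dataclass
class VirtualReader:
    """Reports tags entering and leaving its interrogation zone."""
    name: str
    in_zone: set = field(default_factory=set)
    events: list = field(default_factory=list)  # stands in for the middleware link

    def tag_enters(self, tag: VirtualTag) -> None:
        self.in_zone.add(tag.epc)
        self.events.append((self.name, tag.epc, "evObserved"))

    def tag_leaves(self, tag: VirtualTag) -> None:
        self.in_zone.discard(tag.epc)
        self.events.append((self.name, tag.epc, "evPurged"))


class VirtualController:
    """Moves tags between readers and relays reader operations to tags."""

    def __init__(self, readers):
        self.readers = {r.name: r for r in readers}
        self.location = {}  # epc -> name of the reader whose zone holds the tag

    def move(self, tag: VirtualTag, reader_name: str) -> None:
        previous = self.location.get(tag.epc)
        if previous is not None:
            self.readers[previous].tag_leaves(tag)
        self.readers[reader_name].tag_enters(tag)
        self.location[tag.epc] = reader_name

    def relay(self, tag: VirtualTag, operation: str, **kwargs) -> str:
        return tag.respond(operation, **kwargs)


tag = VirtualTag(epc="x3500003200000640000000045")
r1, r3 = VirtualReader("R1"), VirtualReader("R3")
controller = VirtualController([r1, r3])
controller.move(tag, "R1")
controller.move(tag, "R3")
print(r1.events)
print(r3.events)
```

Keeping movement in the controller rather than in the tag mirrors the division of responsibilities in the list: the tag only answers operations, while the controller decides where it is.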
Using these virtual devices, we implement a test data generation tool for testing RFID middleware. With our testing tool, a virtual environment can be constructed easily and RFID middleware can be tested automatically. Virtual devices are deployed and configured through a flexible parameter configuration interface; the tool then automatically generates test data within the test environment to test the RFID middleware. Figure 2 shows the block diagram of our testing tool. To test RFID middleware, the user first virtualizes the real environment to obtain the parameter settings and deploys the virtual devices, and then runs the testing tool. Test data is generated automatically and sent continuously to the RFID middleware through the communication interface.

Figure 2. Block diagram of testing tool (blocks: Virtualization, Configuration, Virtual Tag Generator, Virtual Controller, Virtual Readers 1 to N emitting events, Data Generator, Communication Interface)

Test data generated by our testing tool represents the virtual test environment and thereby the real RFID environment. As different real environments generate different data, we must make sure that different test data can also be generated from various virtual test environments. This calls for a flexible parameter definition. To define parameters for generating test data, the first step is to analyze the data features of RFID systems. As explained earlier, RFID middleware can be considered a special kind of DSMS, so RFID data shares some features with streaming data. However, an RFID system also has its own data features beyond those of a general DSMS. For example, when a pallet containing several cases moves to an RFID reader, all of the tags attached to the cases are read at almost the same time, so these data can be grouped. The testing tool should be able to generate test data that has these unique data features, and it should be flexibly configurable to generate different test data.

B. Parameters for emulating real environment

To generate various test data easily, we need to define flexible parameters for test data generation. Here we use the example of a conveyor belt to analyze RFID data features and explain how to define the parameters. Figure 3 shows an RFID conveyor belt. In the conveyor belt system, RFID readers are deployed at fixed places along the belt, the belt moves at a fixed velocity, and cases that contain several products are placed on the belt and move with it. RFID tags attached to the products are read successively by the readers as the cases move along the belt. To test middleware using the conveyor belt, the test data generated by the testing tool should be the same as the real data generated in the real environment. According to the above description, to build a conveyor belt test environment we need to know the following information: the number of readers deployed along the belt and the sequence of the readers, which reflects the movement of tags; the number of tags contained in one case and the tag data type; and the velocity of the conveyor belt and the rate at which tags appear at the first reader on the belt. Table I summarizes the parameter definitions for constructing the conveyor belt environment and generating test data.

Figure 3. An example of conveyor belt
After these parameters are set, test data with the same semantic meaning as real data can be generated continuously for testing RFID middleware. This example reveals some unique RFID data features, such as tag groups and tag moving paths. The parameters defined in Table I are suitable only for a specific conveyor belt system, because they are defined according to the characteristics of that system; other RFID environments cannot be constructed with these parameters. To define parameters for constructing a variety of test environments, we need to summarize the full set of RFID data features.

TABLE I. PARAMETERS FOR CONVEYOR BELT

| Parameter | Description | Domain |
|---|---|---|
| NR | The number of readers for testing | [1-100] |
| TR | The travel time between two readers | [1-500s] |
| NG | The maximum number of tags in a group | [1-100] |
| TG | The rate of tag generation | [1-100/s] |
| RP | Reader name list in a route | [1-NR] |
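As a rough illustration of how the Table I parameters could be held and checked against their domains, the following Python sketch validates a configuration and derives when a tag reaches each reader on its route. The class name `ConveyorBeltParams`, the helper `arrival_times`, the field names, and the reading of RP as a sequence of reader indices between 1 and NR are assumptions made for this example; the paper does not describe the tool's configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConveyorBeltParams:
    """Table I parameters; field names are illustrative, not from the tool."""
    nr: int      # NR: number of readers, 1-100
    tr: float    # TR: travel time between two readers in seconds, 1-500
    ng: int      # NG: maximum number of tags in a group, 1-100
    tg: float    # TG: tag generation rate per second, 1-100
    rp: tuple    # RP: reader indices (1..NR) in route order (assumed encoding)

    def __post_init__(self):
        checks = [
            ("NR", self.nr, 1, 100),
            ("TR", self.tr, 1, 500),
            ("NG", self.ng, 1, 100),
            ("TG", self.tg, 1, 100),
        ]
        for label, value, low, high in checks:
            if not low <= value <= high:
                raise ValueError(f"{label}={value} is outside [{low}, {high}]")
        if not self.rp or any(not 1 <= r <= self.nr for r in self.rp):
            raise ValueError("RP must list reader indices between 1 and NR")


def arrival_times(params: ConveyorBeltParams, init_time: float) -> list:
    """Return (reader index, arrival time) pairs for one tag on the route.

    Assumes the tag reaches the first reader at init_time and each later
    reader TR seconds after the previous one.
    """
    return [(reader, init_time + hop * params.tr)
            for hop, reader in enumerate(params.rp)]


params = ConveyorBeltParams(nr=3, tr=2.0, ng=10, tg=5.0, rp=(1, 2, 3))
print(arrival_times(params, init_time=0.0))  # [(1, 0.0), (2, 2.0), (3, 4.0)]
```

Rejecting out-of-domain values at construction time keeps an invalid environment from silently producing a misleading data set.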
RFID data combines the characteristics of real-time streaming data with features of its own. Streaming data is typically continuous, massive, and unbounded. Based on the analysis above, on [10][11], and on the Application Level Events specification version 1.1 (ALE 1.1) [3], an international standard for RFID middleware, we identify the following special features of RFID data:

• Redundant data: Because RF communication is volatile, a physical RFID reader reads tag data in read cycles, that is, repeated reading periods, to increase the rate of correct reads. The cycle time is very short, on the order of milliseconds. So if a tag stays in the interrogation zone of the reader for some time, it may be read several times. In this case, the RFID reader generates several identical tag events, which produces redundant data.
• Grouped data: Grouped data is a special data feature of RFID environments. It means that a number of tags appear in the interrogation area of a reader at the same time and move away together. For example, in the conveyor belt system above, the tags contained in one case move into the reader range together, are read together, and move away together.
• Moving route: The moving path describes the movement of RFID tags in an RFID environment. In some business applications, products move along fixed paths. RFID readers are deployed at meaningful places, such as entrance and exit doors, to indicate steps of the business process. If several readers are deployed on a moving path, the RFID tags that move along the path should be read by the readers in a predefined sequence. If some tags are not read by a reader on the moving path, something has gone wrong with these tags. The moving path therefore needs to be considered when generating test data.
• Noisy data: RFID readers may sometimes generate wrong tag data because of the instability of RF communication and obstacles between tags and readers. The generated test data should therefore contain some wrong tag data.
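Two of the features above, redundant reads and the moving route, can be made concrete with a small Python sketch that inspects a list of observations. The helper names `redundant_reads` and `missed_readers`, the `(reader, tag)` tuple layout, and the sample data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def redundant_reads(events):
    """Count extra reads per (reader, tag) pair beyond the first one."""
    counts = Counter((reader, tag) for reader, tag in events)
    return {pair: n - 1 for pair, n in counts.items() if n > 1}


def missed_readers(route, events, tag):
    """Return readers on the route that never reported the given tag."""
    seen = {reader for reader, t in events if t == tag}
    return [reader for reader in route if reader not in seen]


# Hypothetical (reader, tag) observations for a route R1 -> R2 -> R3.
events = [
    ("R1", "tagA"), ("R1", "tagA"), ("R1", "tagB"),
    ("R2", "tagA"),
    ("R3", "tagA"), ("R3", "tagB"),
]
route = ["R1", "R2", "R3"]

print(redundant_reads(events))                # {('R1', 'tagA'): 1}
print(missed_readers(route, events, "tagB"))  # ['R2']
```

Here tagA is read twice by R1, which is the redundancy case, and tagB is never seen by R2, which is the situation the moving-route feature is meant to expose.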
Beyond these special data features, other features, such as continuity and unboundedness, follow from a comparison with DSMS data features. Different data features require different parameters. Based on the features of RFID data and the parameters defined in Table I, we define some additional parameters for generating RFID test data.

TABLE II. ADDITIONAL PARAMETERS FOR DATA GENERATION

| Parameter | Description | Domain |
|---|---|---|
| Rr | Ratio of redundant data | [0-1] |
| Gr | Ratio of grouped data | [0-1] |
| Nr | Noisy ratio of each reader | [0-1] |

Other parameters for configuring the virtual devices, such as reader names, IP addresses, and the encoding scheme of EPC codes, are not shown in the table. A variety of test environments can be constructed through this flexible parameter setting. After the parameters are set, the testing tool can continuously generate test data and send it to the RFID middleware. In the following section, we discuss test data generation.

IV. DESIGN OF TEST DATA SET

According to the test objectives, the test data can be classified into two categories: Semantic Invalid Data (SID) and Semantic Valid Data (SVD). SID is a data set that has no semantic meaning, whereas SVD is a data set that has semantic meaning. SID is mainly used for performance tests. SVD, having semantic meaning, can represent user-defined test environments such as a conveyor belt or a product line. In this paper, we mainly focus on generating SVD.

The purpose of SVD is to generate data sets that represent the business models of real environments and to evaluate RFID middleware under user-defined test environments. Because data representations differ from the tag level to the reader level, the data structures must be designed carefully. We define four data tables for generating an SVD set: a tag table containing tag information, a reader table containing reader information, a route table containing information on tag movement, and an event table, which is the output of our testing tool. Figure 4 shows the ER diagram of the four data structures. The tables are explained in detail below:

• Reader_Table: it contains the configuration information of each reader. It has two attributes, Reader_Name and Reading_Time. Reader_Name identifies the reader, and Reading_Time is the time that the reader spends to finish one operation.
• Route_Table: it contains the movements of tags. It has three main attributes, Route_ID, Reader_Name, and Travel_Time. Route_ID identifies a route. Travel_Time is the moving time between two neighboring readers in the route.
• Tag_Table: it contains the configuration information of tags. It has four attributes, Tag_ID, Group_ID, Route_ID, and Init_Time. Tag_ID identifies a tag. Init_Time is the time at which the tag is generated.
• Event_Table: it is the output of our testing tool and is used for testing RFID middleware. It has five attributes, Tag_ID, Group_ID, Reader_Name, Time, and Event_Type. Time is the time at which the event is generated. Event_Type indicates whether the tag is observed or purged.

Figure 4. ER diagram of four data tables
The first three tables temporarily store data for generating the event table. After the user sets the parameter values according to the test environment, the testing tool first generates the reader table and the route table. Next, it generates the tag table by referring to the route table. The virtual controller then controls the movement of each tag according to its route. When a tag moves into a reader zone, the reader generates a tag event and sends it to the middleware; it does the same when the tag leaves the reader zone. In our implementation, we use random models to generate the data. Figure 5 shows an example of generated test data.

| Tag_ID | Reader_Name | Group_ID | Time | Event type |
|---|---|---|---|---|
| x3500003200000640000000045 | R1 | 1 | 2010.04.13 20:22:04 | evObserved |
| x3500003200000640000000046 | R1 | 1 | 2010.04.13 20:22:04 | evObserved |
| x3500003200000640000000047 | R1 | | 2010.04.13 20:22:06 | evObserved |
| x3500003200000640000000048 | R1 | | 2010.04.13 20:22:06 | evObserved |
| x3500003200000640000000045 | R1 | 1 | 2010.04.13 20:22:06 | evPurged |
| x3500003200000640000000046 | R1 | 1 | 2010.04.13 20:22:06 | evPurged |
| x3500003200000640000000045 | R3 | 1 | 2010.04.13 20:22:08 | evObserved |

Figure 5. An example of generated tag events
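The generation procedure described in this section (reader and route tables first, then the tag table, then events as tags move along their routes) can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the tool's code: the dataclass names, the helpers `build_tables` and `generate_events`, the synthetic tag IDs, the single route, and the use of Reading_Time as the interval between the observed and purged events are all hypothetical choices. Times are plain seconds rather than timestamps.

```python
import random
from dataclasses import dataclass


@dataclass
class Reader:            # one Reader_Table row
    reader_name: str
    reading_time: float  # assumed here to be the time a tag stays in the zone


@dataclass
class RouteStep:         # one Route_Table row
    route_id: int
    reader_name: str
    travel_time: float   # seconds from the previous reader on the route


@dataclass
class Tag:               # one Tag_Table row
    tag_id: str
    group_id: int
    route_id: int
    init_time: float


@dataclass
class Event:             # one Event_Table row
    tag_id: str
    group_id: int
    reader_name: str
    time: float
    event_type: str      # "evObserved" or "evPurged"


def build_tables(num_readers, num_groups, group_size, travel_time, rng):
    """Create the reader, route, and tag tables for a single route."""
    readers = [Reader(f"R{i + 1}", reading_time=2.0) for i in range(num_readers)]
    route = [RouteStep(1, r.reader_name, travel_time) for r in readers]
    tags = []
    for group in range(1, num_groups + 1):
        init_time = rng.uniform(0, 10)  # random start shared by the whole group
        for member in range(group_size):
            tags.append(Tag(f"tag-{group:03d}-{member:03d}", group, 1, init_time))
    return readers, route, tags


def generate_events(readers, route, tags):
    """Walk every tag along its route and emit observed/purged events."""
    reading_time = {r.reader_name: r.reading_time for r in readers}
    events = []
    for tag in tags:
        clock = tag.init_time
        steps = [s for s in route if s.route_id == tag.route_id]
        for hop, step in enumerate(steps):
            if hop > 0:
                clock += step.travel_time
            leave = clock + reading_time[step.reader_name]
            events.append(Event(tag.tag_id, tag.group_id, step.reader_name,
                                clock, "evObserved"))
            events.append(Event(tag.tag_id, tag.group_id, step.reader_name,
                                leave, "evPurged"))
    return sorted(events, key=lambda e: e.time)


rng = random.Random(0)
readers, route, tags = build_tables(3, 2, 2, travel_time=5.0, rng=rng)
events = generate_events(readers, route, tags)
for event in events[:4]:
    print(event)
```

Because all tags in a group share one start time, they are observed and purged together at each reader, which is the grouped behavior visible in the first rows of the sample output.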
By changing the parameters of Table I and Table II, we can build various business environments. Test data generation based on these parameters can therefore cover almost all of the RFID data that could be generated in real environments.

V. TEST DATA MEASUREMENT METRICS AND EXPERIMENTS
To verify the correctness of the data set, we first define several metrics for measuring it. Next, we perform experiments on the data set using these metrics.

A. Measurement Metrics

The test data generation tool continuously generates the test data set according to the input parameters and sends it to the RFID middleware for testing. However, we cannot be sure in advance that the test data satisfies the test requirements. We therefore need to verify the generated test data and thereby the defined parameters. As the data generator can generate two kinds of data sets, we need measurements for both. As explained earlier, SID is used only for performance tests, so for SID we only need to verify the throughput of the test data generator. Test data is sent continuously to the middleware as a stream, and we need to control its throughput to test the performance of the RFID middleware. To measure the throughput, we count the tag events generated in a certain time; this is easy to do by collecting the generated data over that time. The throughput defined by the parameters can be computed as in (1).

$$T = \frac{T_G \cdot N_R}{t} \tag{1}$$
where $T_G$ is the rate of tag generation, $N_R$ is the number of readers, and $t$ is the defined unit time.

Section 3 discussed some special features of RFID data. Because the generated SVD has semantic meaning and represents these RFID data features, we need to verify the features to make sure that the data can be used to test RFID middleware. As the test data is generated continuously, we collect it for a period of time to verify it. Because of the inherent behavior of RFID readers, each reader may generate redundant data. We use the redundancy ratio to represent this feature. To calculate it, we first define the redundancy ratio of a single reader, which is calculated as in (2).

$$R_i = \frac{\sum (N_{Ti} - 1)}{N_{Ri}} \tag{2}$$
where $N_{Ti}$ is the number of identical events of a tag generated by reader $i$, and $N_{Ri}$ is the total number of events generated by reader $i$. The redundancy ratio of the test data is then calculated as in (3).

$$\mathit{redundancyRatio} = \frac{\sum_{i=1}^{N_R} R_i}{N_R} \tag{3}$$
where $N_R$ is the number of readers used for testing and $R_i$ is calculated as in (2). To verify the group feature, we use the group ratio, which represents the degree to which the test data is grouped. Because each group in the generated test data has an ID, the group ratio is easy to calculate as in (4).

$$\mathit{groupRatio} = \frac{\sum_{i=1}^{n} N_{Gi}}{N_D} \tag{4}$$
where $N_{Gi}$ is the number of generated data in group $i$ and $N_D$ is the total number of generated data.

In an RFID environment, a reader may sometimes generate wrong tag events because of the instability of RF communication and obstacles between tags and readers, so the generated test data should contain some noisy data. To verify the noisy data, a metric for noisy data is needed. A route represents tag movement in the real environment, so a generated route should carry this semantic meaning rather than being just a list of reader names. We assume that a reader should not appear more than once in a valid route, and each generated route is checked to make sure that it is valid. We omit these two metrics because of the paper length limit.

B. Experimental Results

Using the measurement metrics defined above, we can verify the correctness of the generated test data and thereby the parameter definitions and the testing tool. Once the testing tool is running, it continuously generates tag data and sends it to the RFID middleware. Because streaming data is unbounded, it is not easy to verify all of the generated test data, so we collect a certain number of test data and verify that sample instead.

First, we verify the group feature. We collect 5 data sets with group ratios of 10%, 20%, 40%, 70%, and 100%, respectively; the group size ranges from 1 to 100. Each data set contains 10,000 generated tag events. Figure 6 compares the computed group ratio with the expected group ratio for the 5 generated test data sets. Figure 7 shows the group reliability of the test data sets. The results show that our testing tool is able to generate grouped data with the expected group ratio.

Figure 6. Comparison of group ratio (x-axis: test data sets at 10%, 20%, 40%, 70%, 100%; y-axis: group ratio, 0% to 120%; series: defined ratio, generated ratio)

Figure 7. Reliability of group ratio (x-axis: test data sets at 10%, 20%, 40%, 70%, 100%; y-axis: reliability, 0% to 100%)

Second, we verify the redundancy feature of the generated test data. Here we also collect 5 data sets from 5 readers, with redundancy ratios of 10%, 20%, 30%, 40%, and 50%, respectively. Each data set again contains 10,000 generated tag events. Figure 8 shows the comparison results for the 5 data sets. The results show that our testing tool can also generate redundant data with the expected ratio. We also omit the experiments on noise and route because of the paper length limit.

Figure 8. Comparison of redundancy ratio (x-axis: test data sets at 10%, 20%, 30%, 40%, 50%; y-axis: redundancy ratio, 0% to 60%; series: defined ratio, generated ratio)

VI. CONCLUSION

Testing RFID middleware is labor-intensive and expensive because there are various RFID environments, which are very difficult to construct. To facilitate RFID middleware testing, we propose and implement a test data generation tool that can simulate a real RFID environment and automatically generate test data for testing RFID middleware. To do this, we first define several parameters for representing real environments and design the test data structures for data generation. Then, to verify the correctness of the generated test data, we define several measurement metrics and perform several experiments. The experimental results show that the generated test sets deviate only slightly from the expected data. This deviation is small enough to ignore, since we verify a portion of the generated test data rather than the whole. Overall, our testing tool can generate test data that possesses RFID data features for testing RFID middleware.
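As a supplementary illustration of the measurement metrics in Section V-A, the following Python sketch computes the throughput (1), the per-reader and overall redundancy ratios (2) and (3), and the group ratio (4) on a small hand-made sample. It is a sketch under assumptions rather than the authors' verification code: the function names, the `(reader, tag_id, group_id)` tuple layout, and the sample data are hypothetical, and the sum in (2) is assumed to run over the distinct tags read by reader i.

```python
from collections import Counter


def throughput(tg, nr, t):
    """Equation (1): T = TG * NR / t."""
    return tg * nr / t


def reader_redundancy(events, reader):
    """Equation (2) for one reader: sum over tags of (N_Ti - 1), over N_Ri."""
    per_tag = Counter(tag for r, tag, _group in events if r == reader)
    total = sum(per_tag.values())
    if total == 0:
        return 0.0
    return sum(n - 1 for n in per_tag.values()) / total


def redundancy_ratio(events, readers):
    """Equation (3): mean of the per-reader redundancy ratios."""
    return sum(reader_redundancy(events, r) for r in readers) / len(readers)


def group_ratio(events):
    """Equation (4): events that belong to some group, over all events."""
    per_group = Counter(group for _r, _tag, group in events if group is not None)
    return sum(per_group.values()) / len(events)


# Hypothetical sample: (reader, tag_id, group_id); None means ungrouped.
sample = [
    ("R1", "t1", 1), ("R1", "t1", 1), ("R1", "t2", 1), ("R1", "t3", None),
    ("R2", "t1", 1), ("R2", "t2", 1), ("R2", "t2", 1), ("R2", "t2", 1),
]
print(throughput(tg=10, nr=5, t=1))            # 50.0
print(redundancy_ratio(sample, ["R1", "R2"]))  # 0.375
print(group_ratio(sample))                     # 0.875
```

In the sample, R1 contributes one redundant read out of four events and R2 contributes two out of four, so the per-reader ratios are 0.25 and 0.5 and their mean is 0.375; seven of the eight events carry a group ID, giving a group ratio of 0.875.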
REFERENCES

[1] B. Glover and H. Bhatt, RFID Essentials. O'Reilly Media, Inc., 2006.
[2] RFID Journal. http://www.rfidjournal.com/
[3] Hanmi IT. http://www.hanmiit.net/
[4] E. Dustin, J. Rashka, and J. Paul, Automated Software Testing: Introduction, Management, and Performance. Addison-Wesley Professional, 1999.
[5] G. Oh, D. Kim, S. Kim, and S. Rhew, "A Quality Evaluation Technique of RFID Middleware in Ubiquitous Computing," 2006 International Conference on Hybrid Information Technology (ICHIT'06), Vol. 2, pp. 730-735, 2006.
[6] J. Kim and N. Kim, "Performance Test Tool for RFID Middleware: Parameters, Design, Implementation, and Features," Advanced Communication Technology, 2006. ICACT 2006. The International Conference, Vol. 1, pp. 149-152, 2006.
[7] C. Park, W. Ryu, and B. Hong, "RFID Middleware Evaluation Toolkit Based on a Virtual Reader Emulator," in 1st International Conference on Emerging Databases, pp. 154-157, 2009.
[8] C. Michael and G. McGraw, "Automated Software Test Data Generation for Complex Programs," in 13th IEEE International Conference on Automated Software Engineering, pp. 136-146, 1998.
[9] J. Edvardsson, "A Survey on Automatic Test Data Generation," in Proceedings of the Second Conference on Computer Science and Engineering in Linköping, pp. 21-28, 1999.
[10] F. Wang and P. Liu, "Temporal Management of RFID Data," Proceeding of the VLDB05, pp. 1128-1139, 2005.
[11] S. Chawathe, V. Krishnamurthy, S. Ramachandran, and S. Sarma, "Managing RFID Data," Proceeding of the international conference on VLDB, pp. 1189-1195, 2004.
[12] A. Arasu, M. Cherniack, E. Galvez, D. Maier, A. S. Maskey, E. Ryvkina, M. Stonebraker, and R. Tibbetts, "Linear Road: A Stream Data Management Benchmark," Proceedings of the 30th VLDB Conference, pp. 480-491, 2004.