VTarget: An Improved Software Target Emulator for SANs

Hongcan Zhang, Jiwu Shu, Wei Xue, and Dingxing Wang

Tsinghua University, Beijing, China
[email protected]

Abstract. With the increasing scale and complexity of storage, storage management for heterogeneous environments is becoming more and more important. This paper introduces VTarget, an improved software target emulator integrated with storage virtualization management, and explains some of its key technologies in detail. The VTarget system can manage various heterogeneous storage resources and provides a single virtualization interface for storage management. It is implemented in the storage network layer of the SAN and supports heterogeneous operating systems. The VTarget system also provides an access control mechanism to enhance device security and adopts multiple-metadata-copy technology to improve reliability. We have implemented a prototype of the VTarget system, and test results show that storage virtualization management reduces I/O bandwidth by less than 3.6% and increases latency by less than 8%, which is only a very slight effect on storage performance for mass heterogeneous storage system management.

1 Introduction

The prevalence of e-business and multimedia applications has caused the capacity of data storage to increase rapidly, and storage systems are expanding accordingly. As a technology featuring high scalability, high performance and long-distance connectivity, Storage Area Networks (SANs) currently play an important role in building large storage systems. In the construction of SAN systems, a software target emulator has a better performance-cost ratio than an all-hardware target: the software target emulator can be accessed as a storage device by the application servers in the SAN, and it is considerably cheaper and more flexible. Recently some Fibre Channel target emulators and iSCSI target emulators have appeared, among which the SCSI Target Emulator developed by the University of New Hampshire [1] is widely used. Adapted with a different Front-End Target Driver (FETD), the SCSI Target Emulator can present the storage attached to it to application servers as different kinds of storage devices, such as fibre devices, iSCSI devices, etc. But it can only present the devices directly attached to it, and it lacks effective management of storage resources. The increasing requirement for storage capacity also makes another problem ever more urgent: how to manage storage resources effectively. Storage virtualization management technology is applied to solve just this problem.


It can manage various storage systems and provides a uniform storage resource view for users. It also provides functions such as dynamic storage resource configuration, online backup, and remote mirroring. By striping one logical device across multiple physical disks, it achieves higher I/O performance. Sistina's Logical Volume Manager (LVM [2]) is the most widely used storage virtualization technology, and it has become part of the Linux kernel. The Enterprise Volume Management System (EVMS [3]) provides virtualized storage management for enterprises, but it can only be adapted to a certain kind of operating system (OS) and runs only in a single-host environment. The SANtopia Volume Manager [4] is a storage virtualization system for the SAN environment, but it is dedicated to a certain file system and is not compatible with multiple operating systems. This paper introduces the VTarget system, a software target emulator integrated with storage virtualization management. It provides a uniform storage view to heterogeneous platforms and efficient management of storage resources, thus optimizing storage resource utilization. The VTarget system can also control access to virtual devices, and thus provides a certain degree of storage security.

2 Architectures

2.1 Hardware Architecture for Building an FC-SAN

As a software target emulator, VTarget can be connected directly to a fibre network or Ethernet as a network storage device, as shown in Figure 1. In the storage network, the Fibre Channel HBAs on the front-end servers work in initiator mode, while those on VTarget machines work in target mode; one or more FC switches connect them. According to the requests from the front-end application servers, the VTarget system creates Virtual Devices (VDs), which can be discovered and accessed by the front-end servers regardless of which OS they run (Linux, Solaris, Windows, etc.). Providing a uniform storage resource interface for heterogeneous platforms is an outstanding virtue of VTarget.

Fig. 1. Hardware architecture of FC-SAN using VTarget

2.2 Software Architecture

As shown in Figure 2, the VTarget system is made up of two main modules: the SCSI Command Execute (SCE) Module and the Virtualization Module. The SCE Module is responsible for handling the IO requests from the FETD. These IO requests are called virtual IO requests, as they are aimed at virtual devices; they cannot be executed directly until storage address remapping has been completed. The Address Mapping Sub-Module is responsible for transforming one virtual IO request into one or more physical IO requests, which can be executed on physical devices. The Virtualization Module is responsible for storage virtualization and access control. Storage virtualization includes physical storage device management, virtual storage device management and metadata management; the Physical Device Manager, Virtual Device Manager and Metadata Manager are designed for these functions respectively. The SCE Module and the Virtualization Module are connected by the virtual device's information structure, which will be introduced in Section 3.1.

Fig. 2. Software architecture of VTarget

3 Key Technology

3.1 Address Mapping

When the SCE receives an IO request (a SCSI command) from the FETD, the type of the request is tested first. If it is an inquiry-type request, such as INQUIRY or READ CAPACITY, the SCE returns the virtualization information of the virtual device. If the request is a read/write command, the SCE must first obtain the relevant physical addresses through the address mapping module. One request from the FETD may need to read or write multiple physical data blocks, and the VTarget system transforms such a request into several physical I/O requests. Figure 3 shows the data structure of the virtual device.
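As a concrete illustration of this dispatch, the following C sketch separates the two paths. It is a minimal sketch under our own naming assumptions: sce_dispatch, reply_with_virtual_info and map_and_issue are hypothetical stand-ins, not the actual VTarget functions; only the SCSI opcode values are standard.

```c
#include <stdint.h>

/* SCSI opcodes for the commands discussed above (values from SPC/SBC). */
#define OP_INQUIRY        0x12
#define OP_READ_CAPACITY  0x25
#define OP_READ_10        0x28
#define OP_WRITE_10       0x2A

struct scsi_request {
    uint8_t opcode;   /* first byte of the CDB */
    /* ... LBA, transfer length, buffers, etc. ... */
};

/* Hypothetical stubs standing in for the two paths described above. */
static int reply_with_virtual_info(struct scsi_request *r) { (void)r; return 0; }
static int map_and_issue(struct scsi_request *r)           { (void)r; return 0; }

/* Sketch of the SCE entry point: inquiry-type commands are answered
 * from the virtual device's metadata; read/write commands first pass
 * through the address mapping module (Section 3.1). */
int sce_dispatch(struct scsi_request *req)
{
    switch (req->opcode) {
    case OP_INQUIRY:
    case OP_READ_CAPACITY:
        return reply_with_virtual_info(req);
    case OP_READ_10:
    case OP_WRITE_10:
        return map_and_issue(req);
    default:
        return -1; /* command not handled in this sketch */
    }
}
```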

Fig. 3. Data structure of virtual device

As shown in Figure 3, a Physical Segment (PS) is one continuous storage space on a specified physical device. The address mapping module translates a logical address to a physical operation address through the data structure of the virtual device: it finds the corresponding VD by looking up the lun-VD table, determines the first PS according to the logical offset, and uses the length to determine the number of physical operations and the data size. The I/O mapping operation is carried out for every data I/O request, and the VTarget system can create virtual devices or expand a virtual device's size dynamically by changing the mapping information. One virtual request from the FETD may be converted into several physical I/O requests by the address mapping module, but the transferred data is stored in one common storage pool shared by the FETD and the VTarget system, which ensures that the virtualization system never copies data from memory to memory; this improves the performance of the storage system and decreases CPU utilization. After address mapping, the physical I/O requests are sent to the command manager for execution, in a process very similar to command execution in the SCSI Target Emulator [1].
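The table-based mapping just described can be sketched in C. The structures below loosely follow Figure 3, and all identifiers are illustrative assumptions rather than the actual VTarget definitions; the loop splits one virtual I/O into one physical I/O per Physical Segment it touches.

```c
#include <stdint.h>
#include <stdio.h>

/* One Physical Segment: a contiguous region on a physical device
 * (see Fig. 3). Field names are illustrative. */
struct phys_segment {
    uint64_t pd_id;   /* physical device identifier */
    uint64_t offset;  /* starting block on that device */
    uint64_t length;  /* segment length in blocks */
    struct phys_segment *next;
};

/* A virtual device is essentially a list of physical segments. */
struct virtual_device {
    struct phys_segment *space_list;
};

/* Split one virtual I/O (lba, len) into per-segment physical I/Os.
 * Each emitted piece would become one physical request in VTarget. */
void map_virtual_io(struct virtual_device *vd, uint64_t lba, uint64_t len)
{
    struct phys_segment *ps = vd->space_list;
    uint64_t skipped = 0;  /* logical blocks covered by earlier segments */

    for (; ps && len > 0; ps = ps->next) {
        if (lba >= skipped + ps->length) {  /* request starts past this PS */
            skipped += ps->length;
            continue;
        }
        uint64_t in_seg = lba - skipped;    /* offset inside this PS */
        uint64_t chunk  = ps->length - in_seg;
        if (chunk > len)
            chunk = len;
        printf("physical I/O: dev=%llu off=%llu len=%llu\n",
               (unsigned long long)ps->pd_id,
               (unsigned long long)(ps->offset + in_seg),
               (unsigned long long)chunk);
        lba     += chunk;
        len     -= chunk;
        skipped += ps->length;
    }
}
```

A request that crosses a PS boundary thus yields several physical requests, matching the behavior described above.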

3.2 Storage Device Management

Facilitating storage management is the main goal of storage virtualization. The VTarget system eliminates the need for the front-end server to know the details of the end devices, presenting just one uniform storage pool instead. Further, the VTarget system has a set of storage device management tools, which provide great convenience for users. Specifically, the physical device manager and the virtual device manager are responsible for the management of physical devices and virtual devices respectively. The physical device manager scans all of the physical devices and records them in a storage pool. When a storage device is found, it sets up a physical device structure and identifies the device with a 64-bit Universal Unique Identifier (UUID), through which the system can find the device after it is rebooted or physically moved. The physical device manager divides the capacity of the storage device into fixed-size blocks (64 KB by default) and adds the blocks into the storage pool, which serves as the platform for managing all of the virtual devices.


By this means, the details of physical devices such as capacity, rotation speed and so on are not taken into account; thus the virtualization function is free from having to monitor each device. The virtual device manager is one of the key parts of the VTarget system. It implements key functions such as creating, deleting and expanding virtual devices, as well as data snapshot and mirroring. It controls the algorithm for converting logical addresses to physical addresses, so it can create different types of virtual devices through different address mapping algorithms; these virtual device types include linear, striped, mirror and snapshot. From Figure 3 we know that the storage space of a VD is a table of PSs. As one PS in the table represents one continuous storage space whose capacity is freely determined by the user, the virtual device manager can manage huge storage spaces. When a new PS is added to this table, the virtual device's size is expanded; likewise, the virtual device manager can easily decrease a virtual device's capacity. With this technique, the VTarget system can dynamically manage storage resources very well.
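Dynamic expansion then amounts to appending one PS to the tail of the VD's segment table. A minimal sketch, reusing the illustrative structures from the previous fragment:

```c
#include <stdint.h>
#include <stdlib.h>

struct phys_segment {            /* as in the previous sketch */
    uint64_t pd_id, offset, length;
    struct phys_segment *next;
};
struct virtual_device { struct phys_segment *space_list; };

/* Append a new PS to the tail of the VD's segment table, growing
 * the virtual device's capacity online by 'length' blocks. */
int vd_expand(struct virtual_device *vd,
              uint64_t pd_id, uint64_t offset, uint64_t length)
{
    struct phys_segment *ps = malloc(sizeof(*ps));
    if (!ps)
        return -1;
    ps->pd_id  = pd_id;
    ps->offset = offset;
    ps->length = length;
    ps->next   = NULL;

    /* Walk to the tail pointer and link the new segment in. */
    struct phys_segment **tail = &vd->space_list;
    while (*tail)
        tail = &(*tail)->next;
    *tail = ps;
    return 0;
}
```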

3.3 Access Control

Large data centers often need tens of thousands of virtual devices. Without efficient access control management of these devices, application servers would receive too much unnecessary information about the network storage, and the security of data access could not be guaranteed. For example, without access control, one virtual device could be seen by two or more application servers, which might then read from or write to the virtual device simultaneously. This would probably cause a conflict of data access and result in data loss. The access control function we implemented ensures that an application server can only read from or write to the virtual devices it is authorized for, and at the same time, each virtual device can only be seen by the servers that have the authority to operate on it. The implementation can be described as follows. The system holds an access authority table. Every entry of this table contains a virtual device, an access bitmap and a privilege bitmap. The access bitmap and privilege bitmap have the same length, and every bit represents the authority of one application server; together they hold the access control information of the corresponding device. A possible access authority table is shown in Table 1.

Table 1. Access authority information of virtual device

  VD NUMBER   ACCESS BITMAP   PRIVILEGE BITMAP
  VD 0        ......0001      ......0001
  VD 1        ......0010      ......0010
  VD 2        ......0110      ......0100
  .......
  VD n        ......0000      ......0000

Table 2. Access mode

  Access bit   Privilege bit   Access Mode
  0            0               ACCESS DENY
  1            0               READ ONLY
  1            1               READ/WRITE

Every bit in the access bitmap indicates whether a certain application server can access that virtual device: bit 0 indicates whether server 1 can access it, bit 1 indicates whether server 2 can access it, and so on. Every bit in the privilege bitmap represents the operating authority of the corresponding application server. The combination of this pair of bitmaps captures each server's access control information; the details of the access authority are explained in Table 2. Consequently, the access information of the virtual devices in Table 1 reads as follows: server 1 is permitted to read from and write to VD 0; server 2 is permitted to read from and write to VD 1; server 3 is permitted to read from and write to VD 2; server 2 is only permitted to read from VD 2; etc. Every time an application server loads the initiator-mode fibre driver, it must check VTarget's access authority table to find which virtual devices are "visible" (those whose corresponding access bit is 1), and then add the "visible" virtual devices to the server's SCSI device list. The application server then has permission to perform IO operations on these virtual devices, but it has to consult the access authority table again to check whether a read or write request is permitted on the device. This control mechanism improves the security of device access, and thereby guarantees the data security of application servers.
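The lookup described above can be sketched compactly. The 32-bit bitmap width and all identifier names below are illustrative assumptions, but the bit semantics follow Tables 1 and 2.

```c
#include <stdint.h>

enum access_mode { ACCESS_DENY, READ_ONLY, READ_WRITE };

/* One entry of the access authority table (Table 1): a pair of
 * bitmaps, where bit i describes application server i+1. The width
 * (32 servers here) is illustrative. */
struct vd_acl {
    uint32_t access_bitmap;
    uint32_t privilege_bitmap;
};

/* Combine the two bits exactly as in Table 2. */
enum access_mode check_access(const struct vd_acl *acl, int server_no)
{
    uint32_t bit = 1u << (server_no - 1);  /* server 1 -> bit 0 */

    if (!(acl->access_bitmap & bit))
        return ACCESS_DENY;                /* VD not even visible */
    if (acl->privilege_bitmap & bit)
        return READ_WRITE;
    return READ_ONLY;
}
```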

3.4 Metadata Management

Metadata management is another important component of the Virtualization Module. The metadata of the VTarget system comprises the information of physical devices, virtual devices, access control and other related items. A sound metadata management method must ensure recovery of the previous system configuration when the system is reloaded. For this reason, the VTarget system is re-entrant: its reliability is improved by keeping multiple metadata copies, so the system can be recovered even if some of the metadata copies are invalid. In addition, the system can check the metadata configuration and choose the appropriate part of the metadata for recovery, so the system has very high availability. At present, the VTarget system saves the metadata to two SCSI disks, both called metadata disks. We name one of them the primary metadata disk and the other the secondary metadata disk. Each metadata disk uses its first 20 MB of space to save metadata, and the two copies serve as backups of each other. Once the metadata is changed, the system must update both metadata disks.


There is a special tag in the metadata information which indicates whether the metadata has been written to the disk completely. Every time the metadata is updated, this tag on the primary metadata disk is first reset to zero, and then the metadata update begins. After the update is finished, the tag is set to WRITE COMPLETE. The secondary metadata disk is updated in the same way.
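The update sequence can be sketched as follows. Here disk_write, the tag values and the on-disk offsets are hypothetical stand-ins for the real 20 MB metadata layout; only the ordering (clear tag, write body, set tag, primary before secondary) is taken from the description above.

```c
#include <stdint.h>
#include <stddef.h>

#define TAG_INCOMPLETE     0u
#define TAG_WRITE_COMPLETE 1u   /* illustrative value for the tag */

/* Hypothetical stand-in for a raw write to one metadata disk. */
static int disk_write(int disk, uint64_t off, const void *buf, size_t len)
{
    (void)disk; (void)off; (void)buf; (void)len;
    return 0; /* a real implementation would issue a block-device write */
}

/* Update one metadata copy: clear the tag, write the metadata body,
 * then set the tag. A crash mid-update leaves the tag cleared, so the
 * copy is recognizably incomplete at recovery time. */
static int update_one_copy(int disk, const void *meta, size_t len)
{
    uint32_t tag = TAG_INCOMPLETE;

    if (disk_write(disk, 0, &tag, sizeof(tag)) < 0)   return -1;
    if (disk_write(disk, sizeof(tag), meta, len) < 0) return -1;
    tag = TAG_WRITE_COMPLETE;
    return disk_write(disk, 0, &tag, sizeof(tag));
}

/* Primary copy first, then secondary: at any instant at least one
 * complete copy exists on disk. */
int update_metadata(const void *meta, size_t len)
{
    if (update_one_copy(0 /* primary */, meta, len) < 0)
        return -1;
    return update_one_copy(1 /* secondary */, meta, len);
}
```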

4 Performance Testing of VTarget

4.1 IO Performance

Compared with the SCSI Target Emulator, the VTarget system has more advantages in the management of storage resources and data security because of its virtualization function, but this is not achieved at the cost of the target's performance. We used Iometer on the application servers for testing, and measured the total bandwidth and average response time of multiple network disks in two cases: using VTarget as the target and using the SCSI Target Emulator as the target. The IO requests were fully sequential, all of size 128 KB, with a read/write proportion of 80%/20% (this proportion matches the read-write characteristics of common data streams). We used VTarget's virtualization function to create one virtual device on each physical device, and measured the total bandwidth and average response time for different numbers of virtual devices. We then compared the VTarget system with the SCSI Target Emulator under the same test environment. The comparisons of total bandwidth and average response time are shown in Figure 4-a and Figure 4-b respectively.

Fig. 4. IO performance of VTarget and the SCSI Target Emulator: (a) total bandwidth; (b) average response time

The comparison of the performance test results shows that the total bandwidth of VTarget is lower than that of the SCSI Target Emulator, but the difference does not exceed 3.6%; the average response time is higher, but the increase does not exceed 8%.


This means the complex virtualization functions of VTarget do not cause much extra cost. The reason is that the main latency of IO operations, at the millisecond level, occurs in the read-write operations on the disks, while the latency of address mapping on the data path is at the microsecond level. The latter is negligible relative to the read-write latency, so the introduction of virtualization functions does not noticeably affect IO performance.

4.2 CPU Utilization

The performance test in Section 4.1 treated the whole VTarget system as a set of virtual devices and measured the access bandwidth and average latency that the system provides; this is VTarget's external behavior. Because an address mapping step is added to the IO path, there is also some influence on the system itself, most visibly on CPU utilization. In this section we test the relationship between read/write bandwidth and CPU utilization for the VTarget system and the SCSI Target Emulator; these relationships are shown in Figure 5-a and Figure 5-b. The Iometer parameters were the same as before: 100% sequential requests with a block size of 128 KB. The difference is that the read-bandwidth test used 100% read requests and the write-bandwidth test used 100% write requests. We used all-read or all-write requests to test the relationship between total bandwidth and CPU utilization because write requests are more complicated than read requests in the VTarget system: for a write request, the VTarget system must be interrupted twice, to accept the command and the data separately, whereas a read request needs only one interruption for the command.

Fig. 5. The relation between CPU utilization and bandwidth

From the figures we can see that in both the VTarget system and the SCSI Target Emulator, an increase in read/write bandwidth causes an increase in CPU utilization. But over the same range, the increase in CPU utilization of the VTarget system is larger than that of the SCSI Target Emulator.


This means the VTarget system does lengthen the processing path of IO requests, and the load of the system grows. The VTarget prototype system reached about 120 MB/s read bandwidth or 100 MB/s write bandwidth in this study, limited by the 160 MB/s bandwidth of the Ultra160 SCSI bus. Even at these bandwidths, the CPU utilization was less than 25%, leaving ample computation resources free. Even if the computation resources were to become insufficient, this could be managed by adding computation resources or upgrading the hardware.

5 Summary and Future Work

The VTarget prototype system we implemented is based on UNH's SCSI Target Emulator. We added several modules, including address mapping, storage device management, access control, and metadata management, so that the VTarget system provides storage virtualization management functions. When performing IO requests, the information of virtual devices and storage organization is queried first through the address mapping process, which keeps the application servers aware of changes to the virtual devices. On the other hand, virtual devices are organized in table form, so virtual storage spaces can be managed conveniently. The combination of the two gives the VTarget system the ability to create virtual devices and extend their capacity dynamically, so application servers can use the storage more conveniently. The VTarget system thus satisfies the demand for dynamic configuration in e-business applications. The VTarget system applies device-level access control, so application servers only have the authority to perform IO operations on the virtual devices permitted to them; the security of application data is increased. In addition, the VTarget system is capable of checking the configurations of the metadata, so the metadata can be recovered selectively; better availability is guaranteed in this way. The performance testing of the VTarget system showed that the additional virtualization functions in VTarget do not impact the target's IO performance much and consume only a small amount of the target's own CPU resources. The virtualization function in our VTarget prototype system only addresses the problem of capacity; other dimensions, such as bandwidth and latency, are not considered, and the system cannot adapt to sudden changes in the application servers' storage requirements. To achieve this goal we must schedule the IO requests that the VTarget system is handling: we will monitor the features of all IO requests and schedule them rationally, according to the demands of the storage requirements. In short, our next goal is to provide storage services with performance guarantees.

Acknowledgement

The work described in this paper was supported by the National High-Tech Program Plan of China under Grant No. 2004AA111120. We greatly appreciate our teachers' meaningful advice.


We are also grateful to our colleagues, including Bigang Li, Di Wang, Fei Mu and Junfei Wang, for their kind help.

References

1. Ashish Palekar. Design and Implementation of a Linux SCSI Target for Storage Area Networks. Proceedings of the 5th Annual Linux Showcase & Conference, 2001.
2. David C. Teigland, Heinz Mauelshagen. Volume Managers in Linux. Sistina Software, Inc., http://www.sistina.com, 2001.
3. Steven Pratt. EVMS: A Common Framework for Volume Management. Linux Technology Center, IBM Corp., http://evms.sf.net.
4. Chang-Soo Kim, Gyoung-Bae Kim, Bum-Joo Shin. Volume Management in SAN Environment. ICPADS 2001, pages 500-508, 2001.
5. SCSI Primary Commands - 3 (SPC-3), SCSI Block Commands - 2 (SBC-2), Working Drafts. http://www.t10.org, 2004.
6. C. R. Lumb, A. Merchant, and G. A. Alvarez. Façade: Virtual storage devices with performance guarantees. In Proceedings of the 2nd USENIX Conference on File and Storage Technologies, pages 131-144, San Francisco, CA, April 2003.
7. StoreAge Networking Technologies Ltd. High-Performance Storage Virtualization Architecture. http://www.store-age.com, 2001.
8. Intel Corp. Intel iSCSI project. http://sourceforge.net/projects/intel-iscsi, 2001.
9. Li Bigang, Shu Jiwu, Zheng Weimin. Design and Optimization of an iSCSI System. In H. Jin, Y. Pan, N. Xiao and J. Sun (Eds.), GCC 2004 Workshop on Storage Grid and Technologies, LNCS 3252, pages 262-269, 2004.
10. Shu Jiwu, Yao Jun, Fu Changdong, Zheng Weimin. A Highly Efficient FC-SAN Based on Load Stream. The Fifth International Workshop on Advanced Parallel Processing Technologies (APPT 2003), Xiamen, China, September 2003, LNCS 2834, pages 31-40, 2003.
11. John Wilkes. Traveling to Rome: QoS specifications for automated storage system management. In Proceedings of the International Workshop on Quality of Service (IWQoS), pages 75-91, June 2001.
12. Lan Huang, Gang Peng, and Tzi-cker Chiueh. Multi-Dimensional Storage Virtualization. In Proceedings of the Joint International Conference on Measurement and Modeling of Computer Systems, pages 14-24, New York, NY, USA, 2004. ACM Press.