Determining Overhead, Variance & Isolation Metrics in Virtualization for IaaS Cloud

Bukhary Ikhwan Ismail, Devendran Jagadisan, Mohammad Fairus Khalid
MIMOS Berhad, Malaysia
{Ikhwan.ismail, deven.jagadisan, fairus.khalid}@mimos.my

Abstract

Infrastructure as a Service (IaaS) cloud provides an open avenue for consumers to easily own computing and storage infrastructure. Users can invent new services and deploy them in a fraction of the time previously required. As services are consumed, users expect consistent performance from their virtual machines. To host an IaaS, virtualization performance and behaviour must first be understood. In this paper we present virtualization benchmarking methods, results and analysis under various resource utilization scenarios. The scenarios are Performance Overhead, the cost of virtualization compared to physical hardware; Performance Variance, the change in VM instance performance as more VMs are instantiated; and Performance Fairness or Isolation, the study of fairness across guest instances under a stressed environment. These studies provide a fundamental understanding of virtualization technology.

1. Introduction

"Infrastructure as a Service" (IaaS) is a commodity-based resource built on a pool of physical servers. The underlying infrastructure consists of hardware, i.e. servers, storage and network switches, of a heterogeneous nature. On top of these, software stacks such as hypervisors, operating systems and hardware drivers play an important role in assembling an IaaS. Any change in the technology stack will affect the performance of the virtual machines (VMs), or guest instances. In order to control and manage the service effectively, we need to understand the performance and problems of each layer of the technology stack. In this paper, we present benchmarking methods, tools and results based on our initial infrastructure setup. KVM is used as our deployment hypervisor. Benchmarking covers performance overhead, variance, isolation or fairness, and the overhead of the guest OS. To direct our findings, we create test cases based on the needs of IaaS designers and application developers who are interested in performance. Understanding virtualization behaviour can guide design decisions for cloud deployment infrastructure. Section 2 gives a brief review of virtualization technology. Section 3 explains our test strategies, metrics, methodologies and the rationale behind each metric. Section 4 discusses tools and setup. Sections 5 and 6 show results and discussion. Lastly, we present our summary.

2. Virtualization

Virtualization is the main building block for cloud computing. Virtualization is the process of hiding the underlying physical hardware and making it transparently usable and shareable by multiple VMs.

2.1 KVM

One of the most important innovations in Linux is its transformation into a hypervisor. KVM, Lguest, UML and IBM z/VM are some examples of Linux-based hypervisors. They provide an isolated virtual hardware platform for execution that in turn provides the illusion of full hardware access to the guest OS. Updates or optimizations to Linux components benefit both the host and the guest OS.


KVM turns the Linux OS into a hypervisor by loading the KVM modules into the kernel. Every KVM guest is treated as a normal Linux process. Linux consists of Kernel and User modes; KVM adds a third mode called Guest Mode, within which the guest runs its own kernel and user modes. KVM consists of 1) a device driver for managing the virtual hardware and 2) a user-space component for emulating PC hardware. The Linux kernel handles this very efficiently. In contrast, non-kernel-based hypervisors must put great effort into their own scheduler and memory management systems. (Jones, 2009)

2.2 Memory

Memory is virtualized by KVM through the /dev/kvm device. This involves sharing the physical RAM and dynamically allocating it to VMs. VM memory is very similar to the virtual memory used in modern OSes. Applications see it as a contiguous address space that is tied to the underlying RAM in the host. The OS keeps the mappings of virtual page numbers to physical page numbers in page tables. (VMware Inc 2007) Each guest operating system has its own address space that is mapped when the guest is instantiated. A set of shadow page tables is maintained to support the translation from VM physical addresses to host physical addresses. (Jones, 2007)

2.3 I/O Operation

I/O operations for the VM operating system are provided by QEMU. QEMU is a platform virtualization solution that allows virtualization of an entire PC environment (including disks, graphics adapters and network devices). Any I/O request a VM OS makes is intercepted, routed to user mode and emulated by the QEMU process.
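To make the module-loading point concrete, the following is a minimal sketch of ours (not from the paper) that checks whether a host is ready to run KVM guests; it relies only on the /dev/kvm device and the loaded-module list described in this section.

    import os

    # Minimal sketch: verify the host can run KVM guests. The generic kvm
    # module exposes /dev/kvm, and a vendor module (kvm_intel or kvm_amd)
    # must be loaded alongside it.
    def kvm_ready() -> bool:
        if not os.path.exists("/dev/kvm"):            # device created by the kvm module
            return False
        with open("/proc/modules") as f:
            loaded = {line.split()[0] for line in f}  # first field is the module name
        return "kvm" in loaded and bool({"kvm_intel", "kvm_amd"} & loaded)

    print("KVM available:", kvm_ready())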

3. Benchmarking

3.1 Testing Methodologies

Experiments are designed to evaluate the virtualization overhead, variance and isolation of resource elements. Through these 3 metrics, we measure CPU, memory, storage and application behaviour (database and Java). The results obtained from these studies act as initial results for further work. Next, we justify the chosen metrics.

Overhead is the performance difference between the guest VM and the actual bare-metal host. This metric gives us insight into all resource elements and supports the explanation of the variance, isolation and fairness properties.

Variance: in a virtualized environment, resources are shared among guest OSes. By incrementally increasing the number of guest OSes within a single host, we can discover the performance effects.

Isolation or fairness is the act of securing and providing resources to a VM in an isolated container where other VMs may co-exist on the same host. Ideally, all host resources must be shared equally. It is a desirable metric in this environment in order to guarantee SLAs. For example, if one VM starts misbehaving, i.e. consuming all of its own memory, other VMs should not be affected and should continue to run normally. Here we test by simultaneously running resource-intensive applications to observe any non-isolated behaviour of the VM guests. Another area of isolation study is security: if an attacker compromises a VM, or part of it, and the VM is well isolated, the attack will not jeopardise other VMs (Why Virtualization 2009). We do not test this aspect. The isolation metric is among the most discussed topics in virtualization studies. (Che Jianhua 2008)(Gaurav Somani 2009)(IBM June 23-24, 2008)

Application-Specific Behaviour: guest VM overhead, variance and isolation show micro-level resource utilization and behaviour. Server applications consume all resources together and produce macro-level results. Here we test SQLite and Java to benchmark our guest OS.

3.2 Setup

Below are the testbed configuration and VM settings.

Processor      4 Cores Intel Xeon CPU E5405 @ 1.99GHz
Mainboard      Dell Precision WorkStation T5400
Chipset        Intel 5400 Chipset Hub
Memory         2 x 4096 MB 667MHz
Disk           750GB Hitachi HDS72107
Graphics       nVidia Quadro FX 1700
OS             CentOS 5.3 64bit
Kernel         2.6.18-128.4.1.el5 (x86_64)
File System    EXT3

Table 1: Host System Specification

Processor      QEMU Virtual CPU 0.9.1 @ 7.97GHz (Total Cores: 1)
Mainboard      Unknown
Chipset        Unknown
Memory         512MB
Disk           12GB QEMU Hard disk
Graphics       Nil
OS             Fedora release 10 (Cambridge) (Eucalyptus Image)
Kernel         2.6.28-11-generic (x86_64)
File System    ext3

Table 2: Guest VM Specification

3.3 Testbed Setup


Figure 1: Testbed Setup

The test environment uses CentOS 5.3 as the base operating system due to its stability and reliability. The following libraries are used:

- KVM 85
- KMOD-KVM-2.6.3.0.1
- libvirt 0.6.5
- QEMU-KVM 10.6.0

KVM 85 and KMOD-KVM-2.6.3.0.1 were installed from the yum repositories, while libvirt 0.6.5 was compiled from source. The libvirt and qemu-kvm libraries are compiled with the default parameters. The virtual machines run on the local disk of the physical machine. In our benchmark, the base images for the virtual machines are in "raw" format.
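For illustration, a raw base image like the one described above can be created with the qemu-img utility (cited in the references). The sketch below assumes a hypothetical image path; the 12GB size mirrors Table 2.

    import subprocess

    # Create a 12GB raw-format base image for a guest. The image path is an
    # assumption; "create -f <format> <file> <size>" is the standard qemu-img form.
    subprocess.run(
        ["qemu-img", "create", "-f", "raw",
         "/var/lib/libvirt/images/base-guest.raw", "12G"],
        check=True,
    )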

4. Tools & Methodology

For each performance test, we perform 10 rounds per metric and take the average. Each test tool has input parameters which significantly affect the resulting measurements, so each parameter was chosen with careful consideration. Below we list all the tools used in our benchmark tests; a minimal harness sketch automating the repeat-and-average methodology follows the list.

- CPU: 7-Zip compression - http://www.7-zip.org/
- Memory: RAMspeed - http://alasir.com/software/ramspeed
- Storage: IOzone - www.iozone.org; Tiobench - http://tiobench.sourceforge.net/
- Application: Java (SciMark 2.0) - http://math.nist.gov/scimark2/; Database (SQLite) - http://www.sqlite.org/
- Network: Netperf - http://netperf.org
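As referenced above, the following is a minimal sketch of the repeat-and-average methodology, not the paper's actual harness; the benchmark command and the score parsing are placeholders that each real tool would replace with its own invocation and output parser.

    import statistics
    import subprocess

    ROUNDS = 10  # rounds per metric, as described in the methodology

    def run_once(cmd):
        """Run one benchmark round and parse its score (placeholder parser)."""
        out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
        return float(out.strip().splitlines()[-1])  # assume the score is on the last line

    # "./benchmark.sh" is a hypothetical wrapper around one of the tools above.
    scores = [run_once(["./benchmark.sh"]) for _ in range(ROUNDS)]
    print("mean=%.1f stdev=%.1f" % (statistics.mean(scores), statistics.stdev(scores)))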

5. Results

5.1 Overhead

Here we present the overhead results, first the overhead incurred by each server element and then the application-specific overhead. For the guest VM, we use the exact configuration (amount of memory, number of CPU cores, etc.) of the physical host. This exposes any performance degradation of the virtual machine relative to the host.

Figure 2: CPU Host vs. Guest (MIPS: host 6139, guest 6077)

CPU on the guest machine is near host-native performance, with 1% overhead, as shown in Figure 2.

Figure 3: Memory Host vs. Guest (MB/s: int Add host 2579 / guest 2282; int Copy host 2367 / guest 2323; int Scale host 2366 / guest 2328)

We benchmark memory using RAMspeed with 3 subtests. Memory shows low overhead, as shown in Figure 3.

Figure 4: IOZone Disk Host vs. Guest (4GB write and read, MB/s)

Figure 4 shows that read performance suffers more than write: read performance drops 72% while write drops 46%. KVM's I/O handling for disk adds a significant performance penalty compared to memory and CPU, which perform much better.
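The paper does not list its exact IOzone parameters; the invocation below is a plausible reconstruction of a 4GB sequential write and read run using standard IOzone flags, and the file path is an assumption.

    import subprocess

    # Hypothetical IOzone invocation: 4GB file, 64KB records, write then read.
    # -i 0 = write/rewrite test, -i 1 = read/reread test,
    # -s = file size, -r = record size, -f = test file location.
    subprocess.run(
        ["iozone", "-i", "0", "-i", "1", "-s", "4g", "-r", "64k",
         "-f", "/tmp/iozone.tmp"],
        check=True,
    )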

Figure 5: Network Bulk Data Transfer Performance (Mbps: between hosts 941 TCP / 852 UDP; between guests in a single host 232 TCP / 405 UDP; between guests across nodes 157 TCP / 257 UDP)

The network throughput and latency tests using netperf take a different approach from the other tests. We measure throughput and latency between physical hosts, between guests within the same host, and between guests across physical hosts. These tests reflect the actual performance differences introduced by virtualization.


In the bulk data transfer tests (TCP_STREAM and UDP_STREAM), 4 common attributes may affect performance indirectly: 1) being network-bound, where speed is limited by the slowest link on the network; 2) being CPU-bound, since the networking stack requires CPU cycles per KB of data transferred; 3) the distance the data needs to travel; and 4) packet loss. (Care and Feeding of Netperf 2.4.X, 2007) Figure 5 shows the results of the bulk data transfer test. TCP throughput on the VM guest drops 75% compared to the host machine, while UDP suffers a 55% drop. Virtualization adds significant overhead to throughput. While UDP seems to fare better than TCP in a virtualized environment, the result is inconclusive. One plausible explanation is that the TCP protocol is more expensive in the resources it consumes than UDP: TCP is reliable, connection-oriented and sequenced, and requires acknowledgement on both the sender and the receiver (Ghori Asghar 2007). In a virtual environment this usage may be amplified, since each entity, VM and host, competes for the resources. In a test done on the Amazon EC2 infrastructure, the TCP and UDP results fluctuated in selected cases. EC2 categorises its guest VM types as small, medium or large instances. In the "small" VM type, UDP network performance is better than TCP, while in the "medium" VM type TCP is better. The study concludes that small VMs are prone to processor sharing, and that the throughput drop is caused by this. In any case, network virtualization requires a high degree of resource sharing and CPU processing. (Guohui Wang, 2010) Apart from the common attributes discussed earlier, the hypervisor adds another layer of processing in order to emulate or virtualize the network. To virtualize the network, outgoing traffic must be multiplexed together before being sent out, and incoming traffic de-multiplexed before delivery to the designated VM guest. Second, the VMM must protect the VMs from each other, so that a VM cannot read or write another VM's network interfaces. (Scott Rixner, 2008) Enabling resource sharing and enforcing security come at the expense of a performance drop and increased complexity in managing network QoS.
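As an illustration of the measurement method, the sketch below drives netperf for the stream and request/response tests behind Figures 5 and 6. The target address and 60-second duration are assumptions, since the paper does not state its parameters; a netserver instance must already be running on the target.

    import subprocess

    TARGET = "192.168.0.2"  # hypothetical address of the machine running netserver

    # TCP_STREAM/UDP_STREAM measure bulk throughput (Figure 5);
    # TCP_RR measures request/response transactions per second (Figure 6).
    for test in ("TCP_STREAM", "UDP_STREAM", "TCP_RR"):
        result = subprocess.run(
            ["netperf", "-H", TARGET, "-t", test, "-l", "60"],  # 60-second run
            capture_output=True, text=True, check=True,
        )
        print(test)
        print(result.stdout)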

Figure 6: Network Latency (transactions per second: between hosts 10,623; between guests in a single host 3,692; between guests across nodes 2,866)

Figure 6 shows the latency results. The measurement is in transactions per second; a higher value reflects lower latency, or greater speed. Virtualization adds a 65% performance overhead, and the link between instances on different nodes adds further latency, a drop of another 22%. Latency is typically incurred in the processing of network data. It can be temporary, lasting a few seconds, or persistent, depending on the source of the delays.


Figure 7: Java Host vs. Guest (rendering time in seconds: host 9.37, guest 9.47)

Figure 8: SQLite Host vs. Guest (time in seconds: host 100.43, guest 111.94)

As Figure 7 shows, the performance overhead on the guest is small: only 1.07% slower than physical. Figure 8 shows an 11.46% SQLite performance drop for the guest OS.

5.2 Variance & Isolation

Here we present the results of stressing the physical host's compute, memory and disk read/write I/O, and the effects on network throughput.

5.2.1 CPU

Figure 9: CPU Variance Chart (MIPS: 1 guest 1792, 2 guests 1720, 4 guests 1373)

In Figure 9 we can see the performance degradation as more VMs are incrementally added to a single node. The chart shows the variance as more guests are present: performance varies between 1792 and 1373 MIPS, or 20%.

Figure 10: CPU Fairness Chart (MIPS: 1st guest 1396, 2nd guest 1370, 3rd guest 1394, 4th guest 1334)

Figure 10 gives insight into the performance of each of the 4 guests. The guests show +/- 4% differences between each other. These results show that concurrently stressing the CPU does not significantly affect the guest OSes relative to one another. We allocate each host CPU core to a single guest, so no overselling test was conducted.

Figure 11: 1 Guest with 4 CPUs vs. 4 Guests with 1 CPU each (total MIPS: 6077 vs. 5494)


Figure 11 presents CPU consumption from a different view: the total MIPS of 4 guests with 1 core each, compared to 1 guest running all 4 cores. The purpose of this analysis is to show how much compute power is wasted by dividing the CPU cores among the guests. When the 4 cores are divided one per guest, each guest gets an average of 1373 MIPS, about 22% of the single-guest total. The 4 single-core guests together reach 5494 MIPS versus 6077 MIPS for the one 4-core guest, i.e. (6077 - 5494) / 6077, a loss of about 9.6% of compute power compared to assigning all cores to a single guest.

5.2.2 Memory

The host machine has 2 DIMMs of 4GB RAM, and each guest has 512MB of RAM. At the peak of the stress test only 2GB of RAM is utilized, or 25% of the available RAM. Even so, the performance variance is affected greatly.

Figure 12: RAMSpeed Memory Variance Chart (MB/s, int Add/Copy/Scale: 1 guest 2125/2122/2081; 2 guests 1186/1169/1164; 4 guests 618/611/601)

Figure 12 shows the memory stress test. Memory performance decreases by an average of 43% when running 2 guests, and suffers a 70% drop when running 4 guests. Performance degradation occurs as more guest VMs run on a single machine: performance varies between 2081 and 611 MB/s, or 70%. KVM itself does not manage the memory assigned to the VM. On the physical machine, the libvirt API is used to create the VM, and a user-space emulator called qemu-kvm provisions it. libvirt initiates the VM with the requested hardware, such as the number of CPUs, memory size, disk, etc. qemu-kvm prepares the VM's address space and handles I/O between the guest and the physical host. Each VM runs as a separate qemu-kvm process, with the memory size stated up front by libvirt. Therefore, the more VMs launched on a physical host machine, the higher the performance degradation we observe.
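To make the provisioning path just described concrete, this is a minimal sketch using the libvirt Python bindings. The domain definition mirrors the guest specification in Table 2, but the guest name and disk path are illustrative assumptions, not the paper's actual configuration.

    import libvirt

    # Hypothetical transient guest matching Table 2: 1 vCPU, 512MB RAM, raw disk.
    DOMAIN_XML = """
    <domain type='kvm'>
      <name>bench-guest-1</name>
      <memory unit='MiB'>512</memory>
      <vcpu>1</vcpu>
      <os><type arch='x86_64'>hvm</type></os>
      <devices>
        <disk type='file' device='disk'>
          <driver name='qemu' type='raw'/>
          <source file='/var/lib/libvirt/images/bench-guest-1.raw'/>
          <target dev='hda' bus='ide'/>
        </disk>
      </devices>
    </domain>
    """

    conn = libvirt.open("qemu:///system")  # connect to the local qemu-kvm driver
    dom = conn.createXML(DOMAIN_XML, 0)    # boots the guest as one qemu-kvm process
    print("started guest:", dom.name())
    conn.close()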

Figure 13: RAMSpeed Memory Fairness Chart (int Add, int Copy and int Scale for instances 1-4, MB/s)

Figure 13 shows good memory isolation, with +/- 5% differences between guests. Each VM sees a fair share, even though the variance across guest counts is high.

5.2.3 Disk


For disk, we run a maximum of 2 guests concurrently. In the 4-guest test, the guest OSes hang or become unresponsive due to the nature of the test.

Figure 14: IOZone Disk Variance Chart (4GB write and read with 1 and 2 guests, MB/s)

Testing performance variance on disk worsens the picture further when comparing 1 guest with 2 guests, as shown in Figure 14. In the IOzone read test the performance drop for 2 guests is 97%, while the write test suffers a 52% drop.

Figure 15: IOZone Disk Fairness Chart (4GB write: 1st guest 47.18, 2nd guest 50.96 MB/s; 4GB read: 1st guest 9.03, 2nd guest 8.49 MB/s)

In Figure 15, both guest OSes suffer almost the same degradation. Even though the performance drop caused by the total number of guest OSes is significantly high, the performance is divided almost equally between the two guests in the IOzone test.

Figure 16: Tiobench Disk Variance Chart (latency in hundreds of microseconds: 64MB and 256MB write and read, 1 vs. 2 guests)

Tiobench performs a more rigorous test than IOzone. Figure 16 shows the performance variance graph, while Table 3 shows the percentage drop for each variance.

Test         Variance
WRITE 64     35%
WRITE 256    34%
READ 64      1200%
READ 256     150%

Table 3: Performance Variance

Test         Fairness
WRITE 64     +/- 87%
WRITE 256    +/- 69%
READ 64      +/- 2400%
READ 256     +/- 236%

Table 4: Performance Fairness

Figure 17: Tiobench Disk Fairness Chart (latency in hundreds of microseconds: 64MB and 256MB write and read, 1st vs. 2nd guest)

The performance isolation test for Tiobench, reflected in Figure 17 and Table 4, shows that multithreaded read/write produces poor resource distribution: one guest gets markedly higher I/O than the other. In Tiobench, multiple I/O requests are issued with a total of 64 threads running at one time, making this a more rigorous test than IOzone. Across the IOzone and Tiobench tests, read variance is much worse than that of the write operations.

5.2.4 Application

Figure 18: Java Test (rendering time in seconds: 1 guest 36.95, 2 guests 38.35, 4 guests 45.03)

Figure 18 shows the results of Sunflow rendering on concurrently running guests. There is a 3.79% degradation from 1 guest to 2 guests, and another roughly 20% drop with 4 guests running. Application benchmarking with Java does not reflect the full performance drop seen in the CPU, memory and disk benchmarks. This test is analogous to the CPU benchmark, and the trend of the performance drop is quite similar.

Figure 19: SQLite Test (time to complete in seconds: 1 guest 114, 2 guests 121, 4 guests 136)

Figure 19 shows the result of the SQLite test, the time to finish 125,000 queries. It shows a performance variance of 21%. For performance isolation, SQLite shows +/- 23%; for Java, the isolation value is 20%.

6. Discussion

ELEMENT          OVERHEAD        VARIANCE        FAIRNESS
CPU              1%    GOOD      20%   GOOD      4%     GOOD
MEMORY           5%    GOOD      70%   BAD       5%     GOOD
IOzone Write     46%   FAIR      46%   FAIR      17%    GOOD
IOzone Read      72%   BAD       98%   BAD       5%     GOOD
Tiobench Write   1087% V.BAD     35%   FAIR      87%    BAD
Tiobench Read    657%  V.BAD     1200% V.BAD     2400%  V.BAD
JAVA             1%    GOOD      21%   GOOD      23%    GOOD
SQL              11%   GOOD      26%   GOOD      20%    GOOD

(Good: 0-30%, Fair: 31-50%, Bad: 50-100%)

Table 5: Overall Results

Table 5 summarizes the overhead, variance and fairness of the server elements. CPU scores good on all metrics. Memory, on the other hand, shows bad variance. For the IOzone disk tests, both read and write show a good distribution of fairness between competing guests. Tiobench applies a much harsher regression test than IOzone, and all three performance metrics come out bad. In virtualization, the biggest complaint is sluggish disk I/O. A VM disk will never perform like a physical disk: in full virtualization, the I/O channel, disk in particular, significantly degrades overall performance. To improve disk I/O, it is advisable to get high-performance disks while waiting for full virtualization to mature. For example, SCSI drives still outperform even the highest-end IDE drives, so an IaaS provider should consider the widest transfer rates and largest drive caches it can afford. Because of disk I/O overhead, a high CPU core count or large memory will not be fully utilized. (Kenneth Hess 2009) The performance shown in our tests reflects the worst case for each of the server elements taken together. Even though memory shows bad variance and disk shows tremendous overhead, bad variance and poor isolation, this is not fully reflected in the SQLite and Java tests. The application test cases also represent worst-case scenarios, where each application server fully occupies one node. For the network, we can see that UDP throughput is much better than TCP, but we could not fully justify this characteristic. There are suggestions for improving network I/O throughput; for example, a dedicated NIC per VM may be advisable for highly network-dependent workloads such as web, application or terminal servers. (Kenneth Hess 2009) The VMM introduces additional overhead on the guest VM compared to native execution, due to either device emulation or the type of virtualization (i.e. para-virtualization or full virtualization). Another factor can be attributed to domain scheduling within the VMM, which schedules the shared I/O devices. For example, even when a single virtual machine runs on a single host, sending or receiving network packets involves two domains, the driver and the guest, which must be scheduled; poor scheduling can increase network latency. (Scott Rixner, 2008) Further improvement of the virtual machine can be achieved by using Virtio devices (libvirt: Wiki: Virtio n.d.). The Virtio drivers are optimized for KVM and improve the disk and network performance of the VM.
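As a hedged illustration of the Virtio suggestion above, the snippet below shows what virtio-backed disk and network devices look like as fragments of a libvirt domain definition; the file path and bridge name are assumptions.

    # Virtio device fragments for a libvirt domain definition (illustrative only).
    # Compared to the emulated IDE disk and NIC used in our guests, the virtio bus
    # gives the guest paravirtualized drivers with far less emulation overhead.
    VIRTIO_DEVICES = """
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/var/lib/libvirt/images/bench-guest-1.raw'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <interface type='bridge'>
      <source bridge='br0'/>
      <model type='virtio'/>
    </interface>
    """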


Choosing the right OS I/O scheduler for the host machine and the virtual machine may also improve virtual machine performance: (David Boutcher n.d.) suggests that the NOOP scheduler improves the throughput of virtual machines. A minimal sketch of switching the scheduler through sysfs follows below.
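    # Hedged sketch: select the noop elevator for a block device through sysfs.
    # The device name is an assumption (vda for a virtio disk inside a guest);
    # the change needs root privileges and does not persist across reboots.
    DEV = "vda"
    path = "/sys/block/%s/queue/scheduler" % DEV

    with open(path) as f:
        print("available schedulers:", f.read().strip())  # e.g. "noop anticipatory deadline [cfq]"

    with open(path, "w") as f:
        f.write("noop")  # takes effect immediately for this device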

From our studies, we have found several performance issues in the server elements. We conclude that a more dynamic deployment policy for the cloud is needed in order to allocate guest resources more efficiently.

7. References

1. CentOS - Community Enterprise OS. 4 November 2009. http://www.centos.org/ (accessed November 18, 2009).
2. Che Jianhua, He Qinming, Qinghua Gao, Dawei Huang. "Performance Measuring and Comparing of Virtual Machine Monitors." IEEE/IFIP International Conference on Embedded & Ubiquitous Computing. IEEE Computer Society, 2008. 381.
3. David Boutcher, Abhishek Chandra. "Does Virtualization Make Disk Scheduling Passé?" Workshop on Hot Topics in Storage and File Systems (HotStorage'09). Montana.
4. Eucalyptus Networking_v1.5.2 - Eucalyptus. http://open.eucalyptus.com/wiki/EucalyptusNetworking_v1.5.2#novlan (accessed August 15, 2009).
5. Gaurav Somani, Sanjay Chaudhary. "Application Performance Isolation in Virtualization." IEEE International Conference on Cloud Computing, 2009: 41-48.
6. Hollander, Rhett M. RAMspeed, a cache and memory benchmark. November 2002. http://www.alasir.com/software/ramspeed/.
7. IBM. "Quantitative Comparison of Xen and KVM." 2008 Xen Summit. Boston, June 23-24, 2008.
8. Java SciMark 2.0. 31 March 2004. http://math.nist.gov/scimark2/ (accessed September 10, 2009).
9. Jones, M. Tim. "Discover the Linux Kernel Virtual Machine: Learn the KVM architecture and advantages." IBM developerWorks. 18 April 2007. http://www.ibm.com/developerworks/linux/library/l-linux-kvm/ (accessed October 25, 2009).
10. Kenneth Hess, Amy Newman. Practical Virtualization Solutions. Pearson, 2009.
11. QEMU disk image utility - Linux man page. http://linux.die.net/man/1/qemu-img (accessed November 19, 2009).
12. Qiang Li, Qinfen Hao, Limin Xiao, Zhoujun Li. "VM-based Architecture for Network Monitoring and Analysis." The 9th International Conference for Young Computer Scientists. IEEE Computer Society, 2008. 1395.
13. tiobench benchmark. 2 October 2002. http://linuxperf.sourceforge.net/tiobench/tiobench.php.
14. VMware Inc. Understanding Full Virtualization, Paravirtualization, and Hardware Assist. Palo Alto: VMware Inc., 2007.
15. VMware Inc. Technical Note: Networking Performance in Multiple Virtual Machines. VMware Inc., 2007.
16. West, John E. HPCwire: Benchmarking Your Cloud. 16 July 2009. http://www.hpcwire.com/features/Benchmarking-Your-Cloud-50976307.html (accessed July 16, 2009).
17. Why Virtualization. 8 April 2009. http://virt.kernelnewbies.org/WhyVirtualization (accessed November 18, 2009).
18. Hewlett-Packard. Care and Feeding of Netperf 2.4.X. 2007. http://www.netperf.org.
19. Scott Rixner. "Network Virtualization: Breaking the Performance Barrier." ACM Queue, January/February 2008.
20. Guohui Wang, T. S. Eugene Ng. "The Impact of Virtualization on Network Performance of Amazon EC2 Data Center." IEEE INFOCOM'10, San Diego, CA, March 2010.
21. Ghori, Asghar. HP Certified Systems Administrator, 2nd Edition. Endeavor Technologies, 2007.
