Impact of Virtual Memory Managers on Performance of J2EE Applications

Alexander Ufimtsev, Alena Kucharenka, Liam Murphy

Performance Engineering Laboratory, School of Computer Science and Informatics,
University College Dublin, Belfield, D4, Ireland
[email protected], alena
[email protected],
[email protected] http://www.perfenglab.com
Abstract. We investigate the impact of operating system Virtual Memory Managers (VMMs) on the performance of enterprise applications. By taking various popular branches of the Linux kernel and modifying their VMM settings, one can observe the effects on the ECPerf J2EE benchmark; the JBoss application server is used to run ECPerf. Our tests show that changing even one VMM parameter can have a significant performance impact. The performance of various kernel branches is compared, and parameter sensitivity and the influence of specific settings are presented.
1 Introduction
Component systems nowadays run in layers of software and hardware. A typical J2EE system runs inside an application server (AS), which in turn runs inside a Java Virtual Machine (JVM), which runs on an operating system (OS); the OS either runs directly on hardware or within another host OS. Despite its great advantages, this layered model introduces a lot of complexity. For example, the effects of changes introduced at lower layers can be non-obvious, especially if the layers are not directly connected: it is hard to anticipate the performance change of a J2EE application if a certain parameter in the OS is modified.

This paper studies the effects of modifying the kernel's Virtual Memory Manager (VMM) settings and implementation on the performance of J2EE applications. Various Linux kernel branches were examined and tested. Changes of VMM parameters were reflected in the performance of ECPerf, an industry-standard J2EE benchmark. The parameters that affect performance the most were identified, and the influence of specific VMM settings on performance was analysed. It must be noted that no attempt was made to understand the cause of the performance differences: to determine the cause, one would have to examine the source code of the VMM patches, as well as the other changes introduced to specific kernel trees. The motivation for this work is not to find out what caused the differences in performance, but rather to show that they actually happen and should not be ignored.
The rest of the paper is organized as follows: Section 2 contains a detailed description of the kernel branches and VMM parameters, Section 3 describes the experimental setup, including the test scenarios and the hardware and software used, Section 4 contains results and analysis, Section 5 describes related work, and Section 6 concludes the paper and discusses future work.
2 Kernel branches and VMM parameters

2.1 Kernel branches
Linux kernel development is not a centralized or unified process: anyone is free to fork the code and maintain his or her own repository. Though the main development happens within the official 'vanilla' version, with releases available from a central repository (ftp://ftp.kernel.org), many Linux distributions and kernel developers maintain their own branches with changed or augmented functionality. The reasons for keeping a separate branch include, but are not limited to, expanding the set of supported hardware, tightening security, improving stability and performance, maintaining compatibility with older versions, and quick bugfixing. Some kernel branches serve as a playground for new experimental features, which, when proved stable and useful, are merged into the main kernel branch. A number of kernel repositories were examined; the main selection criterion was the presence of VMM patches. The following branches were selected:
– Debian 2.4;
– Debian 2.6;
– SUSE;
– Fedora Core;
– Alan Cox (-ac);
– Andrew Morton (-mm);
– Con Kolivas (-ck).
The Debian 2.4 kernel was selected to compare speeds between the 2.4 and 2.6 kernel branches. Two kernel images were produced for SUSE: one with its original configuration file, and one with the configuration file common to all other kernel images. All kernels, except for Debian 2.4.26 and 2.6.8, were compiled locally with gcc-3.3.5.

2.2 VMM Parameters
This study concentrates on the VMM implementation of the current stable Linux branch, 2.6. VMM parameters, typically configurable at runtime via the /proc/sys/vm interface, were examined, and a list of parameters that could plausibly affect performance was drawn up. Based on their descriptions, the parameters were split into two groups, red and blue. During the experiments we test the hypothesis that the red group is more likely to affect VMM performance, while the blue group is less likely to do so.

Red:
– dirty_background_ratio: the percentage of memory that is allowed to contain 'dirty' (unsaved) data;
– dirty_expire_centisecs: the longest time, in centiseconds, for which data is allowed to remain dirty;
– dirty_ratio: the number of pages, as a percentage of total system memory, at which a process generating disk writes will itself start writing out dirty data;
– dirty_writeback_centisecs: the interval between periodic wakeups of the pdflush writeback daemons;
– page-cluster: tunes the readahead of pages during swap;
– swappiness: drives the swapping decision.

Blue:
– lower_zone_protection: determines how aggressively the kernel defends the lowmem zones;
– min_free_kbytes: forces the Linux VM to keep a minimum amount of memory free;
– vfs_cache_pressure: controls the tendency of the kernel to reclaim the memory used for caching directory and inode objects;
– overcommit_memory: sets the general kernel policy towards granting memory allocations.
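All of these parameters live under /proc/sys/vm and can be changed at runtime (e.g. `sysctl -w vm.swappiness=60`, or `echo 60 > /proc/sys/vm/swappiness` as root) or persisted in /etc/sysctl.conf. The fragment below is an illustrative sketch only; the values shown are common 2.6-era defaults, not tuning recommendations from this study:

```
# Illustrative /etc/sysctl.conf fragment covering the parameters above.
# Values are typical 2.6 defaults, NOT recommendations from this paper.
vm.dirty_background_ratio = 10
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 40
vm.dirty_writeback_centisecs = 500
vm.page-cluster = 3
vm.swappiness = 60
vm.vfs_cache_pressure = 100
vm.overcommit_memory = 0
```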
3 Experimental Environment

3.1 Hardware platform
The testing environment includes three x86 machines:
– application server: Pentium III 866 MHz with 512 MB RAM;
– database: Pentium III 800 MHz with 512 MB RAM;
– client: Pentium IV 2.2 GHz with 1024 MB RAM.
The client machine is at least as powerful as the servers, to ensure it does not become a bottleneck when generating the test load.

3.2 The software environment
The following software was used for testing purposes:
– operating system: Debian GNU/Linux 3.1 'sarge';
– database server: MySQL v. 5.0.7beta-1;
– application server: JBoss v. 4.01sp1 running on Java2SDK 1.4.2_08;
– client: ECPerf suite 1.1 on Java2SDK 1.4.2_08.
Debian 'sarge' was used on all machines. The initial Java heap size of the application server was set to 384 MB.
3.3 ECPerf and its tuning

ECPerf is an Enterprise JavaBeans (EJB) benchmark meant to measure the scalability and performance of J2EE servers and containers. It stresses the ability of EJB containers to handle the complexities of memory management, connection pooling, passivation/activation, and caching. Originally developed by Sun Microsystems, ECPerf is now developed and maintained by SPEC (http://www.spec.org); it is currently available from SPEC under the name SPECjAppServer2004.

ECPerf Configuration: During initial experiments, the following workload was identified as delivering the best performance: txRate = 5, runOrderEntry = 1, runMfg = 1, which is equivalent to 40 threads: 25 for order entry and 15 for planned lines (manufacturing). The following parameters set a 10-minute steady state, with 8-minute warmup and 3-minute cooldown periods, respectively: rampUp = 480, stdyState = 600, rampDown = 180.

3.4 Testing Algorithm
The following pseudocode describes the testing algorithm used in this study. Every kernel is installed and booted into. Then, for every parameter in the red and blue lists, the parameter is set to a specific value and the following happens three times: first the memory is cleaned, then the database is wiped out and recreated, swap is turned off, and the application server is restarted. Logs and traces are saved for every individual run for later analysis.

for (all kernels) {
    boot();
    for (new vmm parameter) {
        set vmm parameter to a new value;
        new average;
        for (int i = 0; i < 3; i++) {
            clean_memory();
            clean_db();
            turn_off_swap();
            average += avg(run_ecperf());
        }
        save_tuple(kernel, parameter, value, average);
    }
}

Values of VMM parameters were changed two, three, or four times, depending on the type of the parameter. Tests with the default values were carried out as well. Swap partitions were turned off for the duration of the tests, due to the significant performance improvement this brought.
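The measurement loop above can be sketched in Python as follows. The helper names here are illustrative, not the authors' actual scripts; the memory/database/swap cleanup steps are folded into the benchmark-runner callable for brevity:

```python
from statistics import mean

def run_test_matrix(kernels, settings, run_ecperf, repeats=3):
    """For each kernel and each (parameter, value) setting, run the
    benchmark `repeats` times and record the mean score in Bops/min.

    `run_ecperf(kernel, param, value)` is assumed to clean memory,
    recreate the database, and turn swap off before each run.
    """
    results = []
    for kernel in kernels:
        for param, value in settings:
            scores = [run_ecperf(kernel, param, value) for _ in range(repeats)]
            results.append((kernel, param, value, mean(scores)))
    return results
```

Each tuple corresponds to one `save_tuple` call in the pseudocode, so the stored average is the mean of three benchmark runs rather than their sum.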
4 Results and Analysis
Figure 1 shows the overall results obtained during the measurements. The results show the averaged performance metric of ECPerf, in business operations per minute (Bops/min). Peak and lowest performance measurements were taken three times, while the default settings were tested ten times. The average performance improvement over standard settings was 5.19%, while the average performance decline due to unintentional misconfiguration was -8.73%. The decline does not take into account the min_free_kbytes VMM parameter, which, if set to a particularly high number, effectively stops the OS from working. It can also be noted that the Debian kernels seem to be more sensitive to misconfiguration. The SUSE kernel branch performed similarly with both the custom configuration and the original one from SUSE. The kernel branch maintained by Con Kolivas (-ck) demonstrated a similar drop in performance at all levels: the price one pays for improved user interactivity and context switching. The reference test subject, the 2.4 kernel, performed noticeably slower than its 2.6 counterparts. Andrew Morton's branch, an experimental playground for new kernel features, demonstrated a performance improvement over the 2.6.11 series; this was subsequently reflected in the mainline 2.6.14 and improved further in 2.6.15. These three kernels also seem to behave much better with default settings. We argue that a performance improvement of over 5% can be quite important, considering that it is caused by changing just one value in the VMM settings. Kernel-specific results are shown in Table 1.

Table 1. Maximum Performance Improvement and Decline for Various Kernels, %

Kernel                  Improvement   Decline
2.6.11.12-ac                   2.95     -4.43
Debian 2.6.8-2                14.96    -20.93
Debian 2.4.26                  9.89    -10.88
Fedora 2.6.11-1.1369           4.25     -5.77
2.6.12.3-ck                    6.02     -5.59
2.6.13-rc3-mm                  3.00     -7.34
Debian 2.6.11                  5.29    -20.00
Suse 2.6.11.4 (1)              2.70     -5.32
Suse 2.6.11.4 (2)              3.61     -6.03
Vanilla 2.6.12.3               4.04     -3.27
Vanilla 2.6.14.3               2.45     -7.15
Vanilla 2.6.15-rc5             3.16     -8.04
Average                        5.19     -8.73
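As a quick consistency check, the column averages in Table 1 can be recomputed from the per-kernel figures (values transcribed from the table):

```python
from statistics import mean

# Improvement and decline percentages per kernel, in table order.
improvements = [2.95, 14.96, 9.89, 4.25, 6.02, 3.00,
                5.29, 2.70, 3.61, 4.04, 2.45, 3.16]
declines = [-4.43, -20.93, -10.88, -5.77, -5.59, -7.34,
            -20.00, -5.32, -6.03, -3.27, -7.15, -8.04]

avg_improvement = round(mean(improvements), 2)  # 5.19
avg_decline = round(mean(declines), 2)          # -8.73
```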
4.1 Error and Sensitivity Analysis
Figure 2 shows the number of failed tests and the standard deviation of the results for the corresponding kernels. The number of failures is quite low, given that the overall number of tests per kernel is 11 kernel parameters × 3 values × 3 ECPerf runs = 99.

Fig. 1. Overall Results

Fig. 2. Number of failed tests and Standard Deviation of results for tested kernels

The vast majority of the failed tests happened because setting the VMM parameter min_free_kbytes to 256 MB effectively cut off half of the application server's memory. Since half of the memory suddenly became unavailable, the JVM failed to start, thus failing the test. The hypothesis behind the preliminary separation of kernel VMM parameters into 'red' and 'blue' groups proved partially true, though the worst offender was still min_free_kbytes. The upper line in Figure 2 shows the averaged standard deviation for the tests. It can be noted that the Debian kernels seem to have a higher deviation, while the worst is Con Kolivas's kernel.

The results of the overall sensitivity analysis are shown in Figure 3. It shows the kernel parameters that have the strongest, though not necessarily the best, influence on the performance of our J2EE system. This pie chart was obtained by summing, for each kernel parameter, the performance spread it caused across the different kernels, as depicted in Equation 1:

    sensitivity(parameter) = Σ_kernels stdev(parameter, kernel)    (1)
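Equation (1) can be implemented directly: for a given VMM parameter, sum over all kernels the standard deviation of the scores measured across that parameter's tested values. A minimal sketch (the data layout and numbers are illustrative, not measured values from the study):

```python
from statistics import pstdev

def sensitivity(scores_by_kernel):
    """scores_by_kernel maps kernel name -> list of Bops/min scores,
    one score per tested value of the parameter under study."""
    return sum(pstdev(scores) for scores in scores_by_kernel.values())
```

A parameter whose scores barely move across its tested values contributes almost nothing, while a parameter that swings performance on many kernels accumulates a large sensitivity.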
It can be seen that dirty_expire_centisecs and min_free_kbytes have the strongest influence on performance, while changes to the other parameters affect kernel performance on a smaller scale. More details are available in Figure 4. An attempt to determine which VMM settings actually result in better performance is shown in Figure 5. The task was non-trivial, since there was no clear winner: each kernel seemed to have its own 'winning' setting.

Fig. 3. Overall sensitivity of various VMM settings

Fig. 4. Sensitivity of various VMM settings and kernels

Fig. 5. Effects of various parameter settings on performance

To overcome this difficulty, the results for each kernel were sorted by the ECPerf Bops/min number, and each of the 33 parameter settings (11 parameters × 3 values) was assigned a rank from 1 (worst performance) to 33 (best performance). The ranks for each parameter/value setting were then aggregated across the different kernel branches, and the results were split into three categories: 'likely positive', 'undefined', and 'negative'. Settings in the 'likely positive' group are likely to increase performance, while settings in the 'negative' group are very likely to degrade performance significantly. The remaining settings are too close to call: they have to be tried individually on each kernel to determine what works best for that specific one.
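The ranking scheme described above can be sketched as follows (hypothetical helper, not the authors' actual analysis script): within each kernel, settings are ranked from 1 (worst Bops/min) up to N (best), and the ranks are then summed across kernels for each setting.

```python
from collections import defaultdict

def aggregate_ranks(results):
    """results maps kernel name -> {(parameter, value): Bops/min}.
    Returns the summed rank per setting; higher totals suggest
    settings that tend to perform well across kernels."""
    totals = defaultdict(int)
    for scores in results.values():
        ordered = sorted(scores, key=scores.get)  # worst score first
        for rank, setting in enumerate(ordered, start=1):
            totals[setting] += rank
    return dict(totals)
```

Using ranks rather than raw Bops/min deltas makes the aggregation robust to the large baseline differences between kernel branches seen in Figure 1.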
5 Related Work
J2EE performance-related research generally deals with the software layer just beneath the application itself, e.g. the application server. The two best-known studies were done by Gorton et al. [1] and Cecchet et al. [2]. A lot of work is currently under way in JVM research, though it mainly concerns common runtime optimizations and garbage collection. An earlier study by the authors deals with the impact of method input parameters on the performance of EJB applications [3].
6 Conclusions and Future Work
Multiple branches of the Linux kernel were tested, using the ECPerf benchmark, to determine whether changes in Virtual Memory Manager (VMM) settings affect the performance of J2EE applications. The parameters that affect performance the most were identified, and the influence of specific VMM settings on performance was analysed. We argue that a performance change of over 5% caused by changing just one setting inside the Virtual Memory Manager is important and cannot be overlooked, especially when dealing with performance-critical systems. We plan further tests, including multiple-parameter optimization and additional application servers (WebSphere Community Edition, WebSphere Enterprise Edition). We also plan to evaluate the influence of different process and I/O schedulers on J2EE performance.
7 Acknowledgment
The support of the Informatics Research Initiative of Enterprise Ireland is gratefully acknowledged.
References

1. Gorton, I., Liu, A., Brebner, P.: Rigorous Evaluation of COTS Middleware Technology. IEEE Computer, March 2003, 50-55.
2. Cecchet, E., Marguerite, J., Zwaenepoel, W.: Performance and Scalability of EJB Applications. In Proc. of the 17th ACM Conference on Object-Oriented Programming (OOPSLA), Seattle, Washington, 2002.
3. Oufimtsev, A., Murphy, L.: Method Input Parameters and Performance of EJB Applications. In Proc. of the OOPSLA Middleware and Component Performance Workshop, ACM, 2004.