Performance Study on SharePoint Workloads in a SQL Server Environment A Dell Technical White Paper
Dell │ SharePoint Solutions Engineering Ravikanth Chaganti and Jisha J August 2010
Executive Summary

A Microsoft® SharePoint® Server 2010 farm hosts the core platform services and applications that provide many different functions for its users. These functions include document management, version control, ease of access, and intuitive administration, to name a few. Fundamental to the architecture of a SharePoint environment is the back-end database. SharePoint relies on Microsoft SQL Server, which needs to be configured correctly in order to save time, effort, and money.

This white paper focuses on the SQL Server I/O subsystem and the role it plays in a SharePoint environment. The goal of this research is to provide guidance and insight into optimizing the database host, and into the related benefits to the overall scalability and performance of a SharePoint farm. This paper provides detailed information on the factors to be considered when designing a farm and how best to configure them. Finally, this paper covers several performance metrics for various farm components and shows how the recommended farm architecture can achieve sub-one-second response times.

Dell is able to provide this data because of an internally developed load generation tool that was created specifically for SharePoint and was used to conduct several experiments intended to stress the SQL Server I/O subsystem. The lessons learned in this paper will help IT managers build more efficient and effective SharePoint environments on a SQL Server back end.
THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF ANY KIND. © 2010 Dell Inc. All rights reserved. Reproduction in any manner whatsoever without the express written permission of Dell Inc. is strictly forbidden. For more information, contact Dell. Dell, the DELL logo, and the DELL badge, and PowerEdge are trademarks of Dell Inc. Microsoft, Windows Server, SharePoint, and SQL Server are registered trademarks of Microsoft Corporation in the United States and/or other countries. Intel and Xeon are registered trademarks of Intel Corporation in the U.S. and/or other countries. EMC is a registered trademark of EMC Corporation. Adobe is a registered trademark of Adobe Systems Incorporated in the United States and/or other countries. Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products. Dell Inc. disclaims any proprietary interest in trademarks and trade names other than its own.
Contents

Executive Summary
Introduction
    SharePoint Farm Topologies
    Microsoft SQL Server and SharePoint 2010
SharePoint Farm Performance Study
    Dell SharePoint Load Generation Framework
        Content Population Tool
        VSTS Load Testing Framework
    Load Testing Workload Test Mix
Test Methodology
Experimental Design
Test Results and Analysis
Conclusion
References
Introduction

Microsoft SharePoint Server 2010 offers functionality that makes it a good choice for many different business scenarios. Typically, SharePoint is deployed in a server farm that includes a web presentation tier, an application tier, and a database tier. More detailed information about designing and building SharePoint farms for organizations with different scale and performance needs is available in a series of white papers on www.dell.com/sharepoint.

This white paper examines how representative SharePoint workloads impact a Microsoft SQL Server hosting the database tier in a SharePoint server farm. Dell has developed a tool that implements common tasks associated with collaboration and document publishing workloads. This tool has been used to place loads on a SharePoint farm, allowing a detailed analysis of the impact of different workload patterns on the database tier. This white paper focuses on the impacts to the SQL Server I/O subsystem, with the goal of providing insight into when changes to the database host are likely to benefit the overall scalability or performance of a SharePoint farm.
SharePoint Farm Topologies

The SharePoint server farm offers opportunities to employ a scale-out philosophy; many of the SharePoint roles can be configured to operate on multiple servers. When used in conjunction with load-balancing techniques, the capacity, availability, and throughput of the farm can be increased relatively easily. Similarly, for deployments with modest scale or performance needs, the roles generally associated with the presentation and application tiers can be consolidated onto fewer servers. In smaller farms, the database server may be the only component that remains separate.

Single-server SharePoint deployments are possible, but are recommended only for development or testing environments because such deployments are effectively locked to the single server and unable to scale. Certain key SharePoint roles cannot be relocated from this topology and deployed onto other servers. This single-server scalability limitation may be overcome by deploying a single physical server but using separate virtual machines to house the database and farm roles. However, the performance and scalability considerations of a virtualized farm are beyond the scope of this white paper.
Microsoft SQL Server and SharePoint 2010

All of the data in a SharePoint farm is stored in content databases on a Microsoft SQL Server host. The data includes everything from simple text-based list items to large binary files that are stored in SharePoint document libraries. When the web front-end servers in a farm process a user request, they query the database server in order to fulfill it. Therefore, the performance of the back-end database can have a significant influence on the perceived speed and quality of the entire farm.

Because of this, it is important to understand the impact of different types of end-user requests on the SQL Server database in a SharePoint farm. To meet this need, the Dell SharePoint Solutions Engineering team developed a load generation tool for SharePoint and conducted several experiments intended to stress the SQL Server I/O subsystem.
SharePoint Farm Performance Study

Microsoft SharePoint 2010 is a versatile platform that can be used in a large variety of ways. Some SharePoint workloads work almost out of the box, others require or allow significant customization, and still others are the result of completely custom-developed applications. This flexibility results in a very large number of possible ways of using SharePoint, which makes it almost impossible to accurately size servers and storage for a SharePoint farm. Also, there is not yet a standard benchmark for sizing SharePoint workloads.

It is very important to be able to provide the right guidance to customers when recommending infrastructure elements of a SharePoint implementation. This understanding of customer needs led to the development of the Dell SharePoint Load Generation framework used to perform load testing of a SharePoint farm.
Dell SharePoint Load Generation Framework

An internally developed load generation framework was used to understand the performance characteristics of the SharePoint farm. This framework includes load testing of SharePoint out-of-the-box usage profiles, such as collaboration and publishing. The Dell SharePoint load generation framework has two components: a content population tool and the Visual Studio Team Suite (VSTS) web test framework.

Content Population Tool

The content population tool is designed to prepare the SharePoint farm for load testing by distributing the SharePoint content across multiple site collections.
Figure 1. SharePoint Content Population Tool
The content population tool was developed to:

- Create SharePoint web applications
- Create site collections
- Add web parts to home pages
- Create document libraries
- Create SharePoint list items
- Upload documents, images, etc.
This tool is capable of populating hundreds of gigabytes of SharePoint content in a few hours. The size of the SharePoint content database and other aspects, such as the number of site collections, vary based on the usage profile selected. A usage profile is a collection of use cases closely mapped to real-world SharePoint usage. To some extent, these usage profiles were mapped to the SharePoint Capacity Planner [1] and other Microsoft recommendations [2]. Although the SharePoint Capacity Planner was intended for MOSS 2007, several aspects of these recommendations still apply to SharePoint 2010 out-of-the-box workloads. The content generated and uploaded by the content population tool serves as a baseline for SharePoint 2010 load testing using the VSTS web test framework.

VSTS Load Testing Framework

Dell's SharePoint load generation framework uses VSTS 2008 to perform load testing. Within VSTS, each load test maps directly to a SharePoint usage profile, and each usage profile defines a list of use cases and how many use cases are run per hour per connected user. Using VSTS 2008 helps in the rapid creation of use cases and the parameterization of those use cases. SharePoint load testing is performed using a VSTS test rig of several physical test agents (shown in Figure 2), and the results are captured in a SQL database on the test controller.
[1] SharePoint Capacity Planner: http://www.microsoft.com/downloads/details.aspx?FamilyID=dbee0227-d4f7-48f8-85f0-e71493b2fd87&displaylang=en
[2] Microsoft SharePoint 2010 Performance and Capacity Management: http://technet.microsoft.com/en-us/library/cc262971.aspx
Figure 2. VSTS Test Rig

[Diagram: a test controller starts the test on ten test agents (Agent 1 through Agent 10), which run the test against the SharePoint farm. The farm consists of an NLB cluster of three web servers, an application server, and a database server.]
Load Testing Workload Test Mix

As mentioned earlier, the load test usage profiles were based on the SharePoint Capacity Planner and other Microsoft recommendations for SharePoint 2010. The System Center SharePoint Capacity Planner defines several usage profiles for both collaboration and document publishing workloads. These usage profiles are categorized as low, medium, and heavy. The categories define several aspects of a usage profile, such as how many requests are sent per hour per connected user, what use cases constitute a load test, and what percentage (test mix) of each use case is used within each load test. Within the scope of this performance study, the heavy collaboration usage profile was used. Table 1 shows the heavy collaboration test mix as suggested by the SharePoint Capacity Planner (SCP).
Table 1. SCP Usage Profile Definition

SCP Usage Profile                     Heavy Collaboration
Home Page Access (%)                  30
List Page Access (%)                  20
Document/Picture Download (%)         15
Document/Picture Upload (%)           8
Search (%)                            15
Total requests/hour/connected user    60
As shown in Table 1, SCP defines only a high level test mix for each usage profile. Table 2 shows a more granular translation of this SCP heavy collaboration usage profile. Several use cases were mapped to each of the categories described by SCP, and the number of use cases per hour per connected user has been assigned.
Table 2. Dell's Test Mix for a Heavy Collaboration Profile

Heavy Collaboration Test Mix          Number of tests/hr/user

Home Page Access
    Read Site Home Page               18
List Page Access
    Read Survey                       6
    Read Lists                        6
Document/Picture Download
    Read Document Library             2
    Read Home to Document Library     1
    Read Wiki Page                    2
    Read Picture Library              1
    Read Home to Wiki Page            2
    Read Home to Picture Library      1
Document/Picture Upload
    Create Wiki Page                  3
    Upload Document                   2
Search
    Search Site                       10
List Item Insertion/Deletion
    Respond to Survey                 2
    Reply to Discussion Topic         1
    Edit Wiki Page                    2
    Comment Home to Blog Post         1

Total tests/hour/connected user       60
It is important to note that Dell's test mix (shown in Table 2) is not a one-to-one mapping to the previously described SCP and Microsoft recommendations. For example, SCP defines total requests per hour per connected user, whereas Dell's heavy collaboration profile defines 60 tests per hour per connected user. Because one test can issue more than one request, this translates to more than 60 requests per hour per connected user. Hence, the results published in this white paper may or may not map directly to the SharePoint Capacity Planner recommendations; they are specific to the workload mix defined in Table 2.
Test Methodology

The intent of the experiments conducted as part of this performance study was to understand how disk I/O and memory requirements scale with the user load on the SharePoint farm. Several load test iterations were conducted with increasing user loads. For example, an initial user load of 500 virtual users was used, and the load was then increased in steps of 500 users until the monitored resource (disk I/O or memory) became a bottleneck.

The data set used to build the content database included several different types of files. These file types included Microsoft Office documents and Adobe® PDF documents, as well as several image formats. Table 3 shows the distribution of file sizes used in this performance study.
Table 3. Data Set

File Size Range      Number of Files
1KB to 500KB         34240
500KB to 1MB         5223
1MB to 10MB          13003
10MB to 70MB         125
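As a rough consistency check on Table 3, the following Python sketch computes the total file count and the smallest and largest total size the listed buckets allow. It assumes binary units (1KB = 1024 bytes), which the paper does not specify, and it counts only the raw files; a content database also holds metadata and other structures, so this is an order-of-magnitude check and not a size model.

```python
KB = 1024          # assumption: binary units; the paper does not state which it uses
MB = 1024 * KB
GB = 1024 * MB

# (smallest file size, largest file size, number of files) for each Table 3 bucket.
BUCKETS = [
    (1 * KB, 500 * KB, 34240),
    (500 * KB, 1 * MB, 5223),
    (1 * MB, 10 * MB, 13003),
    (10 * MB, 70 * MB, 125),
]


def total_files(buckets):
    """Total number of files across all size buckets."""
    return sum(count for _, _, count in buckets)


def size_bounds_gb(buckets):
    """Lower and upper bound on the raw data set size, in GB."""
    low = sum(smallest * count for smallest, _, count in buckets) / GB
    high = sum(largest * count for _, largest, count in buckets) / GB
    return low, high


if __name__ == "__main__":
    low, high = size_bounds_gb(BUCKETS)
    print(f"{total_files(BUCKETS)} files, between {low:.1f} GB and {high:.1f} GB")
```

The content database size of approximately 53GB reported for this study falls inside the range these buckets permit.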
The aggregated SharePoint content database size was approximately 53GB. Over the duration of the load tests, this content database grew by almost 20%. This performance study involved load testing of an out-of-the-box SharePoint deployment using the test mix shown in Table 2. A full content crawl was performed once at the beginning of the load tests. No subsequent crawls were performed during or after the load tests.

Two metrics, disk I/O and memory, were used to characterize SQL Server performance in a SharePoint deployment. For disk I/O characterization, SQL Server memory was restricted to 2GB, and SharePoint load testing was performed with increasing user loads. The farm average response time was monitored during this process, and the disk back end was upgraded as required.

For memory characterization, the disk back end was upgraded to 14 disks in a RAID 10 configuration, and load testing was performed with increasing user loads. SQL Server memory was increased in increments of 2GB whenever a bottleneck caused the average farm response time to go beyond one second.
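The stepped ramp described in this section can be sketched as a simple loop. In the Python below, `measure` is a hypothetical callable standing in for one complete load test iteration at a given user load that returns the average farm response time in seconds. The `max_users` safety cap and the `synthetic_measure` stand-in are assumptions made for illustration; neither comes from the study, and the stand-in does not represent measured data.

```python
from typing import Callable, List, Tuple


def ramp_until_bottleneck(
    measure: Callable[[int], float],
    start_users: int = 500,
    step: int = 500,
    limit_seconds: float = 1.0,
    max_users: int = 20000,
) -> List[Tuple[int, float]]:
    """Run load iterations with growing user counts until the limit is exceeded."""
    results: List[Tuple[int, float]] = []
    users = start_users
    while users <= max_users:
        response = measure(users)  # one full load test iteration at this user load
        results.append((users, response))
        if response > limit_seconds:
            break  # bottleneck reached: stop ramping and upgrade the resource
        users += step
    return results


def synthetic_measure(users: int) -> float:
    """Placeholder for a real load test run; NOT measured data."""
    return users / 1800.0


if __name__ == "__main__":
    for users, response in ramp_until_bottleneck(synthetic_measure):
        print(f"{users} users -> {response:.2f} s")
```

In the study, the step taken after the loop stops depends on the characterization being run: adding disks for the disk I/O tests, or adding 2GB of SQL Server memory for the memory tests.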
Experimental Design

This section provides a detailed discussion of the results and analysis for the test strategy described earlier. This information should help in understanding the load exerted by the SharePoint farm, based on the usage profile and the user load, and may be helpful in the preliminary sizing of a SharePoint farm deployment.

SharePoint workloads were executed on a SharePoint farm using Visual Studio Team Suite and recorded web tests, with varied user loads. The heavy collaboration scenario was analyzed for this purpose. The detailed configuration of the SharePoint farm is shown in Table 4.
Table 4. SharePoint Farm Configuration

Server                          Machine
SQL Server Database Server      Dell™ PowerEdge™ R710
SharePoint App Server           PowerEdge R610
SharePoint Web Front End 1      PowerEdge R610
SharePoint Web Front End 2      PowerEdge R610
SharePoint Web Front End 3      PowerEdge R610
All three web front-end servers were clustered using the Network Load Balancing (NLB) feature. More information on NLB may be found here: http://technet.microsoft.com/en-us/library/bb742455.aspx. The content database files and the tempdb database files were placed on separate sets of disks so that the database components could be analyzed individually.
The detailed configuration of the SQL Server database is provided in Table 5.
Table 5. SQL Server Configuration

Components                    Details

Hardware
    Server                    Model: Dell PowerEdge R710
                              Processor: 2 x quad-core Intel® Xeon® E5530 @ 2.40GHz, 8MB L3
                              Memory: 16GB (8 x 2GB RDIMM, 1067MHz)
                              NOTE: SQL Server memory was restricted to 2GB to redirect most of the database requests to the storage.
    Storage                   Model: Dell EMC® CX4-120
                              Hard drives: 146GB 15k SAS drives
                              Flare version: 3.26.040.5.025
    Network Interface Cards   Broadcom Teamed NIC (2 x Broadcom BCM5709C NetXtreme II GigE (NDIS VBD Client))
                              Driver: 5.0.13

Software
    Operating System          Microsoft Windows Server® 2008 R2 Enterprise Edition
    Database                  Microsoft SQL Server 2008 R2 x64
An overall farm response time of approximately one second was used as the performance threshold for determining the maximum database load that stays within acceptable limits. When the farm reached the one-second response time limit, the content database disks were expanded to accommodate more load. SQL Server memory was restricted to 2GB in order to push as many requests as possible to the disks. Throughout the test period, the overall utilization of the web front-end servers was monitored and verified not to exceed 50% of system capacity.
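The paper does not state how the SQL Server memory limit was applied. One common approach is the `max server memory (MB)` server configuration option set through `sp_configure`, and the Python sketch below only builds the corresponding T-SQL batch as a string. The helper name `max_server_memory_batch` is illustrative, and the use of this option is an assumption, not a description of Dell's procedure.

```python
def max_server_memory_batch(limit_mb: int) -> str:
    """Build a T-SQL batch that caps SQL Server memory at limit_mb megabytes.

    Assumption: the cap is applied with the 'max server memory (MB)' option,
    which is an advanced option and so must first be made visible.
    """
    if limit_mb <= 0:
        raise ValueError("limit_mb must be a positive number of megabytes")
    return "\n".join(
        [
            "EXEC sp_configure 'show advanced options', 1;",
            "RECONFIGURE;",
            f"EXEC sp_configure 'max server memory (MB)', {limit_mb};",
            "RECONFIGURE;",
        ]
    )


if __name__ == "__main__":
    # 2GB cap, matching the restriction noted in Table 5.
    print(max_server_memory_batch(2 * 1024))
```

The generated batch would be run against the database server by an administrator. Raising the cap in 2GB steps for the memory characterization would mean calling the helper with 4096, 6144, and so on.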
Test Results and Analysis

As mentioned earlier, to understand the disk I/O characteristics, the database disks were deployed on RAID 10 volumes consisting of varying numbers of physical disks. For I/O characterization, an initial two-disk configuration was deployed. Throughout the disk I/O characterization testing, SQL Server was restricted to use only 2GB of physical memory. This restriction forced most I/O requests to the disk and, hence, increased the load on the disk I/O subsystem.
Table 6. Disk I/O Performance

Number of Disks    Maximum Concurrent User Load    Average Farm Response Time (s)
2                  1500                            0.96
2                  2000                            1.04
2                  2500                            1.1
14                 3000                            0.58
14                 4000                            2.73
As shown in Table 6, a two-disk configuration could support up to 2500 concurrent users with an average farm response time of approximately one second. With the goal of restricting the average farm response time to below one second, a 14-disk configuration was tested. Adding more disks to the database back end supported up to 3000 concurrent users. Pushing the user load beyond 3000 concurrent users resulted in a farm response time higher than one second, because the 2GB of memory allocated to SQL Server started to become a bottleneck.

For memory characterization, the SQL Server back end and the SharePoint content database were placed on a 14-disk RAID volume. With a constant disk back end, the SQL Server memory was increased in increments of 2GB to find the maximum concurrent user load supported by the SQL database back end.
Table 7. Memory Performance

Number of Disks    SQL Server Memory    Maximum Concurrent User Load    Average Farm Response Time (s)
14                 2GB                  3000                            0.58
14                 4GB                  4000                            0.35
14                 4GB                  5000                            0.67
14                 6GB                  6000                            0.79
As shown in Table 7, the test runs were started with the SQL Server memory restricted to 2GB. This configuration, with a 14-disk back end for the content database, could scale up to 3000 concurrent users. The SQL Server memory was then scaled up to 6GB to support 6000 concurrent users. This behavior suggests that SQL Server memory alone can be scaled up to support increased user loads. However, this will not hold indefinitely, because at some point the web front-end servers will become the bottleneck.
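The reading of Tables 6 and 7 can be reproduced with a small sketch. The Python below takes the rows from both tables and reports, for each configuration, the highest tested user load whose average farm response time stayed strictly below the one-second goal. The function name and the strict comparison are illustrative choices and are not part of the study's tooling.

```python
# (number of disks, concurrent users, average farm response time in seconds), Table 6.
TABLE_6 = [
    (2, 1500, 0.96),
    (2, 2000, 1.04),
    (2, 2500, 1.10),
    (14, 3000, 0.58),
    (14, 4000, 2.73),
]

# (SQL Server memory in GB, concurrent users, average farm response time), Table 7.
TABLE_7 = [
    (2, 3000, 0.58),
    (4, 4000, 0.35),
    (4, 5000, 0.67),
    (6, 6000, 0.79),
]


def max_load_within_goal(rows, goal_seconds=1.0):
    """Map each configuration to the highest tested load that met the goal."""
    best = {}
    for config, users, response in rows:
        if response < goal_seconds:
            best[config] = max(users, best.get(config, 0))
    return best


if __name__ == "__main__":
    print("by disk count:", max_load_within_goal(TABLE_6))
    print("by memory (GB):", max_load_within_goal(TABLE_7))
```

Under a strict one-second cutoff, the two-disk configuration tops out at 1500 users and not the 2500 quoted for "approximately one second", which shows how sensitive the headline numbers are to where the threshold is drawn.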
Taken together, the performance data in Tables 6 and 7 show a clear pattern in which the SQL Server memory, and not the underlying disk back end used to store SQL content, can become the bottleneck. SharePoint requests, once translated into database requests, are simply reads from and writes to the database files. The database delivers the best it can, based on the number of requests from the application. As the number of application requests (and, in turn, database requests) exceeds the maximum capability of the database, the application's performance starts to suffer. The key criterion is to have the SharePoint database sized optimally to meet the expected peak user load scenarios. Database memory is a key factor when planning for adequate capacity to support existing and future user loads.

Another important point is that the trend and the performance parameters are heavily influenced by the data set used during the test period. For example, uploading a 5MB document consumes more resources than uploading a 100KB document. The aggregate amount of data operated on at any point in time should be considered when sizing the database resources, taking into account the maximum user load expected. Suppose a total of 8000 users are active on the farm during a particular period. If a majority of those users perform heavyweight activities such as document and image upload or download, the farm resources may be heavily consumed, which may affect overall farm performance.
Note: With unrestricted memory and enough physical disks for the database, backed by sufficient processor and network capacity, the database response time can be expected to remain consistent even as the user load grows substantially. To verify this expectation, a number of Collaboration test iterations were run with the following configuration:

SQL Server memory: unrestricted (16GB of server RAM)
Number of physical disks hosting the content database: 14 (RAID 10)

The content database response time at varied user loads with this configuration is shown in Table 8.
Table 8. User Load Versus Content Database Response Time

User Load    Database Response Time (s)
8000         0.01
10000        0.01
12000        0.012
Conclusion

SharePoint allows organizations to store and manipulate unstructured data with great ease and flexibility. The performance of a SharePoint farm depends heavily on the working data set of all the users accessing SharePoint at any point in time, which requires the content database to be sized and implemented optimally to meet organizational requirements. In the experiments conducted with a working data set of about 53GB, adding an effective physical disk pair in RAID 10 to the content database back end increased the farm's capacity by an additional 1000 users. This data may be helpful in sizing the content database back end based on the expected user load for the organization. However, an oversized disk back end combined with an undersized or restricted SQL Server memory configuration will still result in poor farm performance.
References

SharePoint Server Home Page: http://office.microsoft.com/en-us/sharepointserver/default.aspx
Dell SharePoint Solutions: www.dell.com/sharepoint
Dell SQL Server 2008 Solutions: www.dell.com/sql2008
Windows Server 2008 R2: www.dell.com/microsoft
Microsoft Tech Blogs, SHAREPOINT Performance Counters: http://blogs.msdn.com/ketaanhs/archive/2010/03/13/moss-performance-counters.aspx
Network Load Balancing Technical Overview: http://technet.microsoft.com/en-us/library/bb742455.aspx
How Network Load Balancing Works: http://www.isaserver.org/tutorials/basicnlbpart1.html