A Balanced Approach to Optimizing Storage Performance in the Data Center

How a mix of HDDs and SSDs can maximize the performance of your workloads

October 19, 2015

Mark T. Chapman Lenovo.com/systems

Lenovo White Papers

A Balanced Approach for Optimizing Workload Performance in the Data Center

Executive Overview

Not long ago, data centers had but one option for real-time storage of data: hard disk drives. The only real questions were about which capacities, speeds, form-factors, and interfaces to use. HDDs were, and continue to be, the workhorses of most data centers. There are many workloads that perform extremely well with all-HDD arrays; however, the all-HDD data center is quickly becoming a thing of the past. Solid-state drives (SSDs) emerged a few years ago and have been accepted in many data centers. This is because SSDs offer benefits for specific workloads that HDDs cannot match. On the other hand, SSDs have limitations of their own and are not a panacea for all data center storage woes.

There is a vast array of storage technologies available, offering a wide range of performance and pricing across all of the various products, and many ways in which to implement a storage solution. So how can IT personnel decide what is the best approach to use in their data center: all-HDD, all-SSD, or a hybrid tiered combination of the two? And if the latter, how should the combination be configured for a data center’s workloads?

This paper highlights the advantages and disadvantages of each type of storage, in terms of cost, performance, capacity, and endurance. It also provides an overview of how storage can be tiered for best performance and the workloads that make the best use of SSDs and a hybrid storage infrastructure.

Note: Except where specified otherwise, the information in this paper refers to technologies in general and not to particular products or vendors.


An overview of the advantages / limitations of different storage media

Each type of storage media has its strengths and weaknesses.¹ Much or all of this section may already be familiar to many readers, but it is presented for those to whom it is new.

HDDs are a well-known quantity. They have been used effectively with virtually every workload for decades. Therefore, this paper does not describe all the ways in which HDDs can be used; it primarily covers their pros and cons. However, because SSDs are a relatively new technology to many people, we will describe the various types of SSDs and their respective strengths and weaknesses.

In the following sections, we will present ways in which HDDs and SSDs can be used together in a data center to maximize the benefits of each type of storage and offset their respective limitations.

Hard disk drives

HDDs are available in many combinations of speeds, interfaces, form-factors, and capacities. As a class (meaning taking all types of HDDs into account), HDDs offer a number of advantages that meet nearly all requirements for storage:

• High-density storage capacity (8TB or more) using 3.5-inch “nearline” storage
• Relatively low cost per-drive and per-gigabyte of storage (especially nearline drives)
• A variety of interface types, including 2Gb to 16Gb Fibre Channel (FC), 6Gb and 12Gb SAS, and 6Gb SATA
• Fast, balanced read/write performance, using 15k and 10k drives (especially 12Gb SAS and 16Gb FC drives)
• Redundancy and high availability, using RAID and hot-swap drives
• Ease of installation and replacement, using hot-swap drives
• Self-encrypting drives (SEDs) are available to provide strong data security

There also are disadvantages to using HDDs:

• Large numbers of HDDs in a data center create a lot of heat, require a large amount of energy (for power and cooling), and create quite a bit of noise.
• A performance bottleneck can occur for certain applications that require high read throughput, low latency, or high IOPS (I/O operations per second). The drives may not be able to keep up with the demand.
• Uneven failure rate. Although a category of drives (for example, 3.5-inch 10k) has a predictable failure rate, individual drives can fail unexpectedly much sooner than average.
• Long rebuild times for RAID-5 arrays after a drive failure.
• HDD throughput is not increasing as quickly as HDD capacities, creating the potential for further bottlenecks in the future.

And others.

¹ The advantages and disadvantages presented here may be drive-specific. Not all advantages and disadvantages apply to all drives.

Many data centers use a two- or three-tier design to work around many of these limitations. This structured approach uses a number of 15k SAS HDDs to store performance-critical (“hot”) data. To reduce cost, the majority of the data is stored on 10k SAS or SATA HDDs in a second (“warm”) tier, where performance is not as critical. An optional third tier holds the infrequently accessed (archival) or noncritical “cold” data on high-capacity nearline 7200 rpm SATA drives, to further reduce cost. Tier-3 data is typically accessed sequentially, in large blocks, so high random read/write performance is unnecessary.

This approach is effective for many data centers. However, for some workloads even this is not fast enough. Other strategies incorporate SSDs into the design to create a hybrid storage infrastructure. (See Creating a hybrid storage infrastructure using HDDs and SSDs, below.)
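The three-tier HDD design described above can be sketched in a few lines of Python. This is only an illustration: the access-count thresholds, the Dataset fields, and the function name assign_tier are assumptions made for the example and do not come from the paper, which defines the tiers only qualitatively (hot, warm, cold).

```python
from dataclasses import dataclass

# Drive classes per tier, as described in the text.
TIERS = {
    "hot": "15k SAS HDD",
    "warm": "10k SAS or SATA HDD",
    "cold": "7200 rpm nearline SATA HDD",
}

# Hypothetical thresholds (accesses per day); the paper gives no numbers.
HOT_MIN_ACCESSES_PER_DAY = 1000
WARM_MIN_ACCESSES_PER_DAY = 10


@dataclass
class Dataset:
    name: str
    accesses_per_day: float
    performance_critical: bool = False


def assign_tier(ds: Dataset) -> str:
    """Return 'hot', 'warm', or 'cold' for a dataset."""
    if ds.performance_critical or ds.accesses_per_day >= HOT_MIN_ACCESSES_PER_DAY:
        return "hot"
    if ds.accesses_per_day >= WARM_MIN_ACCESSES_PER_DAY:
        return "warm"
    return "cold"


if __name__ == "__main__":
    examples = [
        Dataset("order-db", 50_000),
        Dataset("file-shares", 200),
        Dataset("2012-archive", 0.1),
        Dataset("billing-journal", 5, performance_critical=True),
    ]
    for ds in examples:
        tier = assign_tier(ds)
        print(f"{ds.name}: {tier} tier ({TIERS[tier]})")
```

In practice the placement decision would likely be driven by measured I/O patterns rather than fixed thresholds; the sketch only shows the shape of the policy.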

Solid-state drives

Typically, SSDs look to an OS exactly like an HDD, requiring no learning curve or software changes² to implement SSDs in a data center. In general, an all-SSD array can replace an all-HDD array directly, especially for read-intense workloads. However, for a number of reasons it may be more advantageous to implement a hybrid tiered infrastructure for the majority of workloads, to make the most effective use of both SSDs and HDDs.

SSDs are available in different technologies, each with its own strengths and weaknesses. By far, the most common type of flash memory used in enterprise SSD products is NAND, which is available in two classes: SLC (single-level cell) and MLC (multi-level cell). SLC-based SSDs store one bit per memory cell, as opposed to MLC-based drives, which store 2, 3, or more bits per cell. These extra bits multiply the capacity of the drives compared to SLC drives.

SLC drives are faster and have longer write lifecycles than MLC drives. (Every NAND cell can be written to only a known, finite number of times before it wears out.) On the other hand, MLC drives have greater storage capacity and lower cost per gigabyte, but are somewhat slower and have shorter lifecycles than SLC. In fact, as MLC capacity grows the lifecycle shrinks.

Ironically, in some cases SSDs are too fast, in the sense that they can overload a 6Gb SAS interface. This can mean a server is unable to take advantage of all the throughput available from the SSDs attached to it. Fortunately, the faster 12Gb SAS and 10Gb or 16Gb FC interfaces can overcome this limitation in most configurations. Yet, some SSD environments can produce enough throughput to overload even these pipelines.

Relief in that regard is on the way. The first NVMe (Non-Volatile Memory Express) controllers should be rolling out in late 2015. These connect a drive directly to a server’s PCIe bus, rather than through a SAS or FC interface.³ This promises significantly higher bandwidth for storage transfer. How much higher? A typical SATA connection tops out at about 500 MBps throughput, and a SAS connection at 1.5 GBps. An NVMe connection on a PCIe bus can theoretically support almost 4 GBps, using a x4 PCIe 3.0 connection (or 2 GBps for PCIe 2.0), and 2x or 4x that much for a x8 or x16 PCIe channel, respectively. (Naturally, implementations may vary by server and by vendor.)

Some workloads require the fastest possible read performance. Many data centers use SSDs to cache certain types of files, including logs, journals, temporary tables and hot tables, instead of using server memory. (This frees up that memory for application use.) Other data centers cache all of a workload’s read-intensive files, but keep the write-intensive files on HDDs in a lower tier. In addition, fast, low-capacity SSDs are often used as OS boot drives, due to the high read performance and minimal writes required.

² There are exceptions, where applications may need to be changed to point to different SAN addresses.

³ NVMe-connected SSDs should not be confused with PCIe SSDs, which are mounted on adapters in PCIe slots. NVMe SSDs are installed in drive bays, like HDDs and most SSDs.
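To put the interface figures quoted above side by side, the following Python sketch computes an idealized transfer time for each connection type. The throughput constants are the approximate peaks given in the text; the linear scaling from x4 to x8 and x16, the use of decimal units (1 GB = 1000 MB), the 100 GB payload, and the function names are assumptions for illustration. Real-world throughput will be lower because of protocol and device overheads.

```python
# Approximate peak throughput quoted in the text, in MB per second.
SATA_MBPS = 500
SAS_MBPS = 1500
NVME_X4_PCIE3_MBPS = 4000  # "almost 4 GBps", treated here as an upper bound
NVME_X4_PCIE2_MBPS = 2000


def scaled_pcie_mbps(x4_mbps: float, lanes: int) -> float:
    """Scale an x4 figure linearly to x4, x8, or x16 (x8 = 2x, x16 = 4x)."""
    if lanes not in (4, 8, 16):
        raise ValueError("lanes must be 4, 8, or 16")
    return x4_mbps * lanes / 4


def transfer_seconds(size_gb: float, mbps: float) -> float:
    """Ideal time to move size_gb gigabytes, ignoring all overheads."""
    return size_gb * 1000 / mbps


if __name__ == "__main__":
    payload_gb = 100  # hypothetical payload size
    links = {
        "SATA": SATA_MBPS,
        "SAS": SAS_MBPS,
        "NVMe x4 PCIe 2.0": NVME_X4_PCIE2_MBPS,
        "NVMe x4 PCIe 3.0": NVME_X4_PCIE3_MBPS,
        "NVMe x8 PCIe 3.0": scaled_pcie_mbps(NVME_X4_PCIE3_MBPS, 8),
        "NVMe x16 PCIe 3.0": scaled_pcie_mbps(NVME_X4_PCIE3_MBPS, 16),
    }
    for name, mbps in links.items():
        secs = transfer_seconds(payload_gb, mbps)
        print(f"{name}: {mbps:.0f} MBps, {secs:.1f} s per {payload_gb} GB")
```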


In addition, SSDs can be used as the first tier of a two-, three-, or even four-tier hybrid strategy. (See Creating a hybrid storage infrastructure using HDDs and SSDs, below.)

As a class, SSDs offer a number of advantages that can best satisfy the storage requirements for read-intensive applications that stress read performance, low latency, and high IOPS. SSDs can also be used in conjunction with server DRAM as a means of supporting larger in-memory databases with extremely high performance. SSD advantages include:

• Extremely high read performance.
• Extremely high IOPS performance.
• Extremely low latency.
• Extremely low cost-per-IOPS.
• Low energy consumption and low heat output for active drives; idle SSDs consume almost no energy and produce almost no heat.
• Silent operation; there are no moving parts to make noise.
• Some SSDs offer high capacity (up to 3.84TB) combined with much higher IOPS than HDDs.
• Redundancy and high availability, using RAID and hot-swap drives.
• Extremely fast rebuild times for RAID-5 arrays.
• Ease of installation and replacement, using hot-swap drives.
• Self-encrypting drives (SEDs) are available to provide strong data security.
• Predictable failure rate. Using an SSD’s wear-tracking software, a user can see approximately when a drive is likely to fail (essentially, wearing out more than an acceptable number of memory cells) in time to take preventive action.
• A new generation (2015) of enterprise-class SSDs provides higher capacities than 10k and 15k enterprise HDDs, with better cost-per-gigabyte than 15k SAS HDDs. (For example, the recently announced 2.5-inch hot-swap Lenovo 3.84TB 6Gb SAS Enterprise-Capacity SSD has a list price that is approximately 10.5x the cost of Lenovo’s 300GB 15k 6Gb 2.5-inch Hot-Swap SAS HDD, but provides almost 13x the capacity.⁴)

⁴ Prices current as of October 19, 2015. Source: www.shop.lenovo.com.
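The cost-per-gigabyte comparison in the example above can be checked with a short calculation. The sketch below uses only the figures given in the text (a price ratio of roughly 10.5x, and capacities of 3.84TB and 300GB); the function name and the use of decimal gigabytes are assumptions for illustration. Because the text gives only a price ratio, not absolute list prices, the result is a relative figure.

```python
# Figures from the text: the SSD costs about 10.5x the HDD's list price.
PRICE_RATIO = 10.5
SSD_CAPACITY_GB = 3840  # 3.84TB, decimal units assumed
HDD_CAPACITY_GB = 300


def relative_cost_per_gb(price_ratio: float, ssd_gb: float, hdd_gb: float) -> float:
    """Return SSD cost-per-GB divided by HDD cost-per-GB."""
    capacity_ratio = ssd_gb / hdd_gb
    return price_ratio / capacity_ratio


if __name__ == "__main__":
    capacity_ratio = SSD_CAPACITY_GB / HDD_CAPACITY_GB
    ratio = relative_cost_per_gb(PRICE_RATIO, SSD_CAPACITY_GB, HDD_CAPACITY_GB)
    print(f"Capacity ratio: {capacity_ratio:.1f}x")
    print(f"SSD cost per GB relative to the 15k HDD: {ratio:.2f}")
```

With these inputs the capacity ratio is 12.8x and the relative cost per gigabyte is about 0.82, meaning the SSD in this example costs roughly 18% less per gigabyte than the 15k HDD.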


There also are disadvantages to using SSDs:

• On a cost-per-drive or cost-per-gigabyte basis, most SSDs are still significantly more expensive than HDDs.
• Due to the high throughput of SSDs, 1Gb Ethernet may not be fast enough to prevent bottlenecks. Upgrading to 10GbE increases cost. 2Gb and 4Gb Fibre Channel likewise may require replacement with 8Gb, 10Gb, or 16Gb FC.
• Most SSDs provide relatively low capacity (
