Esos Lab. Seminar Template - Dmclab.hanyang.ac.kr

Mass Storage Systems and Technologies (MSST), 2011 IEEE 27th Symposium on, MSST, 2011

Performance Modeling and Analysis of Flash-based Storage Devices
H. Howie Huang and Shan Li (George Washington University), Alex Szalay and Andreas Terzis (Johns Hopkins University)

September 25, 2012

Contents

- Introduction
- Background
- Black-box Performance Modeling for SSDs
- Evaluation
- Conclusion

Hanyang University (漢陽大學校)

2012-09-25

Esos Lab. SSD Seminar

Introduction

- Benefits of an accurate performance model
  - Understand the state of the art of SSDs
  - Research tool for exploring the design space
- Previous models for HDDs cannot account for the characteristics of SSDs
- Utilize a black-box modeling approach to analyze and evaluate SSD performance
  - Latency, bandwidth, throughput
- Construct black-box models using synthetic workloads and real-world traces on three SSDs, as well as an SSD RAID


Black-box Performance Modeling for SSDs

- Performance tends to be highly correlated with the workload characteristics
- A black-box model can be constructed in two steps
  - Benchmark an SSD and collect the training data
  - Use statistical methods to quantify the correlations
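As a rough illustration of the second step, the sketch below computes a Pearson correlation coefficient between one workload characteristic and one measured metric. Pearson correlation is an assumption chosen for illustration; the slides do not name a specific statistic at this point. The sample values are invented placeholders, not measurements from any device.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Placeholder training samples (NOT measured data): write ratio in percent
# and a made-up latency in milliseconds for each hypothetical benchmark run.
write_ratio = [0, 25, 50, 75, 100]
latency_ms = [1.0, 1.5, 2.0, 2.5, 3.0]

r = pearson(write_ratio, latency_ms)
print(f"correlation between write ratio and latency: {r:.3f}")
```

A coefficient near +1 or -1 suggests the characteristic is worth keeping as a model input; a value near 0 suggests little linear relationship.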


Basic Model

$p = f(wc)$

- wc: workload characteristics
- p: predicted performance metric
- f: performance function

- A workload is defined as a stream of I/O requests
- An HDD workload can be characterized by
  - Read-write ratio: percentage of writes among the requests
  - Request size: number of bytes transferred to/from the storage device
  - Queue depth: number of outstanding I/Os
  - Request randomness: percentage of random accesses in the I/O request stream
- Focus on three performance metrics
  - Latency, bandwidth, and throughput in I/Os per second
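The workload characteristics above can be pictured as a small record type. The sketch below is only one possible representation; the class name `Workload`, its field names, and the helper `bandwidth_mb_per_s` are hypothetical and do not come from the slides. The helper relies on the standard relation that, for fixed-size requests, bandwidth equals throughput (IOPS) times request size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical container for the four basic-model characteristics."""
    write_ratio: float   # percentage of writes, 0..100
    request_size: int    # bytes transferred per request
    queue_depth: int     # number of outstanding I/Os
    randomness: float    # percentage of random accesses, 0..100

    def __post_init__(self):
        for name in ("write_ratio", "randomness"):
            value = getattr(self, name)
            if not 0 <= value <= 100:
                raise ValueError(f"{name} must be a percentage, got {value}")
        if self.request_size <= 0 or self.queue_depth <= 0:
            raise ValueError("request_size and queue_depth must be positive")


def bandwidth_mb_per_s(iops, request_size):
    """Bandwidth in MB/s (10^6 bytes) for fixed-size requests at a given IOPS."""
    return iops * request_size / 1e6


# Example: a read-only, fully random workload with 256 KiB requests.
wc = Workload(write_ratio=0, request_size=256 * 1024, queue_depth=1, randomness=100)
print(wc)
print(bandwidth_mb_per_s(1000, wc.request_size), "MB/s at 1000 IOPS")
```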


Experiment Setup

- Run experiments on machines with an Intel Core 2 Duo 2.93 GHz, 4 GB memory, and Linux kernel 2.6
- Test on three SSDs, one hard drive, and an SSD RAID
- Training data is generated by a synthetic I/O workload generator
  - Intel storage toolkit
  - Each I/O workload is run for one minute
  - For each device, we run 12,000 one-minute workloads
    - In total these take about 200 hours (about 8 days) to complete
- Evaluate with synthetic I/O requests and four real-world traces from OLTP applications and a web search engine
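The 200-hour figure follows directly from the per-device workload count. A quick check of the arithmetic, using only the numbers stated on the slide:

```python
WORKLOADS_PER_DEVICE = 12_000   # from the slide
MINUTES_PER_WORKLOAD = 1        # each workload runs for one minute

total_minutes = WORKLOADS_PER_DEVICE * MINUTES_PER_WORKLOAD
hours = total_minutes / 60      # minutes -> hours
days = hours / 24               # hours -> days

print(f"{hours:.0f} hours, about {days:.1f} days per device")
```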


Basic Model

- Impact of the write ratio and queue depth on the latency, bandwidth, and throughput of the SSD
- Default values when a parameter is not being analyzed
  - Write ratio: 0% for read, or 100% for write
  - Queue depth: 1
  - Read size: 256KB
  - Write size: 256KB
  - All tests use 100% randomness for both writes and reads
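One way to read the defaults above is as a one-parameter-at-a-time sweep: the parameter under study varies while every other parameter stays at its default. The sketch below shows that idea. The dictionary keys, the `sweep` helper, and the list of queue depths are hypothetical; only the default values themselves come from the slide.

```python
# Default values taken from the slide; key names are illustrative.
DEFAULTS = {
    "write_ratio": 0,       # percent; the slide uses 0 (reads) or 100 (writes)
    "queue_depth": 1,
    "read_size_kb": 256,
    "write_size_kb": 256,
    "randomness": 100,      # percent, for both reads and writes
}


def sweep(parameter, values, defaults=DEFAULTS):
    """Return one configuration per value, varying only `parameter`."""
    if parameter not in defaults:
        raise KeyError(f"unknown parameter: {parameter}")
    return [{**defaults, parameter: value} for value in values]


# Hypothetical queue depths to test; the slides do not list the exact values.
configs = sweep("queue_depth", [1, 2, 4, 8, 16, 32])
for config in configs:
    print(config)
```

Because each configuration is a fresh dictionary, the defaults themselves are never modified by a sweep.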


Basic Model

- Impact of write ratio on latency, bandwidth, and throughput
  - SSDs outperform the HDD, especially when dealing with many read requests
- Write ratio has a large influence on all three performance metrics
  - As writes increase in the workload
    - latency increases
    - bandwidth and throughput decrease
- When there are more writes, the HDD outperforms some SSDs, such as SSD_S (Samsung)

Figure annotations: 150Mb/s, 10ms, 20MB/s, 2ms


Basic Model

- Impact of queue depth on latency, bandwidth, and throughput
- Queue depth significantly affects latency
- But its impact on bandwidth and throughput is minimal




Extended Model

- Consider additional parameters to improve model accuracy
  - Read and write stride: effect of request alignment
  - Read and write size: SSDs are asymmetric in read/write performance
  - Read and write randomness


Extended Model

- Write size against latency, bandwidth, and throughput
  - As the size of writes/reads increases, latency increases


Extended Model

- Impact of access patterns on latency
  - When the write size changes under a random access pattern, SSD_S suffers performance degradation


Regression Tree

- Apply statistical machine learning algorithms
- Use the least-squares approach to fit the linear model
- Construct a regression tree from the function
  - Recursively split the input variables into leaf nodes to minimize the mean squared error
  - Leaf nodes provide the predicted values for dependent variables as a constant function of independent variables
  - Best split: minimizes the mean squared error among all training data at the leaf nodes

W.-Y. Loh, “GUIDE regression tree version 7.9,” 2009


Figure: Prediction of the bandwidth as a function of the workload characteristics that are listed on the path
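The recursive splitting described on this slide can be sketched with a small greedy tree builder. This is a generic least-squares split search with constant-valued leaves, written for illustration; it is not the GUIDE algorithm cited on the slide, which selects splits differently. The function names and the training rows are hypothetical, and the bandwidth values are invented placeholders rather than measurements.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, targets):
    """Return the (feature, threshold) that most reduces squared error, or None."""
    best, best_error = None, sse(targets)
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [t for row, t in zip(rows, targets) if row[feature] <= threshold]
            right = [t for row, t in zip(rows, targets) if row[feature] > threshold]
            if not left or not right:
                continue  # a split must put data on both sides
            error = sse(left) + sse(right)
            if error < best_error:
                best, best_error = (feature, threshold), error
    return best


def build_tree(rows, targets, max_depth=3):
    """Build a tree of (feature, threshold, left, right) tuples with float leaves."""
    split = best_split(rows, targets) if max_depth > 0 else None
    if split is None:
        return sum(targets) / len(targets)  # leaf: constant prediction
    feature, threshold = split
    left = [i for i, row in enumerate(rows) if row[feature] <= threshold]
    right = [i for i, row in enumerate(rows) if row[feature] > threshold]
    return (
        feature,
        threshold,
        build_tree([rows[i] for i in left], [targets[i] for i in left], max_depth - 1),
        build_tree([rows[i] for i in right], [targets[i] for i in right], max_depth - 1),
    )


def predict(tree, row):
    """Follow the path for `row` down to a leaf and return its constant value."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if row[feature] <= threshold else right
    return tree


# Placeholder data: each row is [write ratio %, queue depth]; targets are
# made-up bandwidth values in MB/s.
rows = [[0, 1], [0, 8], [100, 1], [100, 8]]
bandwidth = [200.0, 210.0, 50.0, 60.0]

tree = build_tree(rows, bandwidth, max_depth=1)
print(tree)
print(predict(tree, [0, 4]), predict(tree, [100, 4]))
```

With these placeholder values the first split falls on the write ratio, because separating reads from writes removes far more squared error than splitting on queue depth.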

Evaluation Metric

Here $p$ is the observed performance and $p'$ the predicted performance.

- Mean Absolute Error (MAE): mean of $|p - p'|$
  - Difference between the observed and predicted performance
- Mean Relative Error (MRE): mean of $|(p - p') / p|$
  - Ratio between the absolute error and the observed performance
- $R^2 = 1 - SSE/SST$ is used to determine how well the performance is likely to be predicted by the model
  - SSE: sum of squares due to error, $\sum_i (p_i - p'_i)^2$
  - SST: total sum of squares, $\sum_i (p_i - \bar{p})^2$, where $\bar{p}$ is the mean observed performance
  - $R^2$: goodness-of-fit measure
- A better model has
  - Smaller MAE and MRE
  - $R^2$ close to 1
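The three metrics can be computed in a few lines. The sketch below uses the standard definitions, with SST taken around the mean of the observed values. The function names are illustrative and the sample numbers are invented placeholders, not results from the paper.

```python
def mean_absolute_error(observed, predicted):
    """MAE: average of |p - p'|."""
    return sum(abs(p - q) for p, q in zip(observed, predicted)) / len(observed)


def mean_relative_error(observed, predicted):
    """MRE: average of |(p - p') / p|; observed values must be non-zero."""
    return sum(abs((p - q) / p) for p, q in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """R^2 = 1 - SSE/SST, with SST measured around the observed mean."""
    mean = sum(observed) / len(observed)
    sse = sum((p - q) ** 2 for p, q in zip(observed, predicted))
    sst = sum((p - mean) ** 2 for p in observed)
    return 1 - sse / sst


# Placeholder values (NOT measurements), e.g. bandwidth in MB/s.
observed = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]

mae = mean_absolute_error(observed, predicted)
mre = mean_relative_error(observed, predicted)
r2 = r_squared(observed, predicted)
print(f"MAE={mae:.3f}  MRE={mre:.3f}  R2={r2:.3f}")
```

MRE is often the easier number to compare across devices, since it is independent of the metric's unit and scale.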


Microbenchmarks

Table annotations: "Very high"; "HDD model does not predict well"; "Increased significantly"; "Better accuracy"

Model Improvements

- For MRE, all devices show improvements close to or above 60% for the three performance models
- For SSD_I there is an 80% improvement
- Large increases in $R^2$ values for SSD_I and SSD_S


Prediction Accuracy of Eight Workloads

- MRE values of rd_only and wr_only are between 0.4% and 7%
- Most MRE values are less than 10%

Conclusion

- Flash-based solid-state drives play an important role in today’s storage systems
- An accurate performance model will help
- A good black-box model can be constructed for SSDs
  - Regression tree
