Web Search Using Mobile Cores: Quantifying and ... - Semantic Scholar

Web Search Using Mobile Cores Quantifying and Mitigating the Price of Efficiency

Vijay Janapa Reddi

Benjamin Lee

Trishul Chilimbi

Kushagra Vaid

Engineering & Applied Science Harvard University

Electrical Engineering Stanford University

Runtime Analysis & Design Microsoft Research

Global Foundation Services Microsoft Corporation

International Symposium on Computer Architecture 22 June 2010

1

Conventional Wisdom ◦ Moore’s Law provides transistors ◦ Simple cores improve energy efficiency ◦ Parallelism recovers lost performance

2

Simple Cores ◦ Pursue aggregate throughput, energy efficiency ◦ Assume task parallelism ◦ Assume latency tolerance

3

Applications in Transition

• Conventional Enterprise ◦ Process independent requests ◦ Exhibit high memory, I/O intensity ◦ Ex: web, database, Java, mail, file servers

• Emerging Cloud ◦ Extract information, value from data ◦ Exhibit high compute intensity ◦ Ex: analytics, machine learning

4

Computational Intensity ◦ Microsoft Bing ranks pages with neural network ◦ RMS foreshadows future analytic workloads

5

Cloud Efficiency

• Challenges ◦ Migrate computation, data to cloud ◦ Choose efficient components ◦ Understand application, component interaction

• Case Study ◦ Mobile cores for efficiency, parallelism for performance? ◦ Achieve efficiency with mobile cores (Intel Atom) ◦ Quantify price of efficiency (Microsoft Bing)

6

Efficiency Atom is more energy, cost efficient than Xeon

Price of Efficiency Atom limitations impact latency, relevance, flexibility

Mitigating Price of Efficiency Atom over-provisioning should consider platform overheads

7




7

Search Architecture ◦ Rank pages using neural network ◦ Deploy on server (Xeon), mobile (Atom) processors

8

Processor Activity ◦ Compare Xeon (4-issue, OOO) and Atom (2-issue, IO) ◦ Measure µarch activity with hardware counters

9

Processor Power ◦ Compare Xeon (15W per core) and Atom (1.5W per core) ◦ Measure processor power at voltage regulator

10

Processor Efficiency ◦ Demonstrate energy, cost efficiency with Atom ◦ Measure max QPS within QoS target

11




12

Price of Efficiency • Latency ◦ Cut-off latency limits refinement opportunities ◦ Per query latency impacts quality-of-service

• Relevance ◦ Search rank orders documents ◦ Choice, ordering of results impact relevance

• Flexibility ◦ Query activity, complexity increase load ◦ Processor resources impact flexibility

13

Latency ◦ Atom increases latency average (µ) by 3× ◦ Atom increases latency variance (σ 2 )

14

Relevance ◦ Consider choice, ordering of top N documents ◦ Atom impacts relevance under all query loads

15

Flexibility ◦ Consider activity, complexity of queries ◦ Atom harms QoS for more complex queries

16

Mitigating Price of Efficiency




17



• Addressing Latency & Relevance ◦ Address µarchitectural limitations ◦ Integrate application-specific accelerators ◦ Manage heterogeneous servers

• Addressing Flexibility ◦ Over-provision Atoms ◦ Mitigate platform overheads ◦ Integrate more cores per chip

18


Platform Overheads ◦ Xeon: 4-core, 2-socket ◦ Atom: 2-core, 1-socket ⇒ Hyp-Atom: 8-core, 2-socket

19


Total Cost of Ownership (TCO) ◦ Pie slice shows breakdown of TCO $ ◦ Pie size shows throughput per TCO $

20


Case for Integration ◦ Hyp-Atom attributes more per TCO $ to servers ◦ Hyp-Atom achieves greater throughput per TCO $

21

Conclusion




22

Conclusion

Also in the paper ... • µarchitecture ◦ Processor activity from hardware counters ◦ µarchitectural bottlenecks

• Search ◦ Application phases in computation ◦ Execution time breakdown

• Mitigating Price of Efficiency ◦ µarchitectural enhancements ◦ Heterogeneous, accelerated processors

23

Conclusion

Conclusion • Emerging Cloud Applications ◦ Extract value from data ◦ Increase compute intensity

• Energy Efficiency ◦ Improve efficiency by 5× with mobile processors ◦ Exact price in latency, relevance, flexiblity

• Future Challenges ◦ Pursue efficiency given compute intensity ◦ Consider heterogeneous, accelerated processors

24

Web Search Using Mobile Cores Quantifying and Mitigating the Price of Efficiency

Vijay Janapa Reddi

Benjamin Lee

Trishul Chilimbi

Kushagra Vaid

Engineering & Applied Science Harvard University

Electrical Engineering Stanford University

Runtime Analysis & Design Microsoft Research

Global Foundation Services Microsoft Corporation

International Symposium on Computer Architecture 22 June 2010

25

Web Search Using Mobile Cores: Quantifying and ... - Semantic Scholar

Web Search Using Mobile Cores: Quantifying and ... - Semantic Scholar

Suggest Documents

Web Search Using Mobile Cores: Quantifying and ... - ISCA 2010 [PDF]

Web search using mobile cores: quantifying and ... - ACM Digital Library

Web Search Using Small Cores: Quantifying the Price ... - Duke People

Web Search - Semantic Scholar

Web Search - Semantic Scholar

Mobile Web Search Personalization Using ... - Computer Science

Granular Quantifying Traffic States Using Mobile ... - Semantic Scholar

Improving Web Site Search Using Web Server ... - Semantic Scholar

Enhancing mobile search using web search log data - Google Sites

exploratory search on the mobile web - Semantic Scholar

A Case Study on Semantic Web Search using ... - Semantic Scholar

Using BM25F for Semantic Search - Semantic Web

Using BM25F for Semantic Search - Semantic Web

Mobile and Dynamic Web Services - Semantic Scholar

Quantifying Web-Search Privacy - Reza Shokri

Semantic Search meets the Web - Semantic Scholar

Quantifying above-cloud aerosol using ... - Semantic Scholar

Quantifying Organismal Complexity using a ... - Semantic Scholar

Ranking Entities Using Web Search Query Logs - Semantic Scholar

Using Text-based Web Image Search Results ... - Semantic Scholar

Classifying Search Engine Queries Using the Web ... - Semantic Scholar

Disambiguation of People in Web Search Using a ... - Semantic Scholar

Using Web-based Search Data to Predict - Semantic Scholar

Using Web Search Engines to Improve Text ... - Semantic Scholar