Thermal Challenges in Next Generation Electronic Systems, Joshi & Garimella (eds) © 2002 Millpress, Rotterdam, ISBN 90-77017-03-8
Towards planetary scale computing
Technical challenges for next generation Internet computing

Rich Friedrich & Chandrakant Patel
Hewlett-Packard Laboratories, Palo Alto, CA, USA
EXTENDED ABSTRACT: In the not-too-distant future, billions of people, places and things could all be connected to each other and to useful services through the Internet. In this world, scalable, cost-effective information technology capabilities will need to be provisioned as a service, delivered as a service, metered and managed as a service, and purchased as a service. HP refers to this world as service-centric computing. Consequently, processing and storage will be accessible via a utility where customers pay for what they need, when they need it. This processing and storage utility will become as ubiquitous as electrical and water utilities are today. Within HP Labs, we are exploring the ultimate manifestation of this service-centric computing strategy -- planetary scale computing -- needed to power the service-centric computing world. In this model, large-scale data centers are networked around the world, allocating processing power and data storage when and where they are needed. Customers will pay for services based on the amount they use, much like a public utility.

The technical requirements for planetary scale computing will stretch the IT infrastructure beyond its elastic limit. Current and emerging services will consume 10-100 times more processing, communications and storage as they support a range of applications that includes rich media, bioinformatics and pervasive sensors. Scalability, density and complexity hurdles must be overcome. Technology and process innovation are required to deploy and operate this infrastructure so that it flexibly and economically supports 21st century applications. New computing and storage architectures are required. As the number of servers in a data center grows, and as the servers become more densely packed in racks, it is natural to think of the data center as the computer.

HP Labs is working on a number of approaches to these technical challenges. One is the programmable data center, which provides automated “infrastructure on demand” with little or no operator intervention. For example, retail Web sites typically need much more processing and storage during the year-end holidays than they need the rest of the year, whereas the Internal Revenue Service needs more computing resources in April. These organizations might benefit from securing additional computing and storage when they need it, rather than acquiring it and having most of it remain idle the rest of the time. Furthermore, data center physical resources such as power and cooling must also be managed more efficiently than they are today. Key research topics include next generation large-scale computing, communications and storage architectures. In a programmable data center, the infrastructure is physically wired once but can be rewired programmatically to meet the changing needs of customers and services. This includes linking these large data centers so that resources can be optimized for a region, a nation or the whole world. When it’s “off-hours” in one area, data centers there could be used more fully by providing services for users in other parts of the world, and vice versa.

Managing this flexible and robust infrastructure is also critical. Researchers are defining the “data center operating system” that will automatically control the infrastructure. Introducing a virtualization layer between services and physical resources is the key here, since it can minimize complexity and enable automation. The data center OS will consist of two parts: a service control component that brokers service demand with resource capacity, and a resource control component that intelligently provisions physical resources, including power and cooling, via the virtualization layer.
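To make the two-part split concrete, the following Python sketch shows how a service control broker might match incoming service demand against the capacity exposed by a resource control component through a virtualization layer. The class and method names (ServiceControl, ResourceControl, VirtualizationLayer, admit, provision) are illustrative assumptions only; the abstract does not prescribe any particular interface.

```python
# Hypothetical sketch of the two-part "data center OS" described above.
# All names are illustrative, not an actual HP interface.

from dataclasses import dataclass


@dataclass
class ServiceRequest:
    name: str
    cpus: int        # processing demand
    storage_tb: int  # storage demand
    power_kw: float  # estimated power draw


class VirtualizationLayer:
    """Hides physical servers, storage and cooling behind logical pools."""

    def __init__(self, total_cpus, total_storage_tb, power_budget_kw):
        self.free_cpus = total_cpus
        self.free_storage_tb = total_storage_tb
        self.free_power_kw = power_budget_kw

    def can_host(self, req: ServiceRequest) -> bool:
        return (req.cpus <= self.free_cpus
                and req.storage_tb <= self.free_storage_tb
                and req.power_kw <= self.free_power_kw)

    def carve_out(self, req: ServiceRequest) -> None:
        # "Rewire" logical resources without touching physical cabling.
        self.free_cpus -= req.cpus
        self.free_storage_tb -= req.storage_tb
        self.free_power_kw -= req.power_kw


class ResourceControl:
    """Provisions physical resources: compute, storage, power and cooling."""

    def __init__(self, layer: VirtualizationLayer):
        self.layer = layer

    def provision(self, req: ServiceRequest) -> bool:
        if self.layer.can_host(req):
            self.layer.carve_out(req)
            return True
        return False


class ServiceControl:
    """Brokers service demand against available resource capacity."""

    def __init__(self, resource_control: ResourceControl):
        self.resource_control = resource_control

    def admit(self, req: ServiceRequest) -> str:
        if self.resource_control.provision(req):
            return f"{req.name}: deployed"
        return f"{req.name}: deferred (insufficient capacity)"


if __name__ == "__main__":
    layer = VirtualizationLayer(total_cpus=10_000, total_storage_tb=500,
                                power_budget_kw=10_000)
    broker = ServiceControl(ResourceControl(layer))
    print(broker.admit(ServiceRequest("retail-web", 4_000, 50, 3_000.0)))
    print(broker.admit(ServiceRequest("tax-season", 8_000, 100, 6_000.0)))
```

In this toy run the second request is deferred because the first has already consumed part of the pool; this demand-versus-capacity brokering is what the service control component must perform continuously and at global scale.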
These components will work together to describe, install, configure, deploy, monitor and assure services on a global scale.

Additional challenges face the designers of the next generation of Internet computers. In particular, we will examine the thermal and mechanical issues of future integrated circuits, densely packaged racks of servers, and large-scale data centers that contain as many as 50,000 servers in a 100,000 square foot building. The thermo-mechanical challenges stem from high power density at the chip level and the need to package the systems in a dense configuration. As an example, today’s data center, with 1000 racks occupying over 30,000 square feet, requires 10 MW of power for the computing infrastructure. A 100,000 square foot planetary scale data center of tomorrow will require 50 MW of power for the computing infrastructure, and the cooling for this 50 MW data center will consume an additional 25 MW. Such a data center, with five thousand 10 kW racks, would cost ~$45 million per year (at $100/MWh) just to power the servers and another $22 million per year to power the cooling; a back-of-the-envelope check of these figures is sketched at the end of this abstract. Of course, the energy expense would be much larger during an energy crisis, such as the one California faced in the summer of 2001. The role of thermal and mechanical architecture is therefore critical to ensure efficient power management, cooling and physical data center design. These trends motivate the following thermo-mechanical research questions:

• What innovative thermo-mechanical designs are required for the next generation of chips, racks and data centers? How should thermo-mechanical design influence large-scale Internet server and storage architectures?
• What are the total energy costs over the lifetime of a product? This should include the original manufacture of the thermo-mechanical apparatus as well as ongoing operational and final disposal costs.
• What thermo-mechanical principles can be employed to improve the operation and control of large-scale data centers?
• What techniques will allow us to use and re-use energy more efficiently?
• What is the ideal data center design from a thermo-mechanical perspective?

In addition to thermo-mechanical design, one must consider the means of power delivery to the data center. With the proliferation of data centers all over the world, the high power demands will require a distributed energy source scheme. Conventional energy delivery to urban areas was not planned for a world of pervasive computing, and drawing this additional power from conventional sources will have a great impact on their existing users. Therefore, the use of energy from a multiplicity of alternative sources, and indeed the re-use of energy from the waste heat generated by data centers, is an important research question.

In summary, the world is moving towards a service-centric view of computing and storage. Innovations in the IT infrastructure are required to increase scalability, improve density, minimize complexity and minimize cost. Furthermore, new approaches to power management and cooling optimization are required to meet the demands of these 21st century computing centers.
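For concreteness, here is the back-of-the-envelope check referred to above, worked out in a few lines of Python. The 50 MW compute load, the 25 MW cooling load and the $100/MWh rate come from the figures quoted in the abstract; the assumption of continuous, year-round operation (8,760 hours) is ours.

```python
# Back-of-the-envelope annual energy cost for the hypothetical 50 MW
# planetary scale data center described above. Assumes continuous
# operation (8,760 hours/year) at a flat $100/MWh rate.

HOURS_PER_YEAR = 24 * 365                 # 8,760 h
PRICE_PER_MWH = 100.0                     # $/MWh, from the abstract

compute_load_mw = 5_000 * 10 / 1_000      # five thousand 10 kW racks -> 50 MW
cooling_load_mw = 0.5 * compute_load_mw   # cooling overhead -> 25 MW

compute_cost = compute_load_mw * HOURS_PER_YEAR * PRICE_PER_MWH
cooling_cost = cooling_load_mw * HOURS_PER_YEAR * PRICE_PER_MWH

print(f"Servers: ${compute_cost / 1e6:.1f} M/year")   # ~$43.8 M (~$45 M)
print(f"Cooling: ${cooling_cost / 1e6:.1f} M/year")   # ~$21.9 M (~$22 M)
```

The computed $43.8 million and $21.9 million round to the ~$45 million and $22 million quoted above; any real tariff structure, utilization profile or cooling efficiency would of course shift these numbers.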