Lenovo Big Data Reference Architecture for Hortonworks Data Platform

ARCHITECTURE BRIEF Big Data

Lenovo Big Data Reference Architecture for Hortonworks Data Platform
A big data analytics solution that is easy to implement

HIGHLIGHTS
• Provides guidance for deploying Lenovo systems with a Hortonworks Data Platform to coordinate the processing of the data across a massively parallel environment.
• Includes the latest data center equipment available, such as the Lenovo x3650 M5 and x3550 M5 servers, Lenovo RackSwitch Ethernet switches and Lenovo XClarity.
• Supports entry through high-end configurations and the ability to easily scale as big data grows.

Big data is more than a challenge. It is an opportunity to find new insights in data, to make your business more agile, and to answer questions that were previously beyond reach. Today, Hortonworks uses the latest big data technologies, such as the massive MapReduce scale-out capabilities of Hadoop, to open the door to a world of possibilities.

This Lenovo Big Data Reference Architecture for Hortonworks Data Platform is certified by Hortonworks and provides a thoroughly tested and integrated solution that combines the benefits of leading-edge technologies with mature, enterprise-ready features. Starting with a preconfigured hardware platform that is Hortonworks certified helps your team get analytics up and running quickly.

Hortonworks allows organizations to run large-scale, distributed analytics jobs on clusters of cost-effective server hardware. This infrastructure can be used to tackle very large data sets by breaking up the data into “chunks” and coordinating the processing of the data across a massively parallel environment. Hortonworks deployed on this Lenovo configuration provides superior performance, reliability, and scalability.

This architecture supports entry through high-end configurations and the ability to easily scale as the use of big data grows. A choice of infrastructure components provides the flexibility to meet a broad range of big data analytics requirements.

WWW.LENOVO.COM


ARCHITECTURE BRIEF Lenovo Big Data Reference Architecture for Hortonworks Data Platform

Why Lenovo and Hortonworks

Lenovo enables the deployment of the Hortonworks Data Platform infrastructure to coordinate the processing of the data across a massively parallel environment.

Apache Hadoop® is an open source framework for distributed storage and processing of large sets of data on commodity hardware. Hadoop enables businesses to quickly gain insight from massive amounts of structured and unstructured data.

The Hortonworks Data Platform, powered by Apache Hadoop, is a highly scalable and fully open source platform for storing, processing and analyzing large volumes of data. It is designed to handle data from many sources and in many formats quickly, easily and cost-effectively. The Hortonworks Data Platform consists of the essential set of Apache Hadoop projects, including MapReduce, Hadoop Distributed File System (HDFS), HCatalog, Pig, Spark, Hive, HBase, ZooKeeper and Ambari. These projects have been integrated and tested as part of the Hortonworks Data Platform release process, and installation and configuration tools are also included.

From an architectural perspective, the use of Hadoop as a complement to existing data systems is very compelling: it is an open source technology designed to run on large numbers of commodity servers. Hadoop provides a low-cost, scale-out approach to data storage and processing and is proven to scale to the needs of the very largest web properties in the world. Hortonworks is dedicated to enabling Hadoop as a key component of the data center and, having partnered deeply with some of the largest data warehouse vendors, has observed several key opportunities and efficiencies that Hadoop brings to the enterprise.


Lenovo M5 servers, such as the powerful two-socket x3650 M5 and x3550 M5, enhance performance and reduce the power consumption of big data clusters. Purpose-built for big data workloads, the 2U two-socket x3650 M5 server offers industry-leading data storage capacity, the latest high-performance Intel Xeon E5-2600 v4 processors, flash storage options, and energy-efficient features. The core Hortonworks configuration uses this server as a data node for scale-out clusters. The versatile, two-socket 1U x3550 M5 rack server delivers outstanding performance as a big data management node in the Hortonworks configuration. This compact, easy-to-use server features a pay-as-you-grow design to help lower costs and manage risks.

The data network is a private cluster data interconnect among nodes. It is used for data access, for moving data across nodes within a cluster, and for ingesting data into the Hortonworks cluster. 10 Gb Ethernet is included to meet today’s performance needs, with each node dual-connected to a pair of Lenovo RackSwitch G8272 10 Gb switches for high I/O bandwidth and fault tolerance.

For storage, each server node in the configuration has internal, direct-attached storage, with no external storage required. Where additional storage is needed, the design allows either higher-capacity drives for a storage-rich environment or more nodes for a linearly scalable compute and storage solution.

Hadoop software provisioning, monitoring, and management are handled by Apache Ambari, which is provided as part of the Hortonworks Data Platform (HDP). Apache Ambari runs on the HDP master nodes. Hardware management is handled by Lenovo XClarity Administrator, which runs on the dedicated systems management node. Lenovo XClarity is an agentless, centralized resource management solution that is aimed at reducing complexity, speeding response, and enhancing the availability of Lenovo server systems.
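The two ways of adding storage described above (higher-capacity drives or more nodes) can be compared with a rough sizing calculation. The following is a minimal sketch, not a Lenovo or Hortonworks sizing tool. The node count, drives per node, and drive sizes are hypothetical values chosen for illustration, not figures from this brief. The replication factor of 3 is the HDFS default, and the 25% reserve for temporary and intermediate data is an assumed rule of thumb.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataNodeSpec:
    """Internal, direct-attached storage of one data node (hypothetical values)."""
    drives_per_node: int
    drive_tb: float


def usable_hdfs_tb(nodes: int, spec: DataNodeSpec,
                   replication: int = 3,
                   reserve_fraction: float = 0.25) -> float:
    """Estimate usable HDFS capacity in TB.

    raw capacity = nodes * drives per node * drive size
    usable       = raw * (1 - reserve) / replication

    replication=3 is the HDFS default; reserve_fraction is an assumption
    covering temporary and intermediate data kept outside HDFS.
    """
    if nodes < 1 or replication < 1 or not 0 <= reserve_fraction < 1:
        raise ValueError("invalid sizing parameters")
    raw_tb = nodes * spec.drives_per_node * spec.drive_tb
    return raw_tb * (1 - reserve_fraction) / replication


# Hypothetical baseline: 9 data nodes with 12 x 4 TB drives each.
baseline = usable_hdfs_tb(9, DataNodeSpec(drives_per_node=12, drive_tb=4.0))

# Option 1: keep 9 nodes, move to 8 TB drives (storage-rich, same compute).
bigger_drives = usable_hdfs_tb(9, DataNodeSpec(drives_per_node=12, drive_tb=8.0))

# Option 2: keep 4 TB drives, double the node count (more storage and compute).
more_nodes = usable_hdfs_tb(18, DataNodeSpec(drives_per_node=12, drive_tb=4.0))

print(baseline, bigger_drives, more_nodes)  # 108.0 216.0 216.0
```

Under these assumptions both options double usable capacity. The difference is that adding nodes also adds processors, memory, and network ports, while larger drives raise only the storage per node.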
The solution integrates seamlessly with Lenovo M5 rack servers. Through an uncluttered, dashboard-driven GUI, XClarity provides automated discovery, monitoring, firmware updates, pattern-based configuration management, and hypervisor and operating system deployments. Lenovo XClarity also features extensive REST APIs that provide deep visibility and control via higher-level cloud orchestration and service management software tools.

The example shown on this page is a typical single-rack deployment of this configuration. The configuration may be customized to best suit the workloads running in your environment. To accelerate time to value, Lenovo has service offerings and expertise to implement this configuration and to accommodate customization.
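The management tooling described above can also be scripted. As an illustration for the Hadoop side, the sketch below uses the Apache Ambari REST API (the `/api/v1/clusters/{cluster}/hosts` resource) to list hosts that are not reporting a healthy status. It is a minimal example under stated assumptions, not part of the Lenovo configuration: the server address, cluster name, and credentials are hypothetical, and the sample payload is invented to show the response shape.

```python
import base64
import json
import urllib.request
from urllib.parse import quote


def hosts_url(base_url: str, cluster: str) -> str:
    """Build the Ambari URL that lists hosts with their status fields."""
    return (f"{base_url.rstrip('/')}/api/v1/clusters/{quote(cluster, safe='')}"
            "/hosts?fields=Hosts/host_name,Hosts/host_status")


def unhealthy_hosts(payload: dict) -> list:
    """Return sorted names of hosts whose host_status is not HEALTHY."""
    return sorted(
        item["Hosts"]["host_name"]
        for item in payload.get("items", [])
        if item["Hosts"].get("host_status") != "HEALTHY"
    )


def fetch_hosts(base_url: str, cluster: str, user: str, password: str) -> dict:
    """Query a live Ambari server using HTTP basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        hosts_url(base_url, cluster),
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


# Invented sample payload in the shape Ambari returns for the hosts resource.
sample = {
    "items": [
        {"Hosts": {"host_name": "data01.example.com", "host_status": "HEALTHY"}},
        {"Hosts": {"host_name": "data02.example.com", "host_status": "UNHEALTHY"}},
        {"Hosts": {"host_name": "mgmt01.example.com", "host_status": "HEALTHY"}},
    ]
}
print(unhealthy_hosts(sample))  # ['data02.example.com']
```

Against a real cluster, the call would look like `unhealthy_hosts(fetch_hosts("http://ambari.example.com:8080", "hdp", "admin", "secret"))`, where every argument is a placeholder. Lenovo XClarity Administrator exposes its own REST interface for hardware inventory and events; consult its API reference for the resource names, which are not covered in this brief.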


Why Lenovo

Lenovo is a leading provider of x86 servers for the data center. Featuring rack, tower, blade, dense and converged systems, the Lenovo server portfolio provides excellent performance, reliability and security. Lenovo also offers a full range of networking, storage, software, solutions, and comprehensive services supporting business needs throughout the IT lifecycle. With options for planning, deployment, and support, Lenovo offers the expertise and services needed to deliver better service-level agreements and generate greater end-user satisfaction.

For More Information

To learn more about the Lenovo Big Data Reference Architecture for Hortonworks Data Platform, contact your Lenovo Business Partner or visit: lenovo.com/systems/solutions.

Configuration Reference Numbers:
• Half Rack: BDAHWKSHF62 (as described in this document)
• Full Rack: BDAHWKSFR62
• Hortonworks HDP SKU: HDP-EPL-NODE-1Y (available from Hortonworks)

© 2016 Lenovo. All rights reserved. Availability: Offers, prices, specifications and availability may change without notice. Lenovo is not responsible for photographic or typographical errors. Warranty: For a copy of applicable warranties, write to: Lenovo Warranty Information, 1009 Think Place, Morrisville, NC, 27560. Lenovo makes no representation or warranty regarding third-party products or services. Trademarks: Lenovo, the Lenovo logo, System x, and ThinkServer are trademarks or registered trademarks of Lenovo. Microsoft and Windows are registered trademarks of Microsoft Corporation. Intel, the Intel logo, Xeon and Xeon Inside are registered trademarks of Intel Corporation in the U.S. and other countries. Other company, product, and service names may be trademarks or service marks of others. CRN: BDAHWKSXX62 10/2016
