Integrating Apache Spark Into PBS-Based HPC Environments

Troy Baer ([email protected])
Paul Peltz ([email protected])
Junqi Yin ([email protected])
National Institute for Computational Sciences, University of Tennessee, Oak Ridge, Tennessee, USA

Edmon Begoli ([email protected])
Joint Institute for Computational Sciences, University of Tennessee, Oak Ridge, Tennessee, USA

ABSTRACT
This paper describes an effort at the University of Tennessee's National Institute for Computational Sciences (NICS) to integrate Apache Spark into the widely used TORQUE HPC batch environment. The similarities and differences between the execution of a Spark program and that of an MPI program on a cluster are used to motivate how to implement Spark/TORQUE integration. An implementation of this integration, pbs-spark-submit, is described, including demonstrations of functionality on two HPC clusters and a large shared-memory system.
Categories and Subject Descriptors: D.4.7 [Organization and Design]: Batch processing systems

Keywords: batch processing, NICS, PBS, TORQUE, Apache Spark, data analytics
1. INTRODUCTION
Large-scale data analysis, more commonly referred to as "Big Data", is an increasingly important field in high performance computing (HPC). The popular Apache Hadoop [1] analytics framework is a tightly-integrated software stack that has at times proven difficult to deploy on general purpose HPC systems. However, Hadoop is now being superseded in many cases by a newer in-memory analytics framework called Apache Spark, due to the latter's improved performance, flexibility, and wider variety of programming interfaces. Spark and Hadoop analytics are mostly being
implemented on dedicated commodity clusters, but the data storage and computational requirements are expected to increase as workloads and dataset sizes grow. Due to economies of scale, it will be far more efficient if Apache Spark users can be given a usable environment on mainstream parallel computing resources at existing HPC centers, where both machines and people are already in place to assure that top-notch resources and support are readily available. As the Big Data and HPC worlds collide, it is becoming paramount that these worlds integrate with and leverage each other's abilities. This paper describes an effort at the University of Tennessee's National Institute for Computational Sciences (NICS) to integrate Spark into HPC environments using the popular TORQUE batch system.
2. BACKGROUND

2.1 Apache Spark

Apache Spark [32] is an open source cluster computing framework for analytic processing of large data sets. Its ability to cache data sets in memory makes it well suited for large-scale data analysis, especially on systems with large memory. Spark programs typically run on a cluster where working data sets are loaded from the file system, cached, and computed upon repeatedly. Programmers can write programs for the Spark runtime environment in Java, Python, or Scala, and Spark programs can be executed either in a standard batch execution mode or through an interactive shell.

As shown in Figure 1, Spark itself consists of the Spark Core and Resilient Distributed Dataset (RDD) components. Spark Core is a distributed execution engine, while an RDD is a distributed data structure that allows very large datasets to be loaded and manipulated in memory across many nodes. In its clustered mode, Spark can use any of several cluster managers. In addition to the core components, Spark offers a number of specialized analytic libraries, including GraphX [10] for graph analytics, MLlib [11] for machine learning, Spark SQL [12] for processing of tabular data via SQL semantics, and Spark Streaming [14] for processing of streaming data.
Figure 1: Apache Spark Architecture
2.1.1 Spark Cluster Managers and Standalone Mode
In its cluster mode [3], the Spark runtime environment consists of a driver program, a cluster manager, and workers on compute nodes. The cluster manager is responsible for allocating and starting the workers as well as coordinating communication between the driver program and workers. The driver program is then able to dispatch tasks to the workers. Spark is designed to be cluster manager agnostic, supporting three by default: YARN [30], Mesos [25], and standalone mode [13]. In standalone mode, the user is responsible for starting the Spark manager and worker daemons through whatever mechanism they choose. This allows Spark to support a wide variety of underlying cluster managers without having to implement support for each one specifically. For instance, a cluster manager for Amazon EC2 compatible cloud services has been implemented using standalone mode.
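To make the standalone-mode arrangement concrete, the following minimal PySpark driver sketch connects to an already-running standalone master and dispatches a trivial job to its workers. The hostname node001 is a placeholder for whichever node runs the master daemon, and 7077 is the default standalone master port.

from pyspark import SparkConf, SparkContext

# Minimal driver sketch for standalone mode.  The master URL is a placeholder:
# "node001" stands in for the host running the standalone master daemon, and
# 7077 is the default standalone master port.
conf = (SparkConf()
        .setAppName("standalone-demo")
        .setMaster("spark://node001:7077"))
sc = SparkContext(conf=conf)

# A trivial computation dispatched to the workers registered with the master.
print(sc.parallelize(list(range(1000))).map(lambda x: x * x).sum())
sc.stop()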
2.2 HPC Clusters and Batch Environments
HPC clusters are widely deployed in academia and research laboratories, and their deployment and management at scales up to thousands of nodes is well understood. These systems tend to be general purpose computing platforms that support a wide variety of workloads. As a result, they typically already have a batch environment that provides functionality equivalent to what Spark needs in a cluster manager. For instance, one of the most commonly used types of HPC batch system is the Portable Batch System (PBS) family, consisting of three code bases – OpenPBS [29], PBS Pro [8], and TORQUE [15] – derived from the original Portable Batch System developed at NASA Ames Research Center in the 1990s [24]. As the name implies, the PBS family is modular and portable to a wide variety of platforms, including most proprietary and open source UNIX variants as well as Linux. The PBS family of batch systems shares a common user interface, the IEEE POSIX.2d batch environment standard [16].

Integrating cluster-based analytics software such as Hadoop or Spark with an HPC batch environment like TORQUE has a number of potential benefits. First, it allows the analytics software to coexist with and leverage the investments made for more traditional HPC applications, rather than requiring a completely separate analytics platform with attendant costs for hardware, maintenance, power, cooling, and management staff. Second, running cluster-based analytics software in an HPC batch environment allows for relatively straightforward resource usage accounting and allocation management, which neither YARN nor Mesos currently provides. Third, it allows for automated workflows that encompass both analytics applications and traditional HPC applications such as simulation and visualization. Finally, HPC scheduling software such as Maui [26] and its successor Moab [6] is more mature and sophisticated than the schedulers available in YARN and Mesos [23]; for instance, Maui and Moab support the concepts of advance reservations, backfill, and quality of service (QoS) levels.
One potential objection to running Hadoop or Spark in an HPC batch environment is interactivity. In particular, Spark includes a program called spark-shell intended for interactive Spark processing. However, roughly equivalent functionality can be provided using the PBS family’s interactive job capability with qsub -I.
2.2.1 HPC Cluster Parallel Program Startup
Most HPC sites prefer that parallel programs be launched using facilities provided by the batch environment. For instance, both the Ohio Supercomputer Center’s mpiexec implementation [31] and MPICH2’s Process Management Interface (PMI) [20] will use the Task Management Application Programming Interface (TM API) provided by PBS variants such as TORQUE and PBS Pro. This results in an arrangement very similar to the one described above for Spark cluster mode, with mpiexec in place of the driver program, pbs_server in place of the cluster manager, pbs_moms in place of the workers, and MPI programs as tasks.
2.2.2 Hadoop on HPC Clusters
Integrating analytics software with an HPC batch environment is not unprecedented. Projects such as Hadoop On Demand [4] and myHadoop [27] provided ways to launch Hadoop programs inside PBS jobs; however, these have largely become deprecated with the advent of Hadoop YARN. More recently, NICS has deployed Hadoop2 on Beacon, an Intel-based Cray Cluster Solution system which is described in more detail in Section 4.2. The Hadoop cluster configuration is similar to that recommended by Hortonworks [17], and the site parameters have been tuned to outperform myHadoop (using default settings) on the HiBench [5] TeraSort benchmark with 4 data nodes. Similar to myHadoop, users can request a Hadoop cluster through PBS and customize configurations. Moreover, with YARN as the general resource manager under PBS, the supported data processing framework is not limited to MapReduce [22] applications. For example, the Hive [2] SQL framework has been demonstrated to run on top of Hadoop, and this should be extensible to include other common tools such as Spark.
3. STARTING SPARK SERVICES AND PROGRAMS INSIDE PBS JOBS
With all of the above in mind, the authors set out to develop a program that would allow users to run one or more Spark programs inside the context of a PBS batch or interactive job. The design goal was that running a Spark program inside a batch job should be as easy as running an MPI program in a batch job or running a Spark program on a dedicated Spark cluster. This requires hiding as many of the gory details of the underlying parallel program startup as possible from the end user. The resulting program, pbs-spark-submit [18], was implemented in Python. The source code for the latest version of the program can be found at http://svn.nics.tennessee.edu/repos/pbstools/trunk/bin/pbs-spark-submit.

The flow of execution in pbs-spark-submit is as follows (a simplified sketch of this flow appears at the end of this section). First, it verifies that it is being executed inside a PBS job and that the environment variable SPARK_HOME is set. Second, it sets default values for a number of environment variables if they are not already set. Third, it determines what shared and node-local directories to use. Fourth, it parses its command line options and updates settings accordingly. Fifth, it parses any Java property files found in its configuration directory. Sixth, it launches the Spark master and worker daemons using the standalone mode scripts if needed. Finally, it executes the user's Spark driver program using spark-submit.

The pbs-spark-submit program makes two major assumptions about its environment. First, it assumes that its working directory is on a shared file system that is accessible by all of the nodes allocated to the job; this file system can be NFS, Lustre, or any other roughly POSIX-compliant file system. Second, it assumes that the Spark master process can be run on the "mother superior" node of the job – that is, the first node allocated to the job, and the node where the job script is executed. The program attempts to enforce this by setting the environment variables SPARK_MASTER_IP and SPARK_MASTER to refer to the mother superior node for the job.

By default, the Spark master and workers will attempt to place their logs in $SPARK_HOME/logs. However, that is typically neither possible nor scalable in a shared installation directory that is not world-writable, so pbs-spark-submit defaults to logging in its current working directory unless the environment variable SPARK_LOG_DIR is set.

pbs-spark-submit goes to some lengths to ensure that a properly configured environment is presented to the Spark master, workers, and driver program. As discussed above, it requires that the environment variable SPARK_HOME is set before it will run, as it relies on that variable to find the Spark software. In addition, pbs-spark-submit will set default values for the following environment variables if they are not already set:

• SPARK_CONF_DIR (default is current working directory)
• SPARK_LOCAL_DIRS (default is $TMPDIR or /tmp)
• SPARK_LOG_DIR (default is current working directory)
• SPARK_MASTER_PORT (default is 7077)

One of the key components of pbs-spark-submit is the Launcher class, which is responsible for launching the worker daemons on compute nodes. There are three Launcher implementations currently available, using pbsdsh, exec, and ssh as the underlying launch mechanisms. The pbsdsh and ssh launchers are intended for use on clusters, while the exec launcher is intended for use on standalone shared-memory systems. The default launcher is determined by the environment variable SPARK_LAUNCHER, though this can be overridden using command line arguments. The pbsdsh and exec launchers are preferred, as they result in the worker daemons being children of the pbs_mom daemon for resource usage monitoring, signal delivery, and process cleanup purposes.

By default, pbs-spark-submit will attempt to initialize the master and worker daemons on every invocation. However, when running several Spark programs in sequence in a single job, these repeated initializations are unnecessary. To allow for this use case, the command line option --no-init causes the program to skip the initialization step and proceed directly to starting the Spark program.
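The following Python sketch illustrates the flow of execution and launcher selection described above. It is a simplified illustration rather than the actual pbs-spark-submit source: the function names, the use of the Spark 1.x sbin/start-master.sh and sbin/start-slave.sh scripts (whose arguments vary between Spark versions), and the omission of the ssh launcher, option parsing, and error handling are simplifications made for brevity.

import os
import subprocess
import sys

def choose_launcher():
    """Return a command prefix used to start worker daemons on the nodes
    allocated to the job.  This mirrors the idea of the Launcher class in
    pbs-spark-submit, but the details here are simplified assumptions."""
    name = os.environ.get("SPARK_LAUNCHER", "pbsdsh")
    if name == "pbsdsh":
        # pbsdsh -u runs the command once per allocated node, as a child of
        # pbs_mom, which keeps resource accounting and cleanup working.
        return ["pbsdsh", "-u"]
    # The real ssh launcher loops over the hosts in $PBS_NODEFILE (omitted
    # here); the exec launcher simply runs the worker on the local node,
    # which corresponds to an empty command prefix.
    return []

def main():
    # 1. Verify we are inside a PBS job and that SPARK_HOME is set.
    if "PBS_JOBID" not in os.environ or "SPARK_HOME" not in os.environ:
        sys.exit("error: must run inside a PBS job with SPARK_HOME set")
    spark_home = os.environ["SPARK_HOME"]

    # 2. Default environment settings, applied only if not already set.
    os.environ.setdefault("SPARK_CONF_DIR", os.getcwd())
    os.environ.setdefault("SPARK_LOG_DIR", os.getcwd())
    os.environ.setdefault("SPARK_LOCAL_DIRS", os.environ.get("TMPDIR", "/tmp"))
    os.environ.setdefault("SPARK_MASTER_PORT", "7077")

    # 3. The master runs on the mother superior node, i.e. where this runs.
    master_host = os.uname()[1]
    master_url = "spark://%s:%s" % (master_host,
                                    os.environ["SPARK_MASTER_PORT"])
    os.environ.setdefault("SPARK_MASTER_IP", master_host)

    # 4. Start the standalone master here, then workers on the allocated
    #    nodes via the chosen launcher.
    subprocess.check_call([os.path.join(spark_home, "sbin",
                                        "start-master.sh")])
    subprocess.check_call(choose_launcher() +
                          [os.path.join(spark_home, "sbin", "start-slave.sh"),
                           master_url])

    # 5. Hand the user's driver program (passed on our command line) to
    #    spark-submit against the standalone master.
    subprocess.check_call([os.path.join(spark_home, "bin", "spark-submit"),
                           "--master", master_url] + sys.argv[1:])

if __name__ == "__main__":
    main()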
4. EXAMPLES

4.1 Test Jobs

Figures 2, 3, 4, and 5 show job scripts for running the Spark Pi, Pagerank, SQL, and Wordcount Python example programs, respectively. These scripts are largely identical except for the arguments to pbs-spark-submit. They also assume the availability of a spark environment module [7] that sets PATH, SPARK_HOME, and SPARK_LAUNCHER appropriately, such as the one shown in Figure 6. The example job scripts should be more or less portable to any cluster system using TORQUE and environment modules; however, on shared-memory and NUMA systems, the nodes resource request may need to be changed to ncpus.

#PBS -N spark-pi
#PBS -j oe
#PBS -l nodes=2:ppn=1
#PBS -l walltime=1:00:00
echo PBS_NUM_NODES=$PBS_NUM_NODES
cd $PBS_O_WORKDIR
module load spark
pbs-spark-submit \
    $SPARK_HOME/examples/src/main/python/pi.py \
    800

Figure 2: Example Spark Pi Job

#PBS -N spark-pagerank
#PBS -j oe
#PBS -l nodes=2:ppn=1
#PBS -l walltime=1:00:00
echo PBS_NUM_NODES=$PBS_NUM_NODES
cd $PBS_O_WORKDIR
module load spark
pbs-spark-submit \
    $SPARK_HOME/examples/src/main/python/pagerank.py \
    $SPARK_HOME/data/mllib/pagerank_data.txt \
    10

Figure 3: Example Spark Pagerank Job

#PBS -N spark-sql
#PBS -j oe
#PBS -l nodes=2:ppn=1
#PBS -l walltime=1:00:00
echo PBS_NUM_NODES=$PBS_NUM_NODES
cd $PBS_O_WORKDIR
module load spark
pbs-spark-submit \
    $SPARK_HOME/examples/src/main/python/sql.py

Figure 4: Example Spark SQL Job

#PBS -N spark-wordcount
#PBS -j oe
#PBS -l nodes=2:ppn=1
#PBS -l walltime=1:00:00
echo PBS_NUM_NODES=$PBS_NUM_NODES
cd $PBS_O_WORKDIR
module load spark
pbs-spark-submit \
    $SPARK_HOME/examples/src/main/python/wordcount.py \
    /usr/share/dict/words

Figure 5: Example Spark Wordcount Job

#%Module
# spark/1.2.1-bin-hadoop2.4
proc ModulesHelp { } {
    puts stderr "Sets up Apache Spark."
}
conflict spark
set version 1.2.1-bin-hadoop2.4
setenv SPARK_HOME /opt/spark/$version
setenv SPARK_LAUNCHER pbsdsh
prepend-path PATH /opt/spark/$version/bin

Figure 6: Example Spark Environment Module
4.2 Systems Used
For the work described here, the integration of Spark with TORQUE was investigated on three systems at NICS: Beacon, Nautilus, and Photon. Beacon [21] is a Cray CS300-AC supercomputer equipped with Intel Xeon Phi coprocessors. The system consists of 48 compute nodes, 6 I/O nodes, 2 login nodes, and 2 management nodes joined by an FDR InfiniBand interconnect.
Each compute node is equipped with 2 Intel Xeon E5-2670 8-core processors, 4 Intel Xeon Phi coprocessors 5110P, 256 GB of RAM, and 960 GB of SSD storage. Each I/O node provides access to an additional 4.8 TB of SSD storage, presented as a high performance Lustre file system. Additionally, Beacon has access to NICS’ site-wide Medusa Lustre file system. Beacon uses TORQUE as its batch environment with Moab as the scheduler. The large amount of memory per node makes this platform an ideal candidate for Spark.
Nautilus [28] is a cluster of SGI Ultraviolet non-uniform memory access (NUMA) shared-memory systems. This cluster consists of one UV1000 system with 128 Intel Xeon X7550 8-core processors (1,024 cores total) and 4 TB of memory, as well as four UV10 systems with 4 Intel Xeon E7-4820 8-core processors (32 cores total) and 128 GB of memory each. Nautilus also has access to NICS' site-wide Medusa Lustre file system. Nautilus uses TORQUE as its batch environment with Moab as the scheduler. Like Beacon, the large amount of memory per node makes Nautilus an ideal candidate for Spark.

Photon is a SeaMicro SM15000 Fabric Compute System. It is a high-density, low-power compact cluster consisting of 64 compute nodes with AMD Opteron processors (8 cores) and 16 GB of RAM, as well as a 64-disk storage array used as a network file system (NFS) for the compute nodes. The system has a proprietary high-speed network and storage interconnect called Freedom Fabric that advertises itself as a 1.28 terabit-per-second low latency fabric; MPI performance tests indicate that the interconnect's performance is roughly comparable to that of QDR InfiniBand. The storage array can also be reconfigured to provide a dedicated disk per compute node if desired. Additionally, Photon has access to NICS' site-wide Medusa Lustre file system, but due to the limitations of the interconnects in the SM15000 (1 Gb or 10 Gb), data ingress and egress are expensive and therefore not recommended. Currently, Photon is being configured for exploratory data analysis, data discovery and experimentation, and research and development of Spark-based configurations and libraries.
4.3 pbs-spark-submit Example Results
Figures 7, 8, 9, and 10 show the measured run time performance for the Spark Pi, Pagerank, SQL, and Wordcount examples, respectively, using the Apache Spark 1.2.1-bin-hadoop2.4 binary distribution. Note that the four examples come with the Apache Spark distribution, and the results were obtained by running the scripts shown in Figures 2, 3, 4, and 5 with changes only to the number of nodes. In all cases, Beacon is roughly twice as fast as Photon at the same core count; this is because Beacon has twice as many cores per node as well as more powerful processor cores. However, the flip side of this is that Photon is much smaller and denser and requires much less power and cooling than Beacon – the entire Photon system is about the same size as one 4-node blade chassis in Beacon. Meanwhile, the Nautilus UV1000 node consistently underperforms the other systems on all tests other than Pagerank, which is not surprising given that it is by far the oldest of the systems used. In the case of Pagerank, both Beacon and Photon anti-scale (that is, performance gets worse as more cores are used) while the Nautilus UV1000 node stays more or less constant; this is likely because the Pagerank algorithm is sensitive to communication latency, which is best on the SGI UV1000 (shared memory) and worst on the AMD SeaMicro SM15000 (Freedom Fabric interconnect, roughly comparable to QDR InfiniBand).
Figure 7: Spark Pi Example Performance (run time in seconds, lower is better, versus total cores; series: Beacon, Nautilus-UV10, Nautilus-UV1000, Photon)

Figure 8: Spark Pagerank Example Performance (run time in seconds, lower is better, versus total cores; series: Beacon, Nautilus-UV10, Nautilus-UV1000, Photon)

Figure 9: Spark SQL Example Performance (run time in seconds, lower is better, versus total cores; series: Beacon, Nautilus-UV10, Nautilus-UV1000, Photon)

Figure 10: Spark Wordcount Example Performance (run time in seconds, lower is better, versus total cores; series: Beacon, Nautilus-UV10, Nautilus-UV1000, Photon)
It should also be noted that these are small-scale example programs, rather than realistic applications with large datasets. These results are presented to demonstrate the functionality of pbs-spark-submit up to the largest sizes possible on the available resources, but they are not necessarily representative of the scaling or performance of larger-scale Spark applications.
5. SCHEDULING CONSIDERATIONS
Supporting Spark applications does require a small degree of added consideration from a batch scheduling perspective. By default, Spark assumes that it is the only application running on its worker nodes, which can present problems on systems where two or more jobs are allowed to share the same host. There are at least two ways to handle this. One approach is to avoid having Spark jobs share nodes with other jobs, either by setting the system's default node allocation policy to disallow node sharing entirely or by setting up a queue for Spark jobs that does not allow node sharing. Another approach is to constrain the resources accessible to the Spark jobs using cpusets, cgroups, or some other resource constraint mechanism.

The exec launcher in pbs-spark-submit goes to some effort to minimize its impact on a shared compute node. It determines whether it is running inside a cpuset or cgroup, or whether the environment variable PBS_NP is set; if either is the case, it limits the worker(s) it launches to that number of cores by setting the environment variable SPARK_WORKER_CORES to that value. Similar constraints on memory usage are not currently implemented but are intended for future work.
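A sketch of how such a core limit might be detected is shown below. This is not the actual exec launcher code; the cpuset mount point (/dev/cpuset/torque/$PBS_JOBID/cpus) and the CPU-list parsing are plausible assumptions rather than guaranteed paths on every TORQUE system.

import os

def detect_core_limit():
    """Illustrative sketch of detecting a per-job core limit on a shared
    node.  The cpuset path below is an assumption, not a guaranteed layout."""
    # If PBS exports the number of allocated processors, prefer that.
    np = os.environ.get("PBS_NP")
    if np is not None:
        return int(np)
    # Otherwise, look for a cpuset assigned to this job and count its CPUs.
    jobid = os.environ.get("PBS_JOBID", "")
    cpuset_file = os.path.join("/dev/cpuset/torque", jobid, "cpus")
    if jobid and os.path.exists(cpuset_file):
        with open(cpuset_file) as f:
            cpus = f.read().strip()          # e.g. "0-7,16-23"
        count = 0
        for part in cpus.split(","):
            if "-" in part:
                lo, hi = part.split("-")
                count += int(hi) - int(lo) + 1
            else:
                count += 1
        return count
    return None

limit = detect_core_limit()
if limit is not None and "SPARK_WORKER_CORES" not in os.environ:
    # Constrain each Spark worker to the cores actually allocated to the job.
    os.environ["SPARK_WORKER_CORES"] = str(limit)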
6. KNOWN ISSUES AND LIMITATIONS
pbs-spark-submit has a few known problems and limitations. One significant issue is that the default Spark logging settings are extremely verbose, which can make it difficult to find actual results intermingled among large amounts of log chatter. The best workaround for this behavior is to have a file called log4j.properties in the conf directory for the execution that lowers the default logging levels, such as the one shown in Figure 11. Another limitation of pbs-spark-submit on NUMA and shared-memory systems, as well as on clusters where jobs can share nodes, is that the program defaults to using port 7077 as the Spark master listening port. If multiple Spark jobs’ masters on the same host try to listen on that port at the same time, only one of them can do so, with the rest failing. This can be worked around by manually setting the master port to a unique value for each job using the environment variable SPARK_MASTER_PORT.
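One way to automate that workaround is to derive the master port from the job ID, as in the following illustrative sketch; the port range chosen here is arbitrary, and the code assumes a PBS_JOBID of the usual <sequence>.<server> form.

import os

# Illustrative sketch: derive a per-job Spark master port from the numeric
# portion of PBS_JOBID rather than using the default 7077, so that multiple
# Spark jobs on one host do not collide.  The offset and range are arbitrary.
jobid = os.environ.get("PBS_JOBID", "0.server").split(".")[0]
os.environ["SPARK_MASTER_PORT"] = str(10000 + int(jobid) % 20000)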
7. CONCLUSIONS
Using pbs-spark-submit, Apache Spark programs can be executed in jobs on PBS-based HPC cluster and shared-memory environments with relative ease.
# Change this to set Spark log level
log4j.logger.org.apache.spark=WARN
# Silence akka remoting
log4j.logger.Remoting=WARN
# Ignore messages below warning level from Jetty,
# because it's a bit verbose
log4j.logger.org.eclipse.jetty=WARN
# Turn up DAGScheduler verbosity to get timing info
log4j.logger.org.apache.spark.scheduler.DAGScheduler=INFO
log4j.logger.org.apache.spark.scheduler.TaskSetManager=WARN

Figure 11: Recommended Spark Log4J Settings
Running multiple example Spark applications at scale on a variety of hardware architectures, from commodity clusters to high-end NUMA platforms, has shown that its limitations are modest and easily worked around. This gives hope that Spark can coexist with more traditional HPC applications on large systems that process a variety of workloads. As such, it is important that the HPC community ready itself for the data deluge by implementing integration tools for Spark such as pbs-spark-submit. The pbs-spark-submit program is freely available under the GNU General Public License version 2. It is part of the larger PBS Tools [19] collection of utilities for PBS-compatible batch environments.
8. FUTURE WORK
The work described above is an initial effort in integrating Apache Spark with the TORQUE batch environment, and there are several avenues for future efforts in this area. First, implementing an equivalent to spark-shell for use in interactive jobs, while likely straightforward, has not been done. Second, all of the results presented here are for TCP/IP over Ethernet or shared memory; TCP/IP over InfiniBand needs to be investigated and supported if possible. Third, and likely most importantly, a performance comparison between HDFS, HDFS-on-Lustre (using the open source connector from Seagate [9]), Lustre's POSIX interface, and NFS on the same (or at least very similar) hardware would be of great benefit to the entire HPC and Big Data communities.
9. ACKNOWLEDGMENTS
This material is based upon work performed using computational resources supported by the National Science Foundation under Grant Number 1137097, as well as by the University of Tennessee and Oak Ridge National Laboratory's Joint Institute for Computational Sciences (http://www.jics.utk.edu/). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation, the University of Tennessee, Oak Ridge National Laboratory, or the Joint Institute for Computational Sciences. The authors would like to acknowledge the assistance of
NICS staff in the preparation of this document, in particular Vince Betro.
10. REFERENCES
[1] Apache Hadoop. https://hadoop.apache.org/.
[2] Apache Hive. https://hive.apache.org/.
[3] Cluster mode overview. https://spark.apache.org/docs/latest/cluster-overview.html.
[4] Hadoop On Demand documentation. http://hadoop.apache.org/core/docs/r0.17.2/hod.html.
[5] HiBench. https://github.com/intel-hadoop/HiBench.
[6] Moab HPC suite basic edition. http://www.adaptivecomputing.com/products/hpc-products/moab-hpc-basic-edition/.
[7] Modules – software environment management. http://modules.sourceforge.net/.
[8] PBS Professional: Job scheduling and commercial-grade HPC workload management. http://www.pbsworks.com/Product.aspx?id=1.
[9] Seagate/hadoop-on-lustre. https://github.com/Seagate/hadoop-on-lustre.
[10] Spark GraphX. https://spark.apache.org/graphx/.
[11] Spark MLlib. https://spark.apache.org/mllib/.
[12] Spark SQL. https://spark.apache.org/sql/.
[13] Spark standalone mode. https://spark.apache.org/docs/latest/spark-standalone.html.
[14] Spark Streaming. https://spark.apache.org/streaming/.
[15] TORQUE resource manager. http://www.adaptivecomputing.com/products/open-source/torque/.
[16] Draft standard for information technology – Portable Operating System Interface (POSIX) draft technical standard: Base specifications, issue 7. 2008.
[17] Exploring the next generation of Big Data solutions with Hadoop 2, 2014. http://hortonworks.com/wp-content/uploads/2014/02/RHEL_Big_Data_HDPReference_Architechure_FINAL.pdf.
[18] Troy Baer. Man page of pbs-spark-submit. https://www.nics.tennessee.edu/~troy/pbstools/man/pbs-spark-submit.1.html.
[19] Troy Baer. PBS Tools. https://www.nics.tennessee.edu/~troy/pbstools/.
[20] Pavan Balaji, Darius Buntinas, David Goodell, William Gropp, Jayesh Krishna, Ewing Lusk, and Rajeev Thakur. PMI: A scalable parallel process-management interface for extreme-scale systems. In Recent Advances in the Message Passing Interface, pages 31–41. Springer, 2010.
[21] R. Glenn Brook, Alexander Heinecke, Anthony Costa, Paul Peltz, Vincent Betro, Troy Baer, Michael Bader, and Pradeep Dubey. Beacon: Exploring the deployment and application of Intel Xeon Phi coprocessors for scientific computing. Computing in Science & Engineering, (1):1–1.
[22] Jeffrey Dean and Sanjay Ghemawat. MapReduce: simplified data processing on large clusters. Communications of the ACM, 51(1):107–113, 2008.
[23] Adam Diaz. How YARN changed Hadoop job scheduling. Linux Journal, June 2014. http://www.linuxjournal.com/content/how-yarn-changed-hadoop-job-scheduling.
[24] Robert L. Henderson. Job scheduling under the Portable Batch System. In Job Scheduling Strategies for Parallel Processing, pages 279–294. Springer, 1995.
[25] Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D. Joseph, Randy H. Katz, Scott Shenker, and Ion Stoica. Mesos: A platform for fine-grained resource sharing in the data center. In NSDI, volume 11, pages 22–22, 2011.
[26] David Jackson, Quinn Snell, and Mark Clement. Core algorithms of the Maui scheduler. In Job Scheduling Strategies for Parallel Processing, pages 87–102. Springer, 2001.
[27] Sriram Krishnan, Mahidhar Tatineni, and Chaitanya Baru. myHadoop – Hadoop-on-demand on traditional HPC resources. San Diego Supercomputer Center Technical Report TR-2011-2, University of California, San Diego, 2011.
[28] A. F. Szczepanski, Jian Huang, T. Baer, Y. C. Mack, and S. Ahern. Data analysis and visualization in high-performance computing. Computer, 46(5):84–92, 2013.
[29] OpenPBS Team. A batching queuing system. http://www.openpbs.org/.
[30] Vinod Kumar Vavilapalli, Arun C. Murthy, Chris Douglas, Sharad Agarwal, Mahadev Konar, Robert Evans, Thomas Graves, Jason Lowe, Hitesh Shah, Siddharth Seth, et al. Apache Hadoop YARN: Yet Another Resource Negotiator. In Proceedings of the 4th Annual Symposium on Cloud Computing, page 5. ACM, 2013.
[31] Pete Wyckoff and Doug Johnson. Mpiexec. https://www.osc.edu/~djohnson/mpiexec/.
[32] Matei Zaharia, Mosharaf Chowdhury, Michael J. Franklin, Scott Shenker, and Ion Stoica. Spark: cluster computing with working sets. In Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, pages 10–10, 2010.