RNS: Remote Node Selection for HPC Clusters Seyedeh Leili Mirtaheri, Ehsan Mousavi Khaneghah, Siavash Ghiasvand, Mohammad Norouzi Arab, Ashkan Shirpour and Mohsen Sharifi School of Computer Engineering Iran University of Science and Technology Tehran, Iran {msharifi, mirtaheri, emousavi}@iust.ac.ir {noroozi, ghiyasvand, a_shirpour}@comp.iust.ac.ir
Abstract Scalability of distributed high performance computing clusters in different administrative domains is critically dependent on the deployment of a proper resource discovery mechanism that can discover resources residing in different wide administrative domains. In this paper, we present a new method for remote cluster node selection called RNS that such a required resource discovery mechanism may use. We show analytically that RNS is superior to existing resource discovery mechanisms in response time and utilization of free resources.
One solution to improve the performance and scalability of HPC clusters is to allow them to utilize accessible and available out-of-cluster distributed resources whose types may differ from those of the resources inside the cluster, i.e., resources that are heterogeneous. Such distributed clusters, like any other distributed system, promise higher performance and scalability. They do, however, need a proper resource discovery mechanism that can discover heterogeneous resources residing in different wide administrative domains. A resource discovery mechanism is proper if it can satisfy, as far as possible, the dynamic resource requirements of every local process transparently, irrespective of whether resources are local or remote.
1. Introduction The performance of a traditional high performance computing (HPC) cluster comprising a number of homogeneous computers interconnected by a dedicated local area network is limited by the amount of resources available in the locality of the cluster. In other words, traditional HPC clusters are limited in both performance and scalability. Some clusters [1] simply ignore requests for more resources than are available in the cluster, while others [2] keep a limited number of resources outside the cluster in reserve and use them when requests exceed the capacity of the cluster. Both approaches fail to remove the performance and scalability limitations.
The main challenge facing a proper resource discovery mechanism is that resources are heterogeneous and are dynamically added to or removed from the set of available resources of the overall distributed system. Processes also request resources dynamically, implying that no prior knowledge exists about the resource requirements of processes. In this paper, we present a new method for remote cluster node selection called RNS that a proper resource discovery mechanism may use.
RNS does not require any agents on remote machines.
this, we introduce two areas. The first area contains the member machines of a cluster system, and the second includes accessible systems that are not members of the cluster system.
We have organized the rest of the paper as follows. Section 2 presents a brief background on resource discovery and its operations. Section 3 reports notable related works. Section 4 presents and comparatively evaluates our proposed RNS method. Section 5 concludes the paper.
In the first area, almost everything is preconfigured and foreseen. All system structures are determined and configured to satisfy the requirements of the cluster system. As a result, daemons are installed on every machine. These systems are usually homogeneous or at least have little heterogeneity.
2. Background and Definitions The duty of a resource discovery service in distributed systems is to find suitable resources for current requests generated in the system. Such a service is dependent on the daemons (agents) already configured on the member machines of the system [3, 4]. These daemons look after local resources so that they can provide accurate replies to the requests made by other daemons within the system.
The second area is a completely unknown and heterogeneous world in which any resource of any type can exist. These resources are managed by different operating systems. Moreover, the machines in this area are not preconfigured, so the daemons required by the cluster are not installed on them. As a result, a uniform approach cannot be used for interacting with all of the machines. As mentioned earlier, since a resource discovery service needs some daemon preconfigured on all the machines, it can only work in the first area. To overcome the resource shortage inside the cluster, we need to use out-of-cluster resources.
Resource discovery approaches use different structures to let daemons communicate. Three well-known structures are centralized, decentralized, and distributed [5, 6, 7, 8, 9]. However, regardless of the structure in which the daemons of a resource discovery service are organized, there is a protocol according to which these daemons communicate. It can therefore be stated that the most important matter in the communication of daemons is the common language they use [5]. This prevents the daemons of different resource discovery services from communicating with each other. The domain of a resource discovery service is thus defined by its daemons and the machines running them.
Because a resource discovery service is unable to work beyond the borders of a cluster system [10], a new service is required to handle requests using out-of-cluster resources. To apply resource discovery mechanisms to out-of-cluster resources, RNS must register those resources as members and then apply those mechanisms. RNS is only responsible for finding potential out-of-cluster resources; after it selects them, other mechanisms must be activated to configure the resources according to the requirements of the cluster system.
A resource discovery service performs well if resource requests can be satisfied using resources of the member machines of a cluster system. But what happens when all these resources are in use? How can the resource discovery service handle new requests? To answer these questions, we need to determine more accurately the area of work of the resource discovery service. To do
3. Related Work There are few research works on the use of out-of-cluster resources. Some [2] use reserved
resources and some [1] ignore out-of-capacity requests. Only a few efforts, such as Mosix-2, propose the instant joining of two cluster systems to share their resources. The MOSIX Reach The Clouds Technology (MRC for short) determines how two MOSIX clusters can join to share their resources [11]. After joining, any node in the cluster can manage the cluster and dispatch its jobs. The challenge mentioned above is thus addressed by adding available resources from other clusters. However, as stated before, some pre-configuration must be performed and both clusters must work under the same administrative domain.
discovered machines with regard to the requirements specified in the received message. In RNS, a message containing information about the required resource is first sent to one of the RNS daemons located on the edge nodes of the cluster; an edge node is a node that is connected to the outside world. The receiving daemon analyzes the message and, if relevant information already exists, returns the information of a suitable machine so that it can be joined to the cluster. If there is no relevant information, the scanning process finds other neighboring out-of-cluster machines. In the next phase, communication with these machines starts. This phase is subdivided into two sub-phases. In the first sub-phase, the operating system of the remote machine is determined, and in the second sub-phase the appropriate communication method for that operating system is chosen. In the final phase, the most suitable remote machine is chosen so that membership operations can be performed. Any relevant information is also cached so that later decisions can be made faster.
Although no comparable work on cluster systems (respecting these steps and their order) has been researched in depth, many works have been conducted on each single step. Finding neighboring machines is so important that an IP protocol packet is devoted to it [12]. Two mechanisms for recognizing neighboring machines are introduced in [13] and [14]. These mechanisms work in P2P networks. In addition, many works have been conducted on recognizing the resources of remote machines without using agents, and many industrial tools exist in this field [15]. All of these tools use standards accepted and used by the world community. Among them we can refer to IBM Tivoli Monitoring [16] and HP SiteScope [17] as two commercial tools, and Zabbix [18] and Nagios [19] as two open source tools.
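The cache lookup performed by the receiving daemon can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used by RNS: the names `ResourceRequest`, `MachineInfo` and `find_cached_match`, the choice of CPU cores and memory as requirement fields, and the tie-breaking rule are all hypothetical, since the text does not fix a message format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRequest:
    # Hypothetical requirement fields; the paper does not define the message layout.
    min_cpu_cores: int
    min_memory_mb: int


@dataclass(frozen=True)
class MachineInfo:
    address: str
    os_name: str
    cpu_cores: int
    memory_mb: int


def find_cached_match(request, cache):
    """Return a cached out-of-cluster machine that satisfies the request, or None."""
    candidates = [
        m for m in cache
        if m.cpu_cores >= request.min_cpu_cores
        and m.memory_mb >= request.min_memory_mb
    ]
    if not candidates:
        return None  # the caller would fall back to the scanning phase
    # Assumed selection rule: prefer the most memory, then the most cores.
    return max(candidates, key=lambda m: (m.memory_mb, m.cpu_cores))


# Illustrative cache contents (made-up addresses and capacities).
cache = [
    MachineInfo("10.0.0.5", "Linux", 4, 8192),
    MachineInfo("10.0.0.9", "Windows", 8, 16384),
]
match = find_cached_match(ResourceRequest(2, 4096), cache)
miss = find_cached_match(ResourceRequest(64, 4096), cache)
```

A `None` result corresponds to the case in which no relevant information exists and the neighborhood must be scanned.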
RNS can act in a pro-active or a passive manner. The previous scenario used the passive manner. In pro-active mode, the scanning phase starts before any message is sent to the RNS daemons, so that requests can be handled faster. The first phase of RNS is the only phase directly involved with the resource discovery mechanism. It is therefore better for the RNS service to have a well-defined interface so that it can communicate with as many resource discovery mechanisms as possible. In the second phase, it can make use of the many mechanisms available in this field. In the third phase, using an alternating structure, the service is able to communicate with many operating systems. Based on the operating system recognized at this phase, many tools and protocols like SNMP, SSH, DCOM, SLP and WMI can be utilized to
4. The Proposed RNS Method RNS has 4 phases, each of which can be executed independently. These phases are: receiving a request indicating the need for a resource, finding neighboring out-of-cluster machines, communicating with the discovered machines in order to fetch their resource information, and finally selecting one of the
fetch information on a remote machine's resources. Figure 1 depicts the relation between these 4 phases.
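The second sub-phase of communication, choosing a method according to the detected operating system, can be illustrated with a small lookup. The protocol names are those listed in the text; the assignment of protocols to operating system families, their order, and the names `PROTOCOLS_BY_OS`, `FALLBACK_PROTOCOLS` and `choose_protocols` are assumptions made only for this sketch.

```python
from typing import List

# Assumed mapping from detected OS family to candidate agentless protocols,
# in order of preference. The paper names the protocols but not this mapping.
PROTOCOLS_BY_OS = {
    "windows": ["WMI", "DCOM", "SNMP"],
    "linux": ["SSH", "SNMP"],
    "unix": ["SSH", "SNMP"],
}

# Assumed fallback for operating systems that were not recognized.
FALLBACK_PROTOCOLS = ["SNMP", "SLP"]


def choose_protocols(os_name: str) -> List[str]:
    """Return the protocols to try, in order, for the detected operating system."""
    key = os_name.strip().lower()
    return list(PROTOCOLS_BY_OS.get(key, FALLBACK_PROTOCOLS))


windows_choice = choose_protocols(" Windows ")
unknown_choice = choose_protocols("Plan9")
```

Keeping the mapping in a table rather than in branching code makes it straightforward to add a further operating system without touching the selection logic.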
[Figure 1 blocks: Receive resource shortage request; Find neighboring machines; Gather machine's resource information; Choose the best machine]
The first method responds as long as a resource within the cluster is available. Once all resources become unavailable, new requests are ignored. The second method differs from the first only when all resources within the cluster are unavailable. In this situation it makes use of the reserved resources preconfigured for such a case. The third method, RNS, behaves in the same way for requests that can be answered using in-cluster resources. Requests that cannot be handled using in-cluster resources are sent to the RNS service so that out-of-cluster resources are found for them using the mechanisms already described. One of the main parameters for determining the ability of a cluster system is its response time to requests. We have therefore chosen this parameter to compare our method with notable existing methods.
Figure 1: The relation between different phases involved in the RNS method
The pseudocode of RNS is presented below.
Figure 2 shows the response time as a function of the number of requests generated. As depicted, the resource reservation method and RNS behave the same as the first method in the first interval (up to R1). Beyond that point, the first method ignores any further requests, so its response time goes to infinity as more requests arrive. The resource reservation method, however, uses reserved resources and can handle incoming requests. The RNS method must find resources outside the cluster and add them to it, which takes much longer. Therefore, during the second interval, RNS is slower than the resource reservation method. By the beginning of the third interval, all of the resources of the cluster and the reserved resources are no longer available. In this situation, the first method still ignores requests, and the second method has no available resources and must wait until some resource within the cluster itself or from the set of reserved resources becomes available. Its response time therefore increases as more and more requests are generated in the system. However, since RNS does not limit itself to a
RNS Mechanism
  /* A new request received */
  Analyze_The_Request()
  IF (Results_From_Previous_Searches_Are_Enough) THEN
    /* RNS already has sufficient information and there is no need to search again */
    Compare_Available_Resources_And_Choose_The_Best()
  ELSE
    /* Do discovery phases one by one */
    Scan_Neighborhood()
    Detect_Neighbors_OS()
    Gather_Neighbors_Resource_Info()
    /* choose the best result */
    Compare_Results_And_Choose_The_Best()
  END IF
  /* the selected node is introduced to cluster for further configurations */
  Introduce_The_Selected_Machine_To_Cluster()
  IF (Status==Passive) THEN
    // Daemon goes to standby mode until new request arrives
    Exit_And_Wait_For_New_Requests()
  ELSE IF (Status==ProActive) THEN
    // Daemon returns to its previous state and continues scanning any reachable neighbor
    Exit_And_Search_For_New_External_Resources()
  END IF
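The pseudocode of the RNS mechanism can be rendered as a runnable Python sketch. The control flow follows the pseudocode, but everything else is an assumption made for illustration: the phases are passed in as plain functions, machines are represented as dictionaries, `score` is a hypothetical fitness function that returns `None` for an unsuitable machine, and the stub phases at the bottom return made-up data instead of touching a network.

```python
from enum import Enum


class Mode(Enum):
    PASSIVE = "passive"
    PROACTIVE = "proactive"


def handle_request(request, cache, scan, detect_os, gather_info, score, mode):
    """One pass of the RNS mechanism; returns (selected_machine, next_action)."""
    # Reuse the results of previous searches when they already satisfy the request.
    candidates = [m for m in cache if score(request, m) is not None]
    if not candidates:
        # Otherwise run the discovery phases one by one.
        for address in scan():
            info = gather_info(address, detect_os(address))
            cache.append(info)  # cached so that later decisions are faster
            if score(request, info) is not None:
                candidates.append(info)
    # Choose the best result; None means no suitable machine was found.
    selected = max(candidates, key=lambda m: score(request, m)) if candidates else None
    # Passive daemons wait for the next request; pro-active ones keep scanning.
    next_action = "wait_for_requests" if mode is Mode.PASSIVE else "keep_scanning"
    return selected, next_action


# Hypothetical stub phases used only to exercise the control flow.
def scan():
    return ["10.0.0.5", "10.0.0.9"]


def detect_os(address):
    return "Linux"


def gather_info(address, os_name):
    cores = {"10.0.0.5": 4, "10.0.0.9": 8}[address]
    return {"address": address, "os": os_name, "cores": cores}


def score(request, machine):
    # None means the machine does not satisfy the request; higher is better.
    return machine["cores"] if machine["cores"] >= request["cores"] else None


cache = []
selected, action = handle_request(
    {"cores": 6}, cache, scan, detect_os, gather_info, score, Mode.PASSIVE
)
```

Introducing the selected machine to the cluster is left to the caller, in line with the division of responsibilities in which RNS only finds and selects candidate machines.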
5. Evaluation
preconfigured set of resources, it never faces the problem of not having enough resources to handle requests. Although RNS shows lower performance during the second interval, it can compensate for this in the third interval and yield high performance overall.
longer responds and therefore there are no free resources. In this situation, the second method makes use of a set of reserved resources. The oscillations in the curve of this method reflect the fact that the number of resources joining and leaving the cluster changes dynamically. Furthermore, once the number of requests exceeds the capacity of the resources of the cluster and the reserved set, no resources are available to this method any longer. RNS is not limited to a preconfigured set of resources; therefore, after all resources within the cluster become busy, it adds new resources from outside the cluster. Its curve thus keeps oscillating and it never stops responding to requests.
6. Conclusion In this paper we showed that our proposed remote cluster node selection method, called RNS, responds well to requests when the cluster runs out of resources. We also showed that it can prevent failures in serving requests and prevent resources from being overloaded. We showed that RNS allows external resources to be used in the cluster system without any pre-reservation and with a short response time. Compared to other resource discovery methods, RNS is more dynamic, can interact with different operating systems, and does not need any agents on the remote machines with which it communicates. These attributes allow the use of all out-of-cluster resources while keeping their utilization high. Moreover, since no agent is required on remote machines and RNS can interact with different operating systems, it scales well.
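The qualitative behaviour described for the three intervals can be captured in a toy model. This is only a sketch of the shape of the curves, not a reproduction of Figure 2: the function `response_time`, the method labels, and all numeric costs (`base`, `join_cost`, `wait_per_request`) are hypothetical, `r1` stands for the number of requests the cluster alone can serve, and `r2` for that number plus the reserved resources.

```python
import math


def response_time(method, n, r1, r2, base=1.0, join_cost=5.0, wait_per_request=2.0):
    """Toy response time for the n-th outstanding request (n >= 1).

    method: "ignore" (first method), "reserve" (second method) or "rns".
    All numeric costs are hypothetical and only shape the curves.
    """
    if n <= r1:
        return base  # first interval: all three methods behave the same
    if method == "ignore":
        return math.inf  # requests beyond capacity are never served
    if method == "reserve":
        if n <= r2:
            return base  # second interval: a reserved resource is handed out
        # third interval: wait until a cluster or reserved resource frees up
        return base + wait_per_request * (n - r2)
    if method == "rns":
        # discover and join an out-of-cluster machine, whatever the interval
        return base + join_cost
    raise ValueError(f"unknown method: {method}")


second = {m: response_time(m, 12, r1=10, r2=15) for m in ("ignore", "reserve", "rns")}
third = {m: response_time(m, 20, r1=10, r2=15) for m in ("ignore", "reserve", "rns")}
```

With these made-up constants, RNS is slower than reservation in the second interval and faster in the third, which is the crossover the comparison relies on.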
Figure 2: Comparison between the three methods in response time.
Figure 3: Comparison between the three methods in the use of free resources
At first, all of the resources within the cluster are free. As more requests are generated in the system, the number of available resources decreases. The three methods show the same behavior until there is no resource available inside the cluster system. After all the internal resources become busy, the first method no
References [1] E. Pournaras, G. Exarchakos, and N. Antonopoulos, "Load-driven neighbourhood reconfiguration of Gnutella overlay," Computer Communications, vol. 31, no. 13, Aug. 2008.
[2] D. Wischik, M. Handley, and M. B. Braun, "The resource pooling principle," ACM SIGCOMM Computer Communication Review, vol. 38, no. 5, Oct. 2008.
[10] I. T. Foster, "Globus Toolkit Version 4: Software for Service-Oriented Systems," J. Comput. Sci. Technol., pp. 513-520, 2006.
[3] D. Oppenheimer, J. Albrecht, D. Patterson, and A. Vahdat, "Design and implementation trade-offs for wide-area resource discovery," ACM Transactions on Internet Technology, vol. 8, no. 4, pp. 113-124, Sep. 2008.
[11] A. Barak. (2011, May) MOSIX: Cluster and multicluster management. [Online]. http://knol.google.com/k/amnon-barak/mosix/qibu8ltfp5fh/5
[12] T. Narten, E. Nordmark, W. Simpson, and H. Soliman, "Neighbor Discovery for IP version 6 (IPv6)," Network Working Group RFC 4861, 2007.
[4] M. L. Massie, B. N. Chun, and D. E. Culler, "The Ganglia Distributed Monitoring System: Design, Implementation And Experience," Parallel Computing, vol. 30, no. 7, pp. 817-840, Jul. 2004.
[5] R. Raman, M. Livny, and M. Solomon, "Matchmaking: Distributed Resource Management for High Throughput Computing," in Proceedings of HPDC, pp. 140-140, 1998.
[13] C. Mastroianni, D. Talia, and O. Verta, "A P2P Approach for Membership Management and Resource Discovery in Grids," in Proceedings of the International Conference on Information Technology: Coding and Computing , Washington, DC, USA, 2005.
[6] D. Oppenheimer, J. Albrecht, D. Patterson, and A. Vahdat, "Scalable Wide-Area Resource Discovery," EECS Department, University of California, Berkeley, California, United States of America, Technical Report UCB/CSD-041334, 2004.
[14] P. Karwaczyński, D. Konieczny, J. Moçnik, and M. Novak, "Dual proximity neighbour selection method for peer-to-peer-based discovery service," in Proceedings of the 2007 ACM symposium on Applied computing, New York, NY, USA, 2007.
[7] C. Mastroianni, D. Talia, and O. Verta, “A super-peer model for resource discovery services in large-scale grids”, Future Generation Computer Systems, pp.1235–1248, 2005.
[15] M. Sharifi, S. L. Mirtaheri, E. Mousavi Khaneghah, A Dynamic Framework for Integrated Management of All Types of Resources in P2P Systems, The Journal of Supercomputing, 52(2), 149–170, 2010.
[8] S. Ding, J. Yuan, J. Ju, and L. Hu, "A heuristic algorithm for agent-based grid resource discovery," Intl. Conf. on e-Technology, e-Commerce and e-Service, Hong Kong, pp. 222-225, 2005.
[16] T. Bhe, K. Inayama, C. Lister, M. Parlione, and M. Vesich, IBM Tivoli Monitoring Version 5.1.1: Creating Resource Models and Providers. Riverton, NJ, USA: IBM Corp, 2003.
[9] D. Talia, P. Trunfio, and J. Zeng, “Peer-to-peer models for resource discovery in large-scale grids: a scalable architecture”, High Performance Computing for Computational Science – VECPAR 2006, pp.66–78, 2007.
[17] H. Team, "Going beyond simple monitoring with HP SiteScope," Hewlett-Packard Company White Paper, 2010.
[18] R. Olups, Zabbix 1.8 Network Monitoring. USA: Packt Publishing, 2010.
[19] W. Barth, Nagios: System and Network Monitoring. San Francisco, CA, USA: No Starch Press, 2008.