SciInterface: A Web-Based Job Submission Mechanism for Scientific Cloud Computing
Vineeth Simon Arackal1, Aman Arora2, Deepanshu Saxena3, Arunachalam B4, Prahlada Rao B B5
Centre for Development of Advanced Computing, C-DAC Knowledge Park, No.1 Old Madras Road, Byappanahalli, Bangalore-560038, INDIA
(1vineeth, 2amana, 3deepanshus, 4barunachalam, 5prahladab)@cdac.in

Abstract—Scientific Cloud Computing has emerged rapidly in the last few years. Targeting high performance computing communities, it has caused a shift of scientific applications from conventional physical infrastructure to more scalable and flexible virtual infrastructure managed by cloud providers. Scientific Cloud Computing brings many advantages (pay-per-use, resources on demand, high availability, flexibility, scalability) over traditional computing models. Traditional models require in-house infrastructure (servers, storage etc.) and the skill to manage it in order to run scientific applications, whereas Scientific Cloud Computing allows the user to focus on the applications without worrying about the underlying infrastructure and installations. A proper access mechanism is essential for Scientific Clouds, both to improve the adoption of such facilities by scientific communities and to enhance their usage. In this paper, the authors present SciInterface, a web-based job submission mechanism for Scientific Cloud Computing. SciInterface provides a user-friendly graphical interface for interacting with the Scientific Cloud and a facility to submit scientific jobs to the cloud for processing. Its prominent features are user privilege mapping, job submission, job monitoring and output logs. The authors discuss the architecture and features of SciInterface, along with the challenges faced during its development.

Keywords—SciInterface, Scientific Cloud, IaaS, Virtual Cluster, Cloud Computing
I. INTRODUCTION
Cloud Computing [1] aims to provide computer software and hardware to users on a pay-per-use basis. This can improve the utilization of resources [2] by users who do not have direct access to expensive hardware and software. This is even more important for the scientific community, which requires supercomputing facilities to solve its problems. The ability to pay only for actual usage enables more users to access supercomputing [3] facilities through a cloud computing interface. The Centre for Development of Advanced Computing (C-DAC) [4] has developed a scientific cloud facility in Bangalore, India, with the aim of making supercomputing
facilities available to more users. Scientific Cloud [5] is an Infrastructure as a Service (IaaS) [6] facility which provides HPC virtual clusters [7] as well as virtual machines on demand. In most cloud solutions, the infrastructure acquired by end users does not come with any graphical interface, which hampers its usage. To overcome this problem, C-DAC has developed a graphical user interface for interacting with the virtual machines or virtual clusters provisioned through the Scientific Cloud. This graphical user interface is web-based; it eases user interaction and provides a mechanism to submit jobs to the acquired infrastructure. The terms SciInterface and Job Submission Access Mechanism for Scientific Cloud Computing (JSAMSCC) are used interchangeably in this paper.
II. RELATED WORKS
Cloud computing empowers IT through new on-demand service models and new levels of IT efficiency. Cloud computing provides different kinds of services; under SaaS, many web-based portals have been developed to fulfil user requirements, such as a data service portal for application integration [8] and cloud service portals for mobile devices. Web-based environments for submitting jobs to the cloud are very few compared to those available for grid environments. Many organizations provide cloud computing services for job submission based on user or organization needs. Oracle Corporation has the Sun Cloud on-demand cloud computing service [9]. Sun Cloud contains resources which may be either user data or executables, and it provides all these resources while the job is being submitted. Red Cloud is a subscription-based cloud computing service developed by CAC (Cornell University Center for Advanced Computing); it has two offerings, the basic Red Cloud and Red Cloud with MATLAB [10]. Red Cloud with MATLAB can be installed manually or automatically by users to execute their MATLAB jobs. Moab is another web application which provides a job submission facility and is used to control workloads in a cloud environment; end users can submit jobs through a browser, and the jobs are handled by the Moab Workload Manager and its associated resource manager [11, 12].
III. C-DAC SCIENTIFIC CLOUD
Scientific Cloud is a cloud solution developed with the aim of bringing cloud benefits to the scientific community. It provides Infrastructure and Storage as a Service, and the infrastructure it provides is well suited to scientific applications. Scientific applications can be broadly categorized as compute-intensive applications, which require a large number of cores and spend most of their time in computation, and data-intensive applications, which deal with huge amounts of data and spend most of their time in I/O and data manipulation. Scientific Cloud therefore provides two types of infrastructure: MPI-enabled virtual clusters to cater to the needs of compute-intensive applications, and Hadoop-based [13] virtual clusters for data-intensive applications. The virtual clusters come in varying configurations to meet the specific needs of different users, categorized as small, medium and large: small clusters have 1 GB RAM and 1 VCPU, medium clusters have 2 GB RAM and 2 VCPUs, and large clusters have 4 GB RAM and 4 VCPUs. The number of machines (nodes) required in a cluster may vary with the application, so Scientific Cloud provides a facility to select the number of nodes per cluster, with a maximum of 8 nodes per cluster in the current version. As users need to interact with the virtual cluster, we have developed a web-based user interface, SciInterface, to access the virtual clusters instantiated through Scientific Cloud.
SciInterface comes hosted in every virtual cluster in the form of a Web Archive (WAR) file, and it makes use of all the nodes in that cluster. SciInterface internally performs a number of configuration changes, such as updating Network Information Service details on all the nodes of the cluster whenever a new user is added to the virtual cluster. It also takes care of enabling the MPI environment for every user and manages the Network File System (NFS) configuration of the cluster. Monitoring the virtual clusters acquired by the end user is an important task: monitoring allows the user to check the performance of the infrastructure under different load conditions, through parameters such as CPU idle time, free disk space, number of running processes and free cache. In Scientific Cloud, users can monitor cluster performance through a web-based interface which displays all the performance parameters as graphs.
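The user-addition and NIS reconfiguration described above are not detailed in the paper; as a rough sketch (an assumption, using standard Linux/NIS commands and an illustrative class name), the path run on the head node when a new user is added might look like:

    import java.io.IOException;

    // Illustrative helper: creates a Unix account on the head node and rebuilds the
    // NIS maps so the new user becomes visible on every node of the virtual cluster.
    public class ClusterUserSetup {
        public static void addUser(String userName) throws IOException, InterruptedException {
            run("useradd", "-m", userName);   // create the account and its home directory
            run("make", "-C", "/var/yp");     // push the updated passwd map to the NIS clients
        }

        private static void run(String... cmd) throws IOException, InterruptedException {
            Process p = new ProcessBuilder(cmd).inheritIO().start();
            if (p.waitFor() != 0) {
                throw new IOException("command failed: " + String.join(" ", cmd));
            }
        }
    }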
IV. SCIINTERFACE ARCHITECTURE
The architecture of SciInterface is divided into two parts, the System Level and the User Level. The System Level describes the architecture in terms of the server-side implementation, focusing on the service provider's perspective: the setups, configurations and other practices followed while developing SciInterface. The System Level architecture may also serve as a guide for re-establishing such a facility in other cloud environments. The User Level describes the architecture from the usability perspective, focusing mainly on the user's interaction with the tool. The HPC [14] virtual clusters/machines obtained from Scientific Cloud can be of varying configurations, such as small (1 GB RAM with 1 VCPU), medium (2 GB RAM with 2 VCPUs) and large (4 GB RAM with 4 VCPUs).
Fig. 1. C-DAC Scientific Cloud Architecture
Fig. 2. SciInterface Architecture
A virtual cluster is an interconnected set of virtual machines performing a common task. All the machines in a cluster are termed nodes; the node interacting with the user is the head node, which manages the remaining nodes, termed slave nodes. This approach enhances the performance of parallel applications in terms of execution time.
During boot time the virtual cluster deploys SciInterface along with the other libraries and dependencies needed to run the tool. The virtual cluster's head node hosts the tool using Tomcat [15] as the web server/container. The head node as well as the worker/slave nodes have the library support needed to compile and run parallel jobs; in this sense, the infrastructure provided in the form of a cluster also houses the platform to run parallel jobs. The cluster has a resource scheduler in the form of Torque [16], which spawns the parallel jobs submitted to it (via PBS [17] scripts) across the different worker/slave nodes. The jobs submitted by the user are stored in the user's home directory, and a PBS script is generated on the fly based on the type of job and the other particulars selected among the various options provided in the tool. This PBS script is then submitted to Torque at the virtual cluster's head node; based on the PBS script, Torque provisions the number of processors and other resources needed by the job. To persist user data across system or Tomcat restarts, we use a file as a database, storing the user and job details as serialized objects which are backed up every time the user logs out or closes the application.

The User Level architecture focuses on the user's interaction with SciInterface and describes the full flow of actions a user can perform with the tool, shown in Figure 3. The user logs into the virtual cluster head node through any browser using SciInterface, and is granted privileges as per his or her account on that virtual cluster. For example, a root user can run any task, since root has all permissions; a root user can therefore add other non-privileged users who can then use the virtual cluster and run jobs on it. The login mechanism is entirely UNIX-authentication based, so only users who have an account on the cluster can log in through the portal. Adding a user here means creating a user in the cluster with the same name and dynamically configuring the cluster for Network Information Services (NIS) [18], because once a new user is added, his or her information must be distributed across each of the nodes; MPI [19] is also configured for that particular user before the user logs in and starts submitting jobs. Since the jobs submitted to the cluster via SciInterface can take time to execute, a unique id is allotted to every job after submission for the user's reference. This id can be used to monitor the status of the job at any point of time. The job details provided include the status of the job (whether it is running or over), the time taken to complete it, any errors, and the output. The job details and status are gathered by querying Torque at the head node, which internally contacts the different worker/slave nodes and presents the required information to the user. The output as well as any errors can be viewed in the browser itself, and corresponding changes can be made to run the job properly. The output/error files can also be downloaded to the local machine in text format.
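The exact script template used by the tool is not given in the paper; as a hedged sketch (the paths, resource line and mpirun invocation are assumptions for illustration), the on-the-fly PBS generation and submission to Torque could look like:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    // Illustrative helper: writes a minimal PBS script into the user's home
    // directory and hands it to Torque with qsub, returning the job id.
    public class PbsSubmitter {
        public static String submit(String user, String executable, int nodes, int ppn)
                throws IOException, InterruptedException {
            String script =
                "#!/bin/bash\n" +
                "#PBS -N sciinterface_job\n" +
                "#PBS -l nodes=" + nodes + ":ppn=" + ppn + "\n" +
                "#PBS -j oe\n" +                                   // merge stdout and stderr into one log
                "cd $PBS_O_WORKDIR\n" +
                "mpirun -np " + (nodes * ppn) + " ./" + executable + "\n";

            Path scriptPath = Paths.get("/home", user, "job.pbs"); // assumed location
            Files.write(scriptPath, script.getBytes());

            Process p = new ProcessBuilder("qsub", scriptPath.toString()).start();
            try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
                String jobId = r.readLine();                       // qsub prints the job id, e.g. "12.headnode"
                p.waitFor();
                return jobId;
            }
        }
    }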
Fig. 3. SciInterface Flow-chart
V. FEATURES OF SCIINTERFACE
The following are the key features of SciInterface:
a) Authentication and user privileges.
b) Job submission through the Torque job scheduler.
c) Upload of application-specific files.
d) Job monitoring.
e) Viewing or downloading logs.
f) Unprivileged (non-root) user access.

A. Authentication and User Privileges
Since SciInterface is an interface to a virtual cluster, a user needs to have an account on the machine in order to access it; logging into the tool means logging into the machine. SSH authentication is used to check whether a user exists on that particular machine, and the corresponding privileges are assigned to the user (an illustrative sketch of such a check is given after feature B below).

B. Job Submission through Torque Job Scheduler
SciInterface can be used to submit sequential and homogeneous parallel jobs through the Torque scheduler to the Scientific Cloud, as shown in Figure 4. Most homogeneous parallel jobs are MPI (Message Passing Interface) based applications. Torque maintains the job in a queue and schedules it; the job runs on the compute nodes and the output/error data is stored in the user's home directory. Applications from the domains of bioinformatics and climate modeling, as well as the NAS benchmarks, are some examples of jobs that were successfully executed through this interface.
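The library used for the SSH-based account check in feature A is not named in the paper; one possible, purely illustrative implementation uses the JSch SSH library (class name, host parameter and timeout are assumptions):

    import com.jcraft.jsch.JSch;
    import com.jcraft.jsch.JSchException;
    import com.jcraft.jsch.Session;

    // Illustrative helper: returns true only if an SSH login with the given
    // credentials succeeds on the cluster head node, i.e. the account exists.
    public class SshLoginCheck {
        public static boolean canLogin(String user, String password, String headNode) {
            try {
                JSch jsch = new JSch();
                Session session = jsch.getSession(user, headNode, 22);
                session.setPassword(password);
                session.setConfig("StrictHostKeyChecking", "no");
                session.connect(5000);        // fails if the account or password is invalid
                session.disconnect();
                return true;
            } catch (JSchException e) {
                return false;                 // no such user, wrong password, or host unreachable
            }
        }
    }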
C. Upload Application-Specific Files
Multiple files may be needed at run time depending on the job type, such as standard input files and other parameter files. To move such files from the user's terminal to the virtual cluster, an upload facility has been provided in SciInterface for specific file types such as standard input files and executables. For example, if a C program is uploaded for execution and it needs to read a file at run time, that file can be uploaded along with the C program.

D. Job Monitoring
SciInterface provides status information for submitted jobs, obtained from the Torque scheduler. After submitting a job through the cloud interface, the user can monitor its status at any point of time using this option: the user selects the unique job id assigned to each job and gets its status, as shown in Figure 5. The status can be running, queued, completed or failed. The user can also retrieve the detailed logs generated by the job, which may help in debugging issues, if any (a sketch of how such a status query might be issued to Torque is given after this feature list).

E. View or Download Logs
This feature allows the user to view the output/error files generated after completion of a job, and to download these files to the local desktop.

F. Unprivileged (Non-root) User Access
SciInterface allows the owner/admin of a virtual cluster to create new users in the cluster through the web interface. Based on the requirements, the admin can create any number of new users and let them submit jobs, optimizing the usage of cloud resources in the Scientific Cloud. Adding a user through SciInterface differs from adding a user manually on the Linux machine: only a user added via the tool can submit jobs, because the required configurations and environment settings are performed by the tool at run time.
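The exact query the tool issues is not shown in the paper; a minimal sketch of how a job's status could be read back from Torque (assuming the standard qstat client is available on the head node; the class name is illustrative):

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;

    // Illustrative helper: returns the raw "qstat -f <jobId>" output, whose
    // job_state field indicates Q (queued), R (running) or C (completed).
    public class TorqueStatus {
        public static String query(String jobId) throws IOException, InterruptedException {
            Process p = new ProcessBuilder("qstat", "-f", jobId).redirectErrorStream(true).start();
            StringBuilder out = new StringBuilder();
            try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
                String line;
                while ((line = r.readLine()) != null) {
                    out.append(line).append('\n');
                }
            }
            p.waitFor();
            return out.toString();
        }
    }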
Fig. 4. Job Submission Page (After Login)
Fig. 5. Job Information Page
VI. PERFORMANCE ANALYSIS OF SCIENTIFIC APPLICATIONS IN THE SCIENTIFIC CLOUD ENVIRONMENT
There is a need to analyze the performance of scientific applications in a cloud environment under varying setups and configurations. To study how scientific applications behave in a conventional cloud environment versus the Scientific Cloud environment, an experimental setup consisting of a virtual machine and a set of virtual clusters was created. The standard MPI Jacobi application, which solves a system of linear equations using the Jacobi method [20], was used for the performance analysis. The study compares the results on a single virtual machine (named VM) with 1 GB RAM and 1 VCPU against clusters of 2, 4 and 8 virtual machines of equivalent configuration. The single virtual machine corresponds to a conventional cloud environment, whereas the Scientific Cloud provides virtual clusters that can scale up and down. Figure 6 depicts the execution times for both the single virtual machine and the Scientific Cloud provisioned virtual clusters as a graph, and Table I shows the comparison results in tabular form.
It is clear from the graph that scientific applications perform better as the number of computing resources scales up: the execution time roughly halves each time the number of computing nodes in the cloud setup is doubled.

VII. CHALLENGES FACED WHILE DEVELOPING SCIINTERFACE
A. Cross-Browser Compatibilities
Users may use different browsers, and each browser varies in its implementation of certain functionalities and of the standards set by the World Wide Web Consortium [23]. As our tool is web-based and has to be accessed via a browser, developing code which works smoothly and equally well on all browsers was a challenge. To overcome this, we used practices such as avoiding certain tags which cause problems across browsers and writing conditional checks in JavaScript to run different versions of some code on different browsers, e.g. (using user-agent sniffing as one way of detecting the browser):

    var ua = navigator.userAgent;
    if (ua.indexOf("MSIE") !== -1 || ua.indexOf("Trident") !== -1) {
        // run Internet Explorer specific code
    } else if (ua.indexOf("Chrome") !== -1) {
        // run Google Chrome specific code
    } else {
        // code specific to other browsers
    }

This way we overcame most of the incompatibilities.
Fig. 6. Performance Analysis Graph

TABLE I. PERFORMANCE ANALYSIS TABLE

S.No.   Computing Nodes   Execution time (seconds)
1       1                 585
2       2                 309
3       4                 172
4       8                 91
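For reference, the speedup implied by Table I relative to the single virtual machine, S(n) = T(1)/T(n), works out to:

    S(2) = 585/309 ≈ 1.9,   S(4) = 585/172 ≈ 3.4,   S(8) = 585/91 ≈ 6.4

i.e. the execution time roughly halves each time the number of computing nodes is doubled.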
B. Size of SciInterface
Since SciInterface has to be packed inside every virtual cluster provided to the user, we wanted to keep the size of the tool minimal. We avoided heavy Java frameworks such as Hibernate [24] and Struts [25], and the classpath was configured with the fewest libraries possible. Since every virtual cluster has its own copy of SciInterface, the data that SciInterface needs to persist is not huge: it includes only the job details and user details of the set of users using that particular virtual cluster. To persist data such as job details and user details across system restarts, some form of database was needed, but the trade-off between the small amount of information to store and the size of a database itself was a problem. To avoid a database taking up space on the virtual cluster, we used Java serialization [26] to deflate the user objects and store them in a file instead. We created an Application Context as a Singleton [27] object common to the whole application; this object works as the database and holds all the information, including job details and user details. The object is loaded every time a user logs in, and as soon as the user logs out or closes the application, the Application Context object is deflated to the file system.
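A minimal sketch of this persistence scheme (the actual class layout and file location are not given in the paper and are assumed here):

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;
    import java.util.HashMap;
    import java.util.Map;

    // Illustrative singleton standing in for the Application Context described above.
    public class ApplicationContext implements Serializable {
        private static final long serialVersionUID = 1L;
        private static final File STORE = new File("context.ser");   // assumed file name
        private static ApplicationContext instance;

        private final Map<String, String> userDetails = new HashMap<>();
        private final Map<String, String> jobDetails = new HashMap<>();

        private ApplicationContext() { }

        // Loaded when a user logs in.
        public static synchronized ApplicationContext getInstance() {
            if (instance == null) {
                instance = load();
            }
            return instance;
        }

        // Deflated to the file system when the user logs out or closes the application.
        public synchronized void save() throws IOException {
            try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(STORE))) {
                out.writeObject(this);
            }
        }

        private static ApplicationContext load() {
            if (STORE.exists()) {
                try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(STORE))) {
                    return (ApplicationContext) in.readObject();
                } catch (IOException | ClassNotFoundException e) {
                    // fall through and start with an empty context
                }
            }
            return new ApplicationContext();
        }
    }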
C. Providing Public Access to the Private Virtual Clusters
The virtual clusters provisioned by Scientific Cloud actually run on servers internal to C-DAC, and every virtual cluster is assigned a private IP (Internet Protocol) address after booting up. Since these addresses are private, the clusters cannot be accessed from outside. To allow users to access their virtual clusters, the clusters would need public IPs; however, as the Scientific Cloud model suggests, they are mostly used for running scientific applications by a group of researchers and need not be publicly available to the whole world as a web application. To overcome this problem, we have set up a public gateway machine, as shown in Figure 2, and use port forwarding to transfer traffic and allow access to the different virtual clusters: free ports on the gateway are mapped to the ports of the individual clusters where each SciInterface instance runs. Users are provided with the gateway IP and a port, through which they can access their virtual cluster.
For example, port 10000 of the gateway (with IP w.x.y.z) may be mapped to port 80 of a virtual cluster; the user can then access SciInterface using w.x.y.z:10000 as the URL.

VIII. CONCLUSIONS AND FUTURE WORK
We have discussed a web-based job submission mechanism for scientific clouds in this paper. We have shown that, with its support for virtual machines and virtual clusters, the tool can be extremely effective in enhancing the adoption of scientific clouds among the high performance computing community. We plan to increase the utility of this tool by supporting compilation of jobs and by providing support for workflows such as NGS (Next Generation Sequencing) [28]. A job compilation facility would give the user a web-based graphical interface for compiling jobs, while workflow management support would extend SciInterface to applications with complex dependencies.
IX. REFERENCES
[1] "The NIST Definition of Cloud Computing", National Institute of Standards and Technology, http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
[2] Young Choon Lee and Albert Y. Zomaya, "Utilization of Resources", http://link.springer.com/article/10.1007%2Fs11227-010-0421-3
[3] A. D. Malony, "Supercomputing around the world", in Proceedings of Supercomputing '92, Minneapolis, 1992, pp. 126-129.
[4] Centre for Development of Advanced Computing (C-DAC), India, http://cdac.in/
[5] Payal Saluja, Prahlada Rao, Ankit Mittal, and Rameez Ahmad, "CDAC Scientific Cloud: On Demand Provisioning of Resources for Scientific Applications", in PDPTA'12: The 18th International Conference on Parallel and Distributed Processing Techniques and Applications, Las Vegas, Nevada, USA, 16-19 July 2012.
[6] Pei Fan, Zhenbang Chen, Ji Wang, and Zibin Zheng, "Online Optimization of VM Deployment in IaaS Cloud", in IEEE 18th International Conference on Parallel and Distributed Systems, Singapore, 2012.
[7] Ralph Butler, Zach Lowry, and Chrisila C. Pettey, "Virtual Clusters", in Systems Engineering (ICSEng 2005), 18th International Conference, 2005.
[8] Jeaha Yang, Rangachari Anand, Stacy Hobson, Juhnyoung Lee, Yuan Wang, and Jing Min Xu, "Data Service Portal for Application Integration in Cloud Computing", in Emerging Technologies for a Smarter World (CEWIT), 8th International Conference & Expo, New York, 2011.
[9] "Sun Cloud", http://en.wikipedia.org/wiki/Sun_Cloud
[10] "Red Cloud with MATLAB", http://www.cac.cornell.edu/wiki/index.php?title=Red_Cloud_with_MATLAB&oldid=1894
[11] "Delivering and Managing HPC Cloud with Moab", Adaptive Computing, http://www.adaptivecomputing.com/products/hpc-products/moab-hpc-suite-enterprise-edition/hpc-cloud-overview/
[12] MOAB Workload Manager, http://www.hpc.fsu.edu/index.php?option=com_content&view=article&id=67
[13] Guanghui Xu, Feng Xu, and Hongxu Ma, "Deploying and Researching Hadoop in Virtual Machines", in Automation and Logistics (ICAL), IEEE International Conference, 2012.
[14] Daniel Chavarría-Miranda and Zhenyu Huang, "High-Performance Computing (HPC): Application & Use in the Power Grid", in Power and Energy Society General Meeting, San Diego, 2012.
[15] Apache Tomcat, http://tomcat.apache.org/
[16] TORQUE Resource Manager, Adaptive Computing, http://www.adaptivecomputing.com/products/open-source/torque
[17] Portable Batch System, http://en.wikipedia.org/wiki/Portable_Batch_System
[18] Network Information Service, http://en.wikipedia.org/wiki/Network_Information_Service
[19] Message Passing Interface, http://en.wikipedia.org/wiki/Message_Passing_Interface
[20] A simple Jacobi iteration, http://www.mcs.anl.gov/research/projects/mpi/tutorial/mpiexmpl/src/jacobi/C/main.html
[21] Vineeth Simon Arackal, Arunachalam B, Bijoy M B, Prahlada Rao B B, Kalasagar B, Sridharan R, and Subrata Chattopadhyay, "An Access Mechanism for Grid GARUDA", in Internet Multimedia Services Architecture and Applications (IMSAA), IEEE International Conference, Bangalore, 2009.
[22] GridWay, http://en.wikipedia.org/wiki/GridWay
[23] World Wide Web Consortium, http://www.w3.org/
[24] Hibernate, Relational Persistence for Java and .NET, http://www.hibernate.org/
[25] Struts Web Framework, http://struts.apache.org/
[26] Serialization in Java, http://docs.oracle.com/javase/6/docs/api/java/io/Serializable.html
[27] Head First Design Patterns, O'Reilly Media, pp. 169-180.
[28] Jorge S. Reis-Filho, "Next-Generation Sequencing", http://www.biomedcentral.com/content/pdf/bcr2431.pdf