CS6712 GRID AND CLOUD COMPUTING LABORATORY

L T P C 0 0 3 2

OBJECTIVES:
The student should be made to:
- Be exposed to tool kits for grid and cloud environment.
- Be familiar with developing web services/applications in a grid framework.
- Learn to run virtual machines of different configuration.
- Learn to use Hadoop.

LIST OF EXPERIMENTS:

GRID COMPUTING LAB
Use Globus Toolkit or equivalent and do the following:
1. Develop a new Web Service for Calculator.
2. Develop new OGSA-compliant Web Service.
3. Using Apache Axis develop a Grid Service.
4. Develop applications using Java or C/C++ Grid APIs.
5. Develop secured applications using basic security mechanisms available in Globus Toolkit.
6. Develop a Grid portal, where user can submit a job and get the result. Implement it with and without GRAM concept.

CLOUD COMPUTING LAB
Use Eucalyptus or Open Nebula or equivalent to set up the cloud and demonstrate:
1. Find procedure to run the virtual machine of different configuration. Check how many virtual machines can be utilized at particular time.
2. Find procedure to attach virtual block to the virtual machine and check whether it holds the data even after the release of the virtual machine.
3. Install a C compiler in the virtual machine and execute a sample program.
4. Show the virtual machine migration based on a certain condition from one node to the other.
5. Find procedure to install storage controller and interact with it.
6. Find procedure to set up the one node Hadoop cluster.
7. Mount the one node Hadoop cluster using FUSE.
8. Write a program to use the API's of Hadoop to interact with it.
9. Write a word count program to demonstrate the use of Map and Reduce tasks.

This code sample shows how to use the login and logout methods in a servlet (part of the basic security experiment):

@WebServlet(name="TutorialServlet", urlPatterns={"/TutorialServlet"})

public class TutorialServlet extends HttpServlet {
    @EJB
    private ConverterBean converterBean;

    /**
     * Processes requests for both HTTP GET and POST methods.
     * @param request servlet request
     * @param response servlet response
     * @throws ServletException if a servlet-specific error occurs
     * @throws IOException if an I/O error occurs
     */
    protected void processRequest(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        response.setContentType("text/html;charset=UTF-8");
        PrintWriter out = response.getWriter();
        try {
            out.println("<html>");
            out.println("<head>");
            out.println("<title>Servlet TutorialServlet</title>");
            out.println("</head>");
            out.println("<body>");
            request.login("TutorialUser", "TutorialUser");
            BigDecimal result = converterBean.dollarToYen(new BigDecimal("1.0"));
            out.println("<h1>Servlet TutorialServlet result of dollarToYen= " + result + "</h1>");
            out.println("</body>");
            out.println("</html>");
        } catch (Exception e) {
            throw new ServletException(e);
        } finally {
            request.logout();
            out.close();
        }
    }
}

This code sample shows how to use the authenticate method:

package com.sam.test;

import java.io.*;

import javax.servlet.*;
import javax.servlet.http.*;

public class TestServlet extends HttpServlet {
    protected void processRequest(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        response.setContentType("text/html;charset=UTF-8");
        PrintWriter out = response.getWriter();
        try {
            request.authenticate(response);
            out.println("Authenticate Successful");
        } finally {
            out.close();
        }
    }
}

Writing GSI enabled Web Services

Installing the software

Install Tomcat

Install Tomcat according to the instructions and check that it works.

Deploy Axis on Tomcat

Install Axis according to the instructions and check that it works.

Note that a bug in Tomcat means that any jars containing java.* or javax.* classes will not be executed if they are in the webapps/ tree. Instead, copy the jars to Tomcat's common/lib directory. In Axis alpha 3 this applies to axis.jar and wsdl4j.jar; in Axis beta 1 this applies to jaxrpc.jar and wsdl4j.jar.

Install libraries to provide GSI support for Tomcat

Copy cog.jar, cryptix.jar, iaik_ssl.jar, iaik_jce_full.jar, iaik_javax_crypto.jar to Tomcat's common/lib directory. Check that log4j-core.jar and xerces.jar (or other XML parser) are in Tomcat's common/lib directory. Copy gsicatalina.jar to Tomcat's server/lib directory.

Deploy GSI support in Tomcat

Edit Tomcat's conf/server.xml and add a GSI Connector in the <Service> section. If you are testing under a user account, make sure that the proxy or certificates and keys are readable by Tomcat. For testing purposes you can use user proxies or certificates instead of host certificates. If you do test using user proxies, make sure the proxy has not expired!

Add a GSI Valve in the <Engine> section.

Install libraries to provide GSI support for Axis

Copy gsiaxis.jar to the WEB-INF/lib directory of your Axis installation under Tomcat.

Set your CLASSPATH correctly

You should ensure that the following jars from the axis/lib directory are in your classpath:
- axis.jar
- clutil.jar
- commons-logging.jar
- jaxrpc.jar
- log4j-core.jar
- tt-bytecode.jar
- wsdl4j.jar

You should also have these jars in your classpath:
- gsiaxis.jar
- cog.jar
- xerces.jar (or other XML parser)
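On a Unix host, one way to do this is a pair of exports; the directory paths below are placeholders for your actual installation:

export AXIS_LIB=/usr/local/axis/lib   # placeholder: your Axis lib directory
export GSI_LIB=/usr/local/gsi/lib     # placeholder: directory holding gsiaxis.jar, cog.jar, xerces.jar
export CLASSPATH=$AXIS_LIB/axis.jar:$AXIS_LIB/clutil.jar:$AXIS_LIB/commons-logging.jar:$AXIS_LIB/jaxrpc.jar:$AXIS_LIB/log4j-core.jar:$AXIS_LIB/tt-bytecode.jar:$AXIS_LIB/wsdl4j.jar:$GSI_LIB/gsiaxis.jar:$GSI_LIB/cog.jar:$GSI_LIB/xerces.jar:$CLASSPATH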

Start the GSI enabled Tomcat/Axis server

Start up Tomcat as normal

Check the logs in Tomcat's logs/ directory to ensure the server started correctly. In particular check that:

- apache_log.YYYY-MM-DD.txt does not contain any GSI related error messages
- catalina.out contains messages saying "Welcome to the IAIK ... Library"
- catalina_log.YYYY-MM-DD.txt contains messages saying "HttpConnector[8443] Starting background thread" and "HttpProcessor[8443][N] Starting background thread"
- localhost_log.YYYY-MM-DD.txt contains a message saying "WebappLoader[/axis]: Deploy JAR /WEB-INF/lib/gsiaxis.jar"

Writing a GSI enabled Web Service

Implementing the service

The extensions made to Tomcat allow us to receive credentials through a transport-level security mechanism. Tomcat exposes these credentials, and Axis makes them available as part of the MessageContext.

Alpha 3 version

Let's assume we already have a web service called MyService with a single method, myMethod. When a SOAP message request comes in over the GSI httpg transport, the Axis RPC dispatcher will look for the same method, but with an additional parameter: the MessageContext. So we can write a new myMethod which takes an additional argument, the MessageContext. This is illustrated in the following example:

package org.globus.example;

import org.apache.axis.MessageContext;
import org.globus.axis.util.Util;

public class MyService {
    // The "normal" method
    public String myMethod(String arg) {
        System.out.println("MyService: http request\n");
        System.out.println("MyService: you sent " + arg);
        return "Hello Web Services World!";
    }

    // Add a MessageContext argument to the normal method
    public String myMethod(MessageContext ctx, String arg) {
        System.out.println("MyService: httpg request\n");
        System.out.println("MyService: you sent " + arg);
        System.out.println("GOT PROXY: " + Util.getCredentials(ctx));
        return "Hello Web Services World!";
    }
}

Beta 1 version

In the Beta 1 version, you don't even need to write a different method. Instead the MessageContext is put on thread local store. It can be retrieved by calling MessageContext.getCurrentContext():

package org.globus.example;

import org.apache.axis.MessageContext;
import org.globus.axis.util.Util;

public class MyService {
    // Beta 1 version
    public String myMethod(String arg) {
        System.out.println("MyService: httpg request\n");
        System.out.println("MyService: you sent " + arg);
        // Retrieve the context from thread local
        MessageContext ctx = MessageContext.getCurrentContext();
        System.out.println("GOT PROXY: " + Util.getCredentials(ctx));
        return "Hello Web Services World!";
    }
}

Part of the code provided by ANL in gsiaxis.jar is a utility package which includes the getCredentials() method. This allows the service to extract the proxy credentials from the MessageContext.

Deploying the service

Before the service can be used it must be made available. This is done by deploying the service, which can be done in a number of ways:

1. Use the Axis AdminClient to deploy the MyService classes.
2. Add an entry for the service to the server-config.wsdd file in the WEB-INF directory of Axis on Tomcat.
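A typical Axis 1.x deployment entry for this service looks like the following sketch (the class and method names are taken from the earlier listing; the original manual's exact entry is not preserved here):

<service name="MyService" provider="java:RPC">
  <parameter name="className" value="org.globus.example.MyService"/>
  <parameter name="allowedMethods" value="myMethod"/>
</service>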

Writing a GSI enabled Web Service client

As in the previous example, this is very similar to writing a normal web services client. There are some additions required to use the new GSI over SSL transport:

- Deploy a httpg transport chain
- Use the Java CoG kit to load a Globus proxy
- Use setProperty() to set GSI specifics in the Axis "Property Bag":
  o globus credentials (the proxy certificate)
  o authorisation type
  o GSI mode (SSL, no delegation, full delegation, limited delegation)
- Continue with the normal Axis SOAP service invocation:
  o Set the target address for the service
  o Provide the name of the method to be invoked
  o Pass on any parameters required
  o Set the type of the returned value
  o Invoke the service

Here's an example which can be used to call the service you wrote in the last section:

package org.globus.example;

import org.apache.axis.client.Call;
import org.apache.axis.client.Service;
import org.apache.axis.encoding.XMLType;
import org.apache.axis.configuration.SimpleProvider;
import org.apache.axis.utils.Options;
import org.apache.axis.AxisFault;
import org.apache.axis.SimpleTargetedChain;
import org.apache.axis.transport.http.HTTPSender;
import org.globus.axis.transport.GSIHTTPSender;
import org.globus.axis.transport.GSIHTTPTransport;
import org.globus.axis.util.Util;
import org.globus.security.auth.SelfAuthorization;
import org.globus.security.GlobusProxy;
import javax.xml.rpc.namespace.QName;
import javax.xml.rpc.ParameterMode;

public class Client {
    public static void main(String[] args) {
        Util.registerTransport();
        try {
            Options options = new Options(args);
            String endpointURL = options.getURL();
            String textToSend;

            // Parse the arguments for text to send
            args = options.getRemainingArgs();
            if ((args == null) || (args.length < 1)) {
                textToSend = "";
            } else {
                textToSend = args[0];
            }

            // Set up transport handler chains and deploy
            SimpleProvider provider = new SimpleProvider();
            SimpleTargetedChain c = null;
            c = new SimpleTargetedChain(new GSIHTTPSender());
            provider.deployTransport("httpg", c);
            c = new SimpleTargetedChain(new HTTPSender());
            provider.deployTransport("http", c);

            // Load globus proxy
            GlobusProxy proxy = GlobusProxy.getDefaultUserProxy();

            // Create a new service call
            Service service = new Service(provider);
            Call call = (Call) service.createCall();

            // Set globus credentials
            call.setProperty(GSIHTTPTransport.GSI_CREDENTIALS, proxy);

            // Set authorization type
            call.setProperty(GSIHTTPTransport.GSI_AUTHORIZATION, new SelfAuthorization(proxy));

            // Set gsi mode
            call.setProperty(GSIHTTPTransport.GSI_MODE, GSIHTTPTransport.GSI_MODE_LIMITED_DELEG);

            // Set the address of the service (from cmd line arguments)
            call.setTargetEndpointAddress(new java.net.URL(endpointURL));

            // Set the name of the method we're invoking
            call.setOperationName(new QName("MyService", "myMethod"));

            // Setup a target parameter
            call.addParameter("arg1", XMLType.XSD_STRING, ParameterMode.PARAM_MODE_IN);

            // Set the return type
            call.setReturnType(XMLType.XSD_STRING);

            // Invoke the method, passing in the value of "arg1"
            String ret = (String) call.invoke(new Object[] { textToSend });

            // Print out the returned value
            System.out.println("MyService returned: " + ret);
        } catch (Exception e) {
            if (e instanceof AxisFault) {
                ((AxisFault) e).dump();
            } else
                e.printStackTrace();
        }
    }
}

You can invoke this client by running:

java org.globus.example.Client -l httpg://127.0.0.1:8443/axis/servlet/AxisServlet "Hello!"

assuming that you are running the client on the same machine (localhost) as the Axis/Tomcat server, and that you've installed Axis in the webapps/axis directory of Tomcat. If you examine logs/catalina.out you should see the messages from the Client received by the service, as well as the proxy credentials.

Descriptions of the GSI extensions to Tomcat and Axis

The framework that follows is built in five steps:

1. Build a server-side SOAP service using Tomcat and Axis
2. Create connection stubs to support client-side use of the SOAP service
3. Build a custom client-side ClassLoader
4. Build the main client application
5. Build a trivial compute task designed to exercise the client ClassLoader

Build the SOAP service

The SOAP service I build in this article is the closest thing to a management layer that this framework will have. The SOAP service provides a way for our grid computing application to pull the classes it needs from the SOAP server. While my example service simply delivers a single specific jar file, this service's actual production version would likely have access to multiple jar files (each containing a different computing task), and it would contain additional logic to control which jar was delivered to whom.

The first step in providing the SOAP service is to set up the SOAP infrastructure. I chose Tomcat as the servlet container/HTTP server because it is an open source project and proves to be extremely reliable and easy to use. I chose Axis as the SOAP services provider because it too is open source, supports an easy-to-use drag-and-drop service installer, and comes with a tool that creates SOAP client-side stubs from WSDL (Web Services Description Language) files (a feature I exploit later).

After downloading and installing Tomcat 4.0.6 and Axis 1.0, I wrote the SOAP service class GridConnection. This service fetches a known jar file, loads the file into a byte array, and returns the byte array to the caller. The following code is the entire file GridConnection.java:

//
// GridConnection.java
//
import java.util.*;
import java.io.*;

public class GridConnection {
    public byte[] getJarBytes() {
        byte[] jarBytes = null;
        try {
            FileInputStream fi = new FileInputStream("/Users/tkarre/MySquare/build/MySquare.jar");
            jarBytes = new byte[fi.available()];
            fi.read(jarBytes);
            fi.close();
        } catch (Exception e) {}
        return jarBytes;
    }
}
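Step 3 of the outline above calls for a custom client-side ClassLoader that turns the byte array returned by getJarBytes() into loadable classes. The article's own implementation is not reproduced in this copy, so the following is only a sketch of the idea; the class name GridClassLoader and its jar-indexing strategy are illustrative, not the article's code:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;

public class GridClassLoader extends ClassLoader {
    // class name -> raw bytecode, indexed once from the downloaded jar
    private final Map<String, byte[]> classBytes = new HashMap<>();

    public GridClassLoader(byte[] jarBytes) throws IOException {
        JarInputStream jin = new JarInputStream(new ByteArrayInputStream(jarBytes));
        JarEntry entry;
        while ((entry = jin.getNextJarEntry()) != null) {
            if (!entry.getName().endsWith(".class")) continue;
            // read the bytes of this entry
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = jin.read(buf)) != -1) out.write(buf, 0, n);
            // convert "org/example/Task.class" to "org.example.Task"
            String name = entry.getName().replace('/', '.');
            name = name.substring(0, name.length() - ".class".length());
            classBytes.put(name, out.toByteArray());
        }
        jin.close();
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] b = classBytes.get(name);
        if (b == null) throw new ClassNotFoundException(name);
        return defineClass(name, b, 0, b.length);
    }
}

A client would construct this loader with the bytes fetched over SOAP and then call loadClass() with the name of the compute task's entry class.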

Result: Thus the application demonstrating basic security in Globus was executed successfully.

EX.NO:6

Develop Grid APIs using C++

AIM: To write a program for developing Grid APIs using C++.

Algorithm:

The Simple API for Grid Applications (SAGA) is a family of related standards specified by the Open Grid Forum to define an application programming interface (API) for common distributed computing functionality. The SAGA specification for distributed computing originally consisted of a single document, GFD.90, which was released in 2009.

The SAGA API does not strive to replace Globus or similar grid computing middleware systems, and does not target middleware developers, but application developers with no background in grid computing. Such developers typically wish to devote their time to their own goals and minimize the time spent coding infrastructure functionality. The API insulates application developers from middleware. The specification of services, and the protocols to interact with them, is out of the scope of SAGA. Rather, the API seeks to hide the detail of any service infrastructures that may or may not be used to implement the functionality that the application developer needs. The API aligns, however, with all middleware standards within the Open Grid Forum.

Implementations

Since the SAGA interface definitions are not bound to any specific programming language, several implementations of the SAGA standards exist in different programming languages. Apart from the implementation language, they differ from each other in their completeness in terms of standard coverage, as well as in their support for distributed middleware.

SAGA C++

SAGA C++ was the first complete implementation of the SAGA Core specification, written in C++. Currently the C++ implementation is not under active development.

Job submission

A typical task in a distributed application is to submit a job to a local or remote distributed resource manager. SAGA provides a high-level API called the job package for this. The following example shows how the SAGA job package API can be used to submit a Message Passing Interface (MPI) job to a remote Globus GRAM resource manager.

C++ Program:

#include <saga/saga.hpp>
#include <iostream>

int main (int argc, char** argv)
{
  namespace sa  = saga::attributes;
  namespace sja = saga::job::attributes;
  try
  {
    saga::job::description jd;
    jd.set_attribute (sja::description_executable, "/home/user/hello-mpi");
    jd.set_attribute (sja::description_output, "/home/user/hello.out");
    jd.set_attribute (sja::description_error, "/home/user/hello.err");
    // Declare this as an MPI-style job
    jd.set_attribute (sja::description_spmd_variation, "mpi");
    // Name of the queue we want to use
    jd.set_attribute (sja::description_queue, "checkpt");
    // Number of processors to request
    jd.set_attribute (sja::description_number_of_processes, "32");
    saga::job::service js ("gram://my.globus.host/jobmanager-pbs");
    saga::job::job j = js.create_job (jd);
    j.run ();
  }
  catch (saga::exception const & e)
  {
    std::cerr << e.what () << std::endl;
  }
  return 0;
}

Result: Thus the program for developing Grid APIs using C++ was written successfully.

CREATION OF A VIRTUAL MACHINE USING VMWARE WORKSTATION

1. Start VMware Workstation. Windows hosts: Double-click the VMware Workstation icon on your desktop or use the Start menu (Start > Programs > VMware > VMware Workstation). Linux hosts: In a terminal window, enter the command

vmware &

Note: On Linux hosts, the Workstation installer adds an entry to the Start menu for VMware Workstation. However, this menu entry is located in different submenus, depending on your Linux distribution. For example:
SUSE Linux 9.1 — Start > System > More Programs > VMware Workstation
Red Hat Enterprise Linux AS/WS Release 3 — Start > System Tools > More System Tools > VMware Workstation

2. If this is the first time you have launched VMware Workstation and you did not enter the serial number when you installed the product (an option available on a Windows host), you are prompted to enter it. The serial number is on the registration card in your package or in the email message confirming your electronic distribution order. Enter your serial number and click OK. The serial number you enter is saved and VMware Workstation does not ask you for it again. For your convenience, VMware Workstation automatically sends the serial number to the VMware Web site when you use certain Web links built into the product (for example, Help > VMware on the Web > Register Now! and Help > VMware on the Web > Request Support). This allows us to direct you to the correct Web page to register and get support for your product. 3. Start the New Virtual Machine Wizard. When you start VMware Workstation, you can open an existing virtual machine or create a new one. Choose File > New > Virtual Machine to begin creating your virtual machine. 4. The New Virtual Machine Wizard presents you with a series of screens that you navigate using the Next and Prev buttons at the bottom of each screen. At each screen, follow the instructions, then click Next to proceed to the next screen.

5. Select the method you want to use for configuring your virtual machine.

If you select Typical, the wizard prompts you to specify or accept defaults for the following choices:
- The guest operating system
- The virtual machine name and the location of the virtual machine's files
- The network connection type
- Whether to allocate all the space for a virtual disk at the time you create it
- Whether to split a virtual disk into 2GB files

If you select Custom, you also can specify how to set up your disk — create a new virtual disk, use an existing virtual disk or use a physical disk — and specify the settings needed for the type of disk you select. There is also an option to create a legacy virtual disk for use in environments with other VMware products. Select Custom if you want to:
- Make a legacy virtual machine that is compatible with Workstation 4.x, GSX Server 3.x, ESX Server 2.x and VMware ACE 1.x
- Store your virtual disk's files in a particular location
- Use an IDE virtual disk for a guest operating system that would otherwise have a SCSI virtual disk created by default
- Use a physical disk rather than a virtual disk (for expert users)
- Set memory options that are different from the defaults
- Assign more than one virtual processor to the virtual machine

6. If you selected Typical as your configuration path, skip to step 7. If you selected Custom as your configuration path, you may create a virtual machine that fully supports all Workstation 5 features or a legacy virtual machine compatible with specific VMware products.

This screen asks whether you want to create a Workstation 5 virtual machine or a legacy virtual machine. See Legacy Virtual Disks for more information. 7. Select a guest operating system.

This screen asks which operating system you plan to install in the virtual machine. Select both an operating system and a version. The New Virtual Machine Wizard uses this information to:
- Select appropriate default values, such as the amount of memory needed
- Name files associated with the virtual machine
- Adjust settings for optimal performance
- Work around special behaviors and bugs within a guest operating system

If the operating system you plan to use is not listed, select Other for both guest operating system and version.

Note: Workstation supports 64-bit guest operating systems only in Workstation versions 5.5 and later, and only on host machines with supported processors. For the list of processors Workstation supports for 64-bit guest operating systems, see Support for 64-Bit Guest Operating Systems.

Caution: Do not attempt to install a 64-bit operating system after selecting a 32-bit guest operating system type here.

The remaining steps assume you plan to install a Windows XP Professional guest operating system. You can find detailed installation notes for this and other guest operating systems in the VMware Guest Operating System Installation Guide, available from the VMware Web site or from the Help menu.

8. Select a name and folder for the virtual machine. The name specified here is used if you add this virtual machine to the VMware Workstation Favorites list. This name is also used as the name of the folder where the files associated with this virtual machine are stored. Each virtual machine should have its own folder. All associated files, such as the configuration file and the disk file, are placed in this folder.

Windows hosts: On Windows 2000, Windows XP and Windows Server 2003, the default folder for this Windows XP Professional virtual machine is C:\Documents and Settings\<username>\My Documents\My Virtual Machines\Windows XP Professional. On Windows NT, the default folder is C:\WINNT\Profiles\<username>\Personal\My Virtual Machines\Windows XP Professional.

Linux hosts: The default location for this Windows XP Professional virtual machine is <homedir>/vmware/winXPPro, where <homedir> is the home directory of the user who is currently logged on.

Virtual machine performance may be slower if your virtual hard disk is on a network drive. For best performance, be sure the virtual machine's folder is on a local drive. However, if other users need to access this virtual machine, you should consider placing the virtual machine files in a location that is accessible to them. For more information, see Sharing Virtual Machines with Other Users.

9. Specify the number of processors for the virtual machine. The setting Two is supported only for host machines with at least two logical processors.

Note: The following are all considered to have two logical processors:
- A single-processor host with hyperthreading enabled
- A single-processor host with a dual-core CPU
- A multiprocessor host with two CPUs, neither of which are dual-core or have hyperthreading enabled

If the host does not have at least two logical processors, assigning two virtual processors is neither supported nor recommended: a warning message will appear. You can disregard this message and assign two virtual processors to the virtual machine, but, once you have finished creating the virtual machine, you will not be able to power it on unless you move it to a host machine with at least two logical processors. For more about Workstation support for virtual Symmetric Multiprocessing (SMP), see Using Two-Way Virtual Symmetric Multiprocessing (Experimental).

10. If you selected Typical as your configuration path, skip to step 11. If you selected Custom as your configuration path, you may adjust the memory settings or accept the defaults, then click Next to continue. In most cases, it is best to keep the default memory setting. If you plan to use the virtual machine to run many applications or applications that need high amounts of memory, you may want to use a higher memory setting. For more information, see Virtual Machine Memory Size.

Note: You cannot allocate more than 2GB of memory to a virtual machine if the virtual machine's files are stored on a file system such as FAT32 that does not support files greater than 2GB.

11. Configure the networking capabilities of the virtual machine. If your host computer is on a network and you have a separate IP address for your virtual machine (or can get one automatically from a DHCP server), select Use bridged networking. If you do not have a separate IP address for your virtual machine but you want to be able to connect to the Internet, select Use network address translation (NAT). NAT allows you to share files between the virtual machine and the host operating system. For more details about VMware Workstation networking options, see Configuring a Virtual Network.

12. If you selected Typical as your configuration path, click Finish and the wizard sets up the files needed for your virtual machine. If you selected Custom as your configuration path, continue with the steps below to configure a disk for your virtual machine.

13. Select the type of SCSI adapter you want to use with the virtual machine. An IDE and a SCSI adapter are installed in the virtual machine. The IDE adapter is always ATAPI. You can choose a BusLogic or an LSI Logic SCSI adapter. The default for your guest operating system is already selected. All guests except for Windows Server 2003, Red Hat Enterprise Linux 3 and NetWare default to the BusLogic adapter. The LSI Logic adapter has improved performance and works better with generic SCSI devices. The LSI Logic adapter is also supported by ESX Server 2.0 and higher. Keep this in mind if you plan to migrate the virtual machine to another VMware product. Your choice of SCSI adapter does not affect your decision to make your virtual disk an IDE or SCSI disk. However, some guest operating systems — such as Windows XP — do not include a driver for the BusLogic or LSI Logic adapter. You must download the driver from the LSI Logic Web site.

Note: Drivers for a Mylex (BusLogic) compatible host bus adapter are not obvious on the LSI Logic Web site. Search the support area for the numeric string in the model number. For example, search for "958" for BT/KT-958 drivers. See the VMware Guest Operating System Installation Guide for details about the driver and the guest operating system you plan to install in this virtual machine.

14. Select the disk you want to use with the virtual machine. Select Create a new virtual disk. Virtual disks are the best choice for most virtual machines. They are quick and easy to set up and can be moved to new locations on the same host computer or to different host computers. By default, virtual disks start as small files on the host computer's hard drive, then expand as needed — up to the size you specify in the next step. The next step also allows you to allocate all the disk space when the virtual disk is created, if you wish.

To use an existing operating system on a physical hard disk (a "raw" disk), read Configuring a Dual-Boot Computer for Use with a Virtual Machine. To install your guest operating system directly on an existing IDE disk partition, read the reference note Installing an Operating System on a Physical Partition from a Virtual Machine.

Note: Physical disk configurations are recommended only for expert users.

Caution: If you are using a Windows Server 2003, Windows XP or Windows 2000 host, see Do Not Use Windows 2000, Windows XP and Windows Server 2003 Dynamic Disks as Physical Disks.

To install the guest operating system on an IDE physical disk, select Existing IDE Disk Partition. To use a SCSI physical disk, add it to the virtual machine later with the virtual machine settings editor. Booting from a SCSI physical disk is not supported. For a discussion of some of the issues involved in using a SCSI physical disk, see Configuring Dual- or Multiple-Boot SCSI Systems to Run with VMware Workstation on a Linux Host.

15. Select whether to create an IDE or SCSI disk. The wizard recommends the best choice based on the guest operating system you selected. All Linux distributions you can select in the wizard use SCSI virtual disks by default, as do Windows NT, Windows 2000, and Windows Vista. All Windows operating systems except Windows NT, Windows 2000, and Windows Vista use IDE virtual disks by default; NetWare, FreeBSD, MS-DOS and other guests default to IDE virtual disks.

16. Specify the capacity of the virtual disk. Enter the size of the virtual disk that you wish to create. You can set a size between 0.1GB and 950GB for a SCSI virtual disk. The option Allocate all disk space now gives somewhat better performance for your virtual machine. If you do not select Allocate all disk space now, the virtual disk's files start small and grow as needed, but they can never grow larger than the size you set here.

Note: Allocate all disk space now is a time-consuming operation that cannot be cancelled, and requires as much physical disk space as you specify for the virtual disk. Select the option Split disk into 2GB files if your virtual disk is stored on a file system that does not support files larger than 2GB.

17. Specify the location of the virtual disk's files.

18. Click Finish. The wizard sets up the files needed for your virtual machine.

Reference: https://www.vmware.com/support/ws55/doc/ws_newguest_setup_simple_steps.html

OUTPUT:

Result: Thus virtual machines of different configurations were created and run successfully.

EX No: 2

CREATION OF VIRTUAL BLOCK TO VIRTUAL MACHINE

Aim: To find the procedure to attach a virtual block to the virtual machine and check whether it holds the data even after the release of the virtual machine.

Algorithm:

Objective: Installation and configuration of JustCloud.

Requirement: JustCloud exe file.

THEORY: Professional cloud storage from JustCloud is simple, fast and secure. JustCloud will automatically back up the documents, photos, music and videos stored on your computer to the cloud, so you are never without files again.

Installation:

1. Download the software from this link: http://www.justcloud.com/download/

2. By following these steps you will download and install the JustCloud software application on this computer. This software will automatically start backing up files from your computer and saving them securely in an online cloud user account. Your free account gives you 15MB storage space or 50 files for 14 days. Once installed, a sync folder will be added to your desktop for you to easily drag and drop files you wish to back up.

Result: Thus a virtual block was attached to the virtual machine successfully.

EX No:3

Install a C compiler in the virtual machine and execute a sample program.

Aim: To develop a C program using a compiler in the virtual machine.

Algorithm:

Let's start with a simple "Hello Cloud9" example.

1. From your dashboard, click 'create new workspace' and then select 'create new workspace'.
2. Enter a catchy workspace name, visibility: open (proud to share your creations), hosting: hosted and choose a 'custom' workspace type. Click 'create'.
3. The workspace is being prepared and when done, select the project in the dashboard and click 'start editing'.
4. The workspace is opened, right click the project tree and select 'new file'. Name it 'helloCloud9.cc'.
5. Open the file by double clicking it in the file tree. Copy / paste the following code in the file:

#include <iostream>
using namespace std;

int main() {
    cout << "Hello Cloud9" << endl;
    return 0;
}

VIRTUAL MACHINE MIGRATION

Configure password-less SSH for the nova user between the compute nodes. The opening steps enable a login shell for the nova user, switch to that account, and generate an SSH key pair shared by all nodes; the procedure then continues:

8. echo 'StrictHostKeyChecking no' >> /var/lib/nova/.ssh/config
9. chmod 600 /var/lib/nova/.ssh/id_rsa /var/lib/nova/.ssh/authorized_keys
10. Repeat steps 2-4 on each node.

Note: The nodes must share the same key pair, so do not generate a new key pair for any subsequent nodes.

11. From the first node, where you created the SSH key, run:
12. ssh-copy-id -i nova@remote-host
This command installs your public key in a remote machine's authorized_keys folder.
13. Ensure that the nova user can now log in to each node without using a password:
14. # su nova
15. $ ssh *computeNodeAddress*
16. $ exit
17. As root on each node, restart both libvirt and the Compute services:
18. # systemctl restart libvirtd.service
19. # systemctl restart openstack-nova-compute.service

1. To list the VMs you want to migrate, run:
$ nova list

2. After selecting a VM from the list, run this command where VM_ID is set to the ID in the list returned in the previous step:
$ nova show VM_ID

3. Use the nova migrate command:
$ nova migrate VM_ID

4. To migrate an instance and watch the status, use this example script:

#!/bin/bash

# Provide usage
usage() {
    echo "Usage: $0 VM_ID"
    exit 1
}

[[ $# -eq 0 ]] && usage

# Migrate the VM to an alternate hypervisor
echo -n "Migrating instance to alternate host"
VM_ID=$1
nova migrate $VM_ID
VM_OUTPUT=`nova show $VM_ID`
VM_STATUS=`echo "$VM_OUTPUT" | grep status | awk '{print $4}'`
while [[ "$VM_STATUS" != "VERIFY_RESIZE" ]]; do
    echo -n "."
    sleep 2
    VM_OUTPUT=`nova show $VM_ID`
    VM_STATUS=`echo "$VM_OUTPUT" | grep status | awk '{print $4}'`
done
nova resize-confirm $VM_ID
echo " instance migrated and resized."
echo;

# Show the details for the VM
echo "Updated instance details:"
nova show $VM_ID

# Pause to allow users to examine VM details
read -p "Pausing, press <Enter> to exit."

Result: Thus the program for virtual machine migration using Linux commands was executed successfully.

EX No: 5

Find procedure to install storage controller and interact with it.

Aim: To develop a storage controller and interact with it.

Algorithm:

Create a container

1. Log in to the dashboard.
2. Select the appropriate project from the drop down menu at the top left.
3. On the Project tab, open the Object Store tab and click Containers category.
4. Click Create Container.
5. In the Create Container dialog box, enter a name for the container, and then click Create Container.

You have successfully created a container.

Upload an object

1. Log in to the dashboard.
2. Select the appropriate project from the drop down menu at the top left.
3. On the Project tab, open the Object Store tab and click Containers category.
4. Select the container in which you want to store your object.
5. Click Upload Object. The Upload Object To Container: <name> dialog box appears, where <name> is the name of the container to which you are uploading the object.
6. Enter a name for the object.
7. Browse to and select the file that you want to upload.
8. Click Upload Object.

You have successfully uploaded an object to the container.

Manage an object

To edit an object

1. Log in to the dashboard.
2. Select the appropriate project from the drop down menu at the top left.
3. On the Project tab, open the Object Store tab and click Containers category.
4. Select the container in which you want to store your object.
5. Click the menu button and choose Edit from the dropdown list. The Edit Object dialog box is displayed.
6. Browse to and select the file that you want to upload.
7. Click Update Object.

To create a metadata-only object without a file

You can create a new object in a container without a file available and can upload the file later when it is ready. This temporary object acts as a place-holder for a new object, and enables the user to share object metadata and URL info in advance.

1. Log in to the dashboard.
2. Select the appropriate project from the drop down menu at the top left.
3. On the Project tab, open the Object Store tab and click Containers category.
4. Select the container in which you want to store your object.
5. Click Upload Object. The Upload Object To Container: <name> dialog box is displayed, where <name> is the name of the container to which you are uploading the object.
6. Enter a name for the object.
7. Click Update Object.

To create a pseudo-folder

Pseudo-folders are similar to folders in your desktop operating system. They are virtual collections defined by a common prefix on the object's name.

1. Log in to the dashboard.
2. Select the appropriate project from the drop down menu at the top left.
3. On the Project tab, open the Object Store tab and click Containers category.
4. Select the container in which you want to store your object.
5. Click Create Pseudo-folder. The Create Pseudo-Folder in Container <name> dialog box is displayed, where <name> is the name of the container to which you are uploading the object.
6. Enter a name for the pseudo-folder. A slash (/) character is used as the delimiter for pseudo-folders in Object Storage.
7. Click Create.
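The same operations can also be scripted from a terminal. Assuming the python-swiftclient package is installed and your OpenStack credentials are sourced into the environment, the swift command line client mirrors the dashboard steps above; the container and file names here are examples:

$ swift post mycontainer            # create a container
$ swift upload mycontainer file.txt # upload an object into it
$ swift list mycontainer            # list the objects it holds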

Result: Thus the program interacting with the storage controller was executed successfully.

EX NO:6

Find procedure to set up the one node Hadoop cluster.

AIM: To develop a procedure to set up the one node Hadoop cluster.

Algorithm:

Hadoop installation

Now download Hadoop from the official Apache site, preferably a stable release version of Hadoop 2.7.x, and extract the contents of the Hadoop package to a location of your choice. We chose the location "/opt/".

Step 1: Download the tar.gz file of the latest version of Hadoop (hadoop-2.7.x) from the official site.

Step 2: Extract (untar) the downloaded file with these commands to /opt/bigdata:

root@solaiv[]# cd /opt
root@solaiv[/opt]# sudo tar xvpzf /home/itadmin/Downloads/hadoop-2.7.0.tar.gz
root@solaiv[/opt]# cd hadoop-2.7.0/

Like java, update the Hadoop environment variable in /etc/profile:

boss@solaiv[]# sudo vi /etc/profile
#--insert HADOOP_PREFIX
HADOOP_PREFIX=/opt/hadoop-2.7.0
#--in PATH variable just append at the end of the line
PATH=$PATH:$HADOOP_PREFIX/bin
#--Append HADOOP_PREFIX at end of the export statement
export PATH JAVA_HOME HADOOP_PREFIX

Save the file by pressing the "Esc" key followed by :wq!

Step 3: Source the /etc/profile

boss@solaiv[]# source /etc/profile

Verify the Hadoop installation:

boss@solaiv[]# cd $HADOOP_PREFIX
boss@solaiv[]# bin/hadoop version

3.1) Modify the Hadoop Configuration Files

In this section, we will configure the directory where Hadoop will store its configuration files, the network ports it listens to, etc. Our setup will use the Hadoop Distributed File System (HDFS), even though we are using only a single local machine. Add the following properties to the various Hadoop configuration files available under $HADOOP_PREFIX/etc/hadoop/: core-site.xml, hdfs-site.xml, mapred-site.xml & yarn-site.xml.

Update Java, hadoop path to the Hadoop environment file

boss@solaiv[]# cd $HADOOP_PREFIX/etc/hadoop
boss@solaiv[]# vi hadoop-env.sh

Paste the following lines at the beginning of the file:

export JAVA_HOME=/usr/local/jdk1.8.0_05
export HADOOP_PREFIX=/opt/hadoop-2.7.0

Modify the core-site.xml

boss@solaiv[]# cd $HADOOP_PREFIX/etc/hadoop
boss@solaiv[]# vi core-site.xml

Paste the following between the <configuration> tags:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:9000</value>
</property>

Modify the hdfs-site.xml

boss@solaiv[]# vi hdfs-site.xml

Paste the following between the <configuration> tags:

<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>

YARN configuration - Single Node

Modify the mapred-site.xml

boss@solaiv[]# cp mapred-site.xml.template mapred-site.xml
boss@solaiv[]# vi mapred-site.xml

Paste the following between the <configuration> tags:

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

Modify yarn-site.xml

boss@solaiv[]# vi yarn-site.xml

Paste the following between the <configuration> tags:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

Formatting the HDFS file-system via the NameNode

The first step to starting up your Hadoop installation is formatting the Hadoop file system, which is implemented on top of the local file system of our "cluster" (which includes only our local machine). We need to do this the first time you set up a Hadoop cluster. Do not format a running Hadoop file system, as you will lose all the data currently in the cluster (in HDFS).

root@solaiv[]# cd $HADOOP_PREFIX
root@solaiv[]# bin/hadoop namenode -format

Start NameNode daemon and DataNode daemon: (port 50070)

root@solaiv[]# sbin/start-dfs.sh

To know the running daemons just type jps or /usr/local/jdk1.8.0_05/bin/jps

Start ResourceManager daemon and NodeManager daemon: (port 8088)

root@solaiv[]# sbin/start-yarn.sh
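On a healthy single-node setup, jps typically lists the five Hadoop daemons; the process ids shown below are only illustrative and will differ on your machine:

root@solaiv[]# jps
2769 NameNode
2923 DataNode
3133 SecondaryNameNode
3287 ResourceManager
3415 NodeManager
3702 Jps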

To stop the running processes:

root@solaiv[]# sbin/stop-dfs.sh
root@solaiv[]# sbin/stop-yarn.sh

Result: Thus the procedure for setting up the one node Hadoop cluster was executed successfully.

EX.NO:7

Mount the one node Hadoop cluster using FUSE.

AIM: To mount the one node Hadoop cluster using FUSE.

Algorithm:

1. FUSE (Filesystem in Userspace) enables you to write a normal user application as a bridge for a traditional filesystem interface.

2. The hadoop-hdfs-fuse package enables you to use your HDFS cluster as if it were a traditional filesystem on Linux. It is assumed that you have a working HDFS cluster and know the hostname and port that your NameNode exposes.

To install fuse-dfs on Ubuntu systems:

sudo apt-get install hadoop-hdfs-fuse

To set up and test your mount point:

mkdir -p <mount_point>
hadoop-fuse-dfs dfs://<namenode_hostname>:<namenode_port> <mount_point>

You can now run operations as if they are on your mount point. Press Ctrl+C to end the fuse-dfs program, and umount the partition if it is still mounted.

Note:
- fuse-dfs uses the HADOOP_CONF_DIR configured at the time the mount command is invoked.
- fuse-dfs may exit immediately because ld.so can't find libjvm.so. To work around this issue, add /usr/java/latest/jre/lib/amd64/server to the LD_LIBRARY_PATH.

To clean up your test:

$ umount <mount_point>

You can now add a permanent HDFS mount which persists through reboots. To add a system mount:

1. Open /etc/fstab and add lines to the bottom similar to these:

hadoop-fuse-dfs#dfs://<namenode_hostname>:<namenode_port> <mount_point> fuse allow_other,usetrash,rw 2 0

For example:

hadoop-fuse-dfs#dfs://localhost:8020 /mnt/hdfs fuse allow_other,usetrash,rw 2 0

2. Test to make sure everything is working properly:

$ mount <mount_point>

Your system is now configured to allow you to use the ls command and use that mount point as if it were a normal system disk.

Result: Thus the above application is executed successfully.

EX No 8.

Write a program to use the API's of Hadoop to interact with it.

AIM: To develop a program that uses the APIs of Hadoop.

Algorithm:

Modify the Hadoop Configuration Files

In this section, we will configure the directory where Hadoop will store its configuration files, the network ports it listens to, etc. Our setup will use the Hadoop Distributed File System (HDFS), even though we are using only a single local machine. Add the following properties to the various Hadoop configuration files available under $HADOOP_PREFIX/etc/hadoop/: core-site.xml, hdfs-site.xml, mapred-site.xml & yarn-site.xml.

Update Java, hadoop path to the Hadoop environment file

boss@solaiv[]# cd $HADOOP_PREFIX/etc/hadoop
boss@solaiv[]# vi hadoop-env.sh

Paste the following lines at the beginning of the file:

export JAVA_HOME=/usr/local/jdk1.8.0_05
export HADOOP_PREFIX=/opt/hadoop-2.7.0

Modify the core-site.xml

boss@solaiv[]# cd $HADOOP_PREFIX/etc/hadoop
boss@solaiv[]# vi core-site.xml

Paste the following between the <configuration> tags:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:9000</value>
</property>

Modify the hdfs-site.xml

boss@solaiv[]# vi hdfs-site.xml

Paste the following between the <configuration> tags:

<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>

YARN configuration - Single Node

Modify the mapred-site.xml

boss@solaiv[]# cp mapred-site.xml.template mapred-site.xml
boss@solaiv[]# vi mapred-site.xml

Paste the following between the <configuration> tags:

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

Modify yarn-site.xml

boss@solaiv[]# vi yarn-site.xml

Paste the following between the <configuration> tags:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

Formatting the HDFS file-system via the NameNode

The first step to starting up your Hadoop installation is formatting the Hadoop file system, which is implemented on top of the local file system of our "cluster" (which includes only our local machine). We need to do this the first time you set up a Hadoop cluster. Do not format a running Hadoop file system, as you will lose all the data currently in the cluster (in HDFS).
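With the cluster configured and running, a program can interact with HDFS through the FileSystem API. The following is a minimal sketch, not the manual's own listing; the class name HdfsApiDemo and the path /user/demo/hello.txt are illustrative, and fs.defaultFS matches the value configured above:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsApiDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://localhost:9000");
        FileSystem fs = FileSystem.get(conf);

        // Write a small file into HDFS
        Path file = new Path("/user/demo/hello.txt");
        FSDataOutputStream out = fs.create(file, true);
        out.writeBytes("Hello HDFS\n");
        out.close();

        // Read it back and print it
        BufferedReader in = new BufferedReader(new InputStreamReader(fs.open(file)));
        String line;
        while ((line = in.readLine()) != null) {
            System.out.println(line);
        }
        in.close();
        fs.close();
    }
}

Compiled against the Hadoop jars and run through the hadoop launcher (so the Hadoop classpath is set up), it writes a file into HDFS and prints it back, confirming the API connection.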

Result: Thus the above application is executed successfully.

Ex No: 9. Write a word count program to demonstrate the use of Map and Reduce tasks

Aim: To develop a word count program to demonstrate the use of Map and Reduce tasks.

Algorithm:

Required software for Linux includes:

1. Java™ must be installed. Recommended Java versions are described at HadoopJavaVersions.
2. ssh must be installed and sshd must be running to use the Hadoop scripts that manage remote Hadoop daemons.
3. If your cluster doesn't have the requisite software you will need to install it. For example, on Ubuntu Linux:
$ sudo apt-get install ssh
$ sudo apt-get install rsync
4. Unpack the downloaded Hadoop distribution. In the distribution, edit the file etc/hadoop/hadoop-env.sh to define some parameters as follows:
# set to the root of your Java installation
export JAVA_HOME=/usr/java/latest
5. Try the following command:
$ bin/hadoop
6. The following example copies the unpacked conf directory to use as input and then finds and displays every match of the given regular expression. Output is written to the given output directory.
$ mkdir input
$ cp etc/hadoop/*.xml input
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar grep input output 'dfs[a-z.]+'
$ cat output/*

Program:

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
       extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Input/output:

$ bin/hadoop fs -ls /user/joe/wordcount/input/
/user/joe/wordcount/input/file01
/user/joe/wordcount/input/file02
$ bin/hadoop fs -cat /user/joe/wordcount/input/file01
Hello World Bye World
$ bin/hadoop fs -cat /user/joe/wordcount/input/file02
Hello Hadoop Goodbye Hadoop

Run the application:

$ bin/hadoop jar wc.jar WordCount /user/joe/wordcount/input /user/joe/wordcount/output
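Examining the output directory then shows the expected counts for the two sample files:

$ bin/hadoop fs -cat /user/joe/wordcount/output/part-r-00000
Bye 1
Goodbye 1
Hadoop 2
Hello 2
World 2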

Result: Thus the above application is executed successfully.

EX. NO. : 1

INSTALLATION OF VIRTUAL MACHINE

Date:

Aim: To find procedure to run the virtual machine of different configuration. Check how many virtual machines can be utilized at particular time.

KVM: In computing, virtualization refers to the act of creating a virtual (rather than actual) version of something, including virtual computer hardware platforms, operating systems, storage devices, and computer network resources. Kernel-based Virtual Machine (KVM) is a virtualization infrastructure for the Linux kernel that turns it into a hypervisor.

Steps for KVM Installation:

1. To run KVM, you need a processor that supports hardware virtualization. So check that your CPU supports hardware virtualization:
egrep -c '(vmx|svm)' /proc/cpuinfo
If 0 - CPU doesn't support hardware virtualization
1 - CPU supports hardware virtualization

2. To see if your processor is 64-bit:
egrep -c ' lm ' /proc/cpuinfo
If 0 is printed, it means that your CPU is not 64-bit.
1 - it is 64-bit.

3. $ ls /lib/modules/3.16.0-3-generic/kernel/arch/x86/kvm/
kvm-amd.ko kvm-intel.ko kvm.ko

4. $ ls /dev/kvm
/dev/kvm

5. Install the necessary packages using the following command:
$ sudo apt-get install qemu-kvm libvirt-bin bridge-utils virt-manager qemu-system

6. Creating VMs:
virt-install --connect qemu:///system -n hardy -r 512 -f hardy1.qcow2 -s 12 -c ubuntu-14.04.2-server-amd64.iso --vnc --noautoconsole --os-type linux --os-variant ubuntuHardy
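To check how many virtual machines are defined and how many are running at a particular time (the second part of the aim), virsh from the libvirt-bin package installed above can be used:

$ virsh list --all    # every defined VM and its state
$ virsh list          # only the VMs currently running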

Output: 1. New virtual machine is created using KVM:


Conclusion: Thus the virtual machine of different configuration is created successfully.

EX. NO. : 2

INSTALLATION OF C COMPILER

Date:

Aim: To find the procedure to install a C Compiler in the Virtual Machine and execute a C program.

Steps:

1. To install the C compiler in the guest OS, install the gcc package:
$ sudo apt-get install gcc

2. Write a sample program using the gedit/vim editor.

3. Compile the C program using the compiler installed:
gcc sample_c_program.c -o output

4. Run the object file and get the output:
./output
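A minimal program for step 2 could be the following (saved as sample_c_program.c to match the compile command above):

#include <stdio.h>

int main(void)
{
    /* print a greeting to confirm the toolchain works inside the VM */
    printf("Hello from the virtual machine!\n");
    return 0;
}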

Conclusion: Thus the C Compiler is installed successfully and executed a sample C program.


EX. NO. : 3

INSTALLATION OF STORAGE CONTROLLER

Date:

Aim: To find procedure to install storage controller and interact with it.

KVM: In computing, virtualization refers to the act of creating a virtual (rather than actual) version of something, including virtual computer hardware platforms, operating systems, storage devices, and computer network resources. Kernel-based Virtual Machine (KVM) is a virtualization infrastructure for the Linux kernel that turns it into a hypervisor.

Steps for KVM Installation:

1. To run KVM, you need a processor that supports hardware virtualization. So check that your CPU supports hardware virtualization:
egrep -c '(vmx|svm)' /proc/cpuinfo
If 0 - CPU doesn't support hardware virtualization
1 - CPU supports hardware virtualization

2. To see if your processor is 64-bit:
egrep -c ' lm ' /proc/cpuinfo
If 0 is printed, it means that your CPU is not 64-bit.
1 - it is 64-bit.

3. $ ls /lib/modules/3.16.0-3-generic/kernel/arch/x86/kvm/
kvm-amd.ko kvm-intel.ko kvm.ko

4. $ ls /dev/kvm
/dev/kvm

5. Install the necessary packages using the following command:
$ sudo apt-get install qemu-kvm libvirt-bin bridge-utils virt-manager qemu-system

6. Creating VMs:
virt-install --connect qemu:///system -n hardy -r 512 -f hardy1.qcow2 -s 12 -c ubuntu-14.04.2-server-amd64.iso --vnc --noautoconsole --os-type linux --os-variant ubuntuHardy

Output: 2. New virtual machine is created using KVM:


Conclusion: Thus the storage controller is installed successfully in the virtual machine.

EX. NO. : 4

VIRTUAL MACHINE MIGRATION

Date: Aim: To show the virtual machine migration based on the certain condition from one node to the other. Steps to Migrate the Virtual Machine: 1. Open virt-manager

2. Connect to the target host physical machine by clicking on the File menu, then click Add Connection.


3. Add connection The Add Connection window appears.

Enter the following details:
- Hypervisor: Select QEMU/KVM.
- Method: Select the connection method.
- Username: Enter the username for the remote host physical machine.
- Hostname: Enter the hostname/IP address for the remote host physical machine.

Click the Connect button. An SSH connection is used in this example, so the specified user's password must be entered in the next step.


4. Migrate guest virtual machines Open the list of guests inside the source host physical machine (click the small triangle on the left of the host name) and right click on the guest that is to be migrated (guest1-rhel6-64 in this example) and click Migrate.

In the New Host field, use the drop-down list to select the host physical machine you wish to migrate the guest virtual machine to and click Migrate.


A progress window will appear.

virt-manager now displays the newly migrated guest virtual machine running in the destination host. The guest virtual machine that was running in the source host physical machine is now listed in the Shutoff state.
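The same migration can also be triggered from the command line with virsh; the guest name and destination URI below are placeholders for your own setup:

$ virsh migrate --live guest1-rhel6-64 qemu+ssh://destination.example.com/system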

Conclusion: Thus the virtual machine is migrated from one node to another node successfully.


EX. NO.: 5

VIRTUAL BLOCK ATTACHMENT

Date: Aim: To find the procedure to attach virtual block to the virtual machine and check whether it holds the data even after the release of the virtual machine. Steps: 1. Make sure that you have shut down your virtual machine. 2. Select your VM and then click Edit settings. 3. Select the Hardware tab and then click Add. 4. Select Hard Disk from the list of device types and then click Next. 5. Choose Create a new virtual disk. 6. Specify the disk size. 7. Choose Thick Provision Lazy Zeroed. 8. Choose Specify a datastore or datastore cluster: and then click Browse 9. Select your datastore from the provided list and then click OK. 10. Click Next to accept the default advanced options. (By default, the new disk will be included in full VM snapshots. To keep them consistent, we recommend that you leave the Independent option unselected.) 11. Click Finish to proceed with adding the disk. 12. Click OK once the new hard disk has been added. This may take some time, depending on how much storage you're adding.

Conclusion: Thus the new virtual block is successfully added to the existing virtual machine.

EX. NO.: 6

HADOOP SETUP AND INSTALLATION

Date:

Aim: To find the procedure to set up the one node Hadoop cluster.

HADOOP: Apache Hadoop is an open-source software framework for storage and large-scale processing of data-sets on clusters of commodity hardware. Hadoop is an Apache top-level project being built and used by a global community of contributors and users. It is licensed under the Apache License 2.0. The Apache Hadoop framework is composed of the following modules:

- Hadoop Common – contains libraries and utilities needed by other Hadoop modules
- Hadoop Distributed File System (HDFS) – a distributed file-system that stores data on commodity machines, providing very high aggregate bandwidth across the cluster.
- Hadoop YARN – a resource-management platform responsible for managing compute resources in clusters and using them for scheduling of users' applications.
- Hadoop MapReduce – a programming model for large scale data processing.

Installation Steps:

1. Install Java. Check the Java version in the system:
java –version

2. Open the /etc/profile file and add the following lines as per the version to set an environment for Java:
$ sudo vi /etc/profile
#--insert JAVA_HOME
JAVA_HOME=/opt/jdk1.8.0_05
#--in PATH variable just append at the end of the line
PATH=$PATH:$JAVA_HOME/bin
#--Append JAVA_HOME at end of the export statement
export PATH JAVA_HOME
$ source /etc/profile

3. Install SSH using the command:
$ sudo apt-get install openssh-server openssh-client

4. Generate an SSH key for the user. Then enable password-less SSH access:
$ ssh localhost
$ ssh-keygen
$ exit

5. Hadoop installation:
- Download the tar.gz file of the latest version of Hadoop (hadoop-2.7.x) from the official site.
- Extract (untar) the downloaded file with the commands:
$ sudo tar zxvf hadoop-2.7.0.tar.gz
$ cd hadoop-2.7.0/

6. Update the Hadoop environment variable in /etc/profile:
$ sudo vi /etc/profile
#--insert HADOOP_PREFIX
HADOOP_PREFIX=/opt/hadoop-2.7.0
#--in PATH variable just append at the end of the line
PATH=$PATH:$HADOOP_PREFIX/bin
#--Append HADOOP_PREFIX at end of the export statement
export PATH JAVA_HOME HADOOP_PREFIX

Source the /etc/profile:
$ source /etc/profile

Verify the Hadoop installation:
$ cd $HADOOP_PREFIX
$ bin/hadoop version

7. Update the Java and Hadoop paths in the Hadoop configuration files under $HADOOP_PREFIX/etc/hadoop:

$ vi core-site.xml
Paste the following between the <configuration> tags in core-site.xml:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:9000</value>
</property>

$ vi hdfs-site.xml
Paste the following between the <configuration> tags in hdfs-site.xml:
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>

$ cp mapred-site.xml.template mapred-site.xml
$ vi mapred-site.xml
Paste the following between the <configuration> tags in mapred-site.xml:
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

$ vi yarn-site.xml
Paste the following between the <configuration> tags in yarn-site.xml:
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

8. Format the HDFS file-system via the NameNode:
$ bin/hadoop namenode –format

9. Start NameNode daemon and DataNode daemon: (port 50070)
$ sbin/start-dfs.sh

10. Start ResourceManager daemon and NodeManager daemon: (port 8088)
$ sbin/start-yarn.sh

11. To stop the running processes:
$ sbin/stop-dfs.sh
$ sbin/stop-yarn.sh


Output: Hadoop installation:

Create the HDFS directories:

Conclusion: Thus the one node Hadoop cluster is installed successfully.


EX. NO.: 7

HADOOP CLUSTER USING FUSE

Date:

Aim: To mount the one node Hadoop cluster using FUSE.

Steps:

Download the cdh3 repository from the internet:
$ wget http://archive.cloudera.com/one-click-install/maverick/cdh3-repository_1.0_all.deb

Add the cdh3 repository to the default system repository:
$ sudo dpkg -i cdh3-repository_1.0_all.deb

Update the package information using the following command:
$ sudo apt-get update

Install the hadoop-fuse package:
$ sudo apt-get install hadoop-0.20-fuse

Once fuse-dfs is installed, go ahead and mount HDFS using FUSE as follows:
$ sudo hadoop-fuse-dfs dfs://<namenode_hostname>:<namenode_port> <mount_point>
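For example, assuming the single-node cluster from the previous exercise (NameNode at localhost:9000) and /mnt/hdfs as the mount point:

$ sudo mkdir -p /mnt/hdfs
$ sudo hadoop-fuse-dfs dfs://localhost:9000 /mnt/hdfs
$ ls /mnt/hdfs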

Conclusion: Thus the one node Hadoop cluster is mounted using FUSE successfully.

EX. NO.: 8

MAP AND REDUCE – WORD COUNT

Date:

Aim: To write a word count program to demonstrate the use of Map and Reduce tasks.

Mapreduce: MapReduce is a programming model and an associated implementation for processing and generating large data sets with a parallel, distributed algorithm on a cluster. A MapReduce program is composed of a Map() procedure that performs filtering and sorting and a Reduce() method that performs a summary operation.

- "Map" step: Each worker node applies the "map()" function to the local data, and writes the output to a temporary storage. A master node ensures that only one copy of redundant input data is processed.
- "Shuffle" step: Worker nodes redistribute data based on the output keys (produced by the "map()" function), such that all data belonging to one key is located on the same worker node.
- "Reduce" step: Worker nodes now process each group of output data, per key, in parallel.

Steps:

Source Code:

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
       extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

1. Set Environmental Variables:
export JAVA_HOME=/usr/java/default
export PATH=${JAVA_HOME}/bin:${PATH}
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar

2. Compile the source file to a jar file:
$ bin/hadoop com.sun.tools.javac.Main WordCount.java
$ jar cf wc.jar WordCount*.class

3. Run the Application:
$ bin/hadoop jar wc.jar WordCount /user/joe/wordcount/input /user/joe/wordcount/output

Output:
$ bin/hadoop fs -cat /user/joe/wordcount/output/part-r-00000
Bye 1
Goodbye 1
Hadoop 2
Hello 2
World 2

Conclusion: Thus the word count program to demonstrate the Map and Reduce task is done successfully.


EX. NO.: 9

API’S OF HADOOP

Date:

Aim: To write a program to use the APIs of Hadoop to interact with it.

Steps:

1. Start NameNode daemon and DataNode daemon: (port 50070)
$ sbin/start-dfs.sh

2. Start ResourceManager daemon and NodeManager daemon: (port 8088)
$ sbin/start-yarn.sh

3. Make the HDFS directories required to execute MapReduce jobs:
$ bin/hdfs dfs -mkdir /user

4. Copy the input files into the distributed filesystem:
$ bin/hdfs dfs -put <local_input_dir>/* /input

5. Run some of the examples provided:
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar wordcount /input /output1

6. Examine the output files. View the output files on the distributed filesystem:
$ bin/hdfs dfs -cat /output1/*


Conclusion: Thus the program to use the API of Hadoop is implemented successfully.
