An Investigation into the Use of OpenStack for Submarine Mission Systems: Installation and Configuration Guide (Volume 2)
August 2015
M. Ali Babar, David Silver, Ben Ramsey
CREST - Centre for Research on Engineering Software Technologies, University of Adelaide, Australia.

Recommended Citation Details: M. Ali Babar, D. Silver, B. Ramsey, An Investigation into the Use of OpenStack for Submarine Mission Systems: Installation and Configuration Guide, Technical Report, CREST, University of Adelaide, Adelaide, Australia, 2015.
Table of Contents

Table of Figures
1 Introduction
2 OpenStack Private Cloud Software
3 Building Private Cloud with OpenStack for Submarine Mission Systems
   3.1 Provided Hardware
   3.2 Deployed Software
   3.3 OpenStack Services
   3.4 Deployment Layout
   3.5 BladeCenter Hardware Preparation
      3.5.1 Java Plugin Setup
      3.5.2 IMM and Firmware Upgrades
      3.5.3 Parts Explosion
      3.5.4 Detected Hardware
      3.5.5 Network Adapters
   3.6 Deploying OpenStack on Virtual Hardware
   3.7 Switch Module for IBM BladeCenter
      3.7.1 Ports
      3.7.2 Firmware Upgrade
      3.7.3 Industry Standard Command Line Interface (ISCLI)
      3.7.4 VLANs
      3.7.5 Adding the Second Switch Module
      3.7.6 AMM Configuration
4 Ubuntu 14.04 Desktop or Server Installation and Configuration
   4.1 Installation
   4.2 Post-Installation
   4.3 Networking
   4.4 Package Upgrades
   4.5 OpenSSH
   4.6 General/Base Packages
      4.6.1 Additional Base Operating System Packages and NTP Configuration
5 Clonezilla
   5.1 Installation
   5.2 Base Operating System
   5.3 Package Installation
   5.4 DRBL Server Interactive Configuration
   5.5 DRBL Client Push Interactive Configuration
   5.6 Collecting Client MAC Addresses
   5.7 Configuring New Clients
   5.8 Fix Subnet Masks
      5.8.1 Fix Clonezilla Disk Detection
   5.9 Verifying the Configuration
   5.10 Client UEFI Settings
   5.11 DRBL Commands
      5.11.1 Restarting Services
      5.11.2 Controlling DRBL Interactively
      5.11.3 DRBL Command Line
      5.11.4 Saving An Image
   5.12 Baremetal Provisioning with Clonezilla
      5.12.1 Saved Configuration
   5.13 Standard Image Requirements
      5.13.1 Standardised User Accounts
      5.13.2 Hypervisor Hosts
6 VMware
   6.1 Downloading
   6.2 ESXi Installation
   6.3 Boot Loader Configuration
   6.4 DHCP Configuration
   6.5 ESXi SSH Access
   6.6 ESXi Host Authorised SSH Key Installation
   6.7 Management Tools
      6.7.1 vSphere management clients
      6.7.2 Windows vSphere Client Installation
      6.7.3 Documentation
   6.8 Virtual Networking
   6.9 Guest Installation
   6.10 vCenter Server Installation
   6.11 vSphere Web Client
   6.12 Creating a Datacenter
   6.13 Clusters
      6.13.1 Creating a Cluster
      6.13.2 Adding a Host to the Cluster
   6.14 Port Group Naming and Switch Security Features
   6.15 For every ESXi host:
   6.16 For the master ESXi host, where the compute node and vCenter Server run:
   6.17 NFS Datastore
      6.17.1 Creating a Shared NFS Datastore
      6.17.2 NFS Server Configuration
      6.17.3 NFS Datastore Configuration
      6.17.4 To mount this datastore on other ESXi hosts, in the vSphere Web Client:
      6.17.5 Create a Datastore Cluster to Contain the NFS Datastore
      6.17.6 Deleting the Local Datastore
   6.18 Powering Cycling the vCenter Server
   6.19 Power Cycling the ESXi Master or the VMware Compute Node Virtual Machine
      6.19.1 To power the master ESXi host back on:
   6.20 Testing OpenStack Operation on the vSphere Cluster
   6.21 vSphere CLI
      6.21.1 vSphere CLI Installation on Ubuntu 14.04
   6.22 Using the vSphere CLI
7 PCI Passthrough
   7.1 HS22 Blade Configuration
   7.2 ESXi Host Configuration
   7.3 ESXi Guest Configuration of PCI Passthrough
8 DevStack Installation
   8.1 Multi-Node DevStack on IBM HS22 (Type 7870) Blades
      8.1.1 Overview
      8.1.2 SSH Keys
      8.1.3 Saved Configuration
      8.1.4 "stack" User Setup
      8.1.5 DevStack Checkout As User "stack"
      8.1.6 Configuration and Running stack.sh
      8.1.7 Controller Saved Configuration
      8.1.8 /etc/network/interfaces
      8.1.9 /opt/stack/devstack/local.conf
      8.1.10 /opt/stack/devstack/local.sh
      8.1.11 /opt/stack/devstack/access.sh
      8.1.12 /opt/stack/devstack/images.sh
      8.1.13 /opt/stack/devstack/remove-node.sh
      8.1.14 /etc/rc.local
      8.1.15 Saved Configuration for Compute Nodes
      8.1.16 /opt/stack/devstack/local.conf
      8.1.17 /etc/rc.local
      8.1.18 /opt/stack/devstack/restack.sh
      8.1.19 /opt/stack/devstack/re-enable.sh
      8.1.20 /etc/dhcp/dhclient-exit-hooks.d/hostname
      8.1.21 /etc/network/interfaces
   8.2 DevStack All-In-One Single VM
      8.2.1 Overview
      8.2.2 VM Preparation
      8.2.3 "stack" User Setup
      8.2.4 DevStack Checkout As User "stack"
      8.2.5 local.conf localrc Customisation
      8.2.6 Installation
      8.2.7 Horizon Testing
      8.2.8 Command Line Testing
   8.3 DevStack Multi-Node VMs
      8.3.1 Overview
      8.3.2 References
      8.3.3 VM Preparation
      8.3.4 "stack" User Setup
      8.3.5 DevStack Checkout As User "stack"
      8.3.6 File Format
      8.3.7 Networks
      8.3.8 Controller local.conf
      8.3.9 Compute Node local.conf
      8.3.10 Controller Network Configuration
      8.3.11 local.sh File
      8.3.12 Instance Internet Forwarding
      8.3.13 Running stack.sh
      8.3.14 Access Rules
      8.3.15 SSH Key
      8.3.16 Access Rules and SSH Key Script
      8.3.17 Additional Images
      8.3.18 Test Instance
      8.3.19 Launching
      8.3.20 Testing
      8.3.21 External Access
      8.3.22 Neutron Network Configuration
      8.3.23 Access Rules
      8.3.24 IP Forwarding
      8.3.25 Instance MTUs
      8.3.26 Instance Operating System
      8.3.27 Resolving Internet Hosts
      8.3.28 Disable Transmission Checksumming
9 OpenStack Development GUI
   9.1 Installation
   9.2 Configuration
   9.3 OpenDDS Modeling
      9.3.1 OpenDDS Modeling Installation
      9.3.2 OpenDDS Modeling Project Creation
   9.4 Component Hosts
   9.5 Firewall Requirements
   9.6 Component Host Image Preparation
      9.6.1 Software Installation
      9.6.2 Testing
      9.6.3 Snapshot Creation
      9.6.4 Baremetal Host Configuration
10 PCI Passthrough
   10.1 Installation Procedures
   10.2 Host Preparation
   10.3 KVM Host Configuration
   10.4 OpenStack Nova Compute Configuration
   10.5 OpenStack Flavor Creation
   10.6 Instance Kernel
   10.7 Testing
   10.8 GPGPU Benchmarks
      10.8.1 Ubuntu 14.04 Installation
      10.8.2 Initial Testing
   10.9 Roy Longbottom's CUDA Mflops Benchmark
   10.10 NVIDIA bandwidthTest Sample
11 Hypervisor Suitability Comparison
   11.1 Phoronix Test Suite Installation
   11.2 Cyclictest installation
      11.2.1 Iperf Installation
      11.2.2 Pre-Benchmark Tuning
Appendix A: Troubleshooting
   A.1: Hardware
      A.1.1: HS22 BladeCenter
   A.2: DevStack
      A.2.1: Stack.sh
      A.2.2: Networking
   A.3: GPGPU
      A.3.1: Benchmarking
Appendix B: Glossary
Appendix C: References

Table of Figures

Figure 1: Conceptual Architecture of how the different services of OpenStack interact.
Figure 2: OpenStack deployment layout with networking setup.
1 Introduction
This is an installation guide for the OpenStack Private Cloud for Submarine Systems. For the context of this document and the project details, the final version of the project's main report, “Building a Private Cloud with OpenStack: Technological Capabilities and Limitations”, should be read.
2 OpenStack Private Cloud Software
The OpenStack Cloud software package is a collection of projects [1] which, when combined, create an Open Source IaaS Private Cloud system. An IaaS cloud allows users to request computing resources as required. For example, a user is able to request a certain number of CPU cores, an amount of RAM, and storage space, with a particular operating system image installed on top of those resources. Under a strict definition of IaaS, a user can request either a physical host or a virtual machine from the cloud.

In lieu of version numbers, OpenStack releases are assigned code names whose first letters ascend alphabetically, such as Austin, Bexar, Cactus, and Diablo. This project utilised the Juno series of releases. The current stable release at the time of writing is Kilo (April 30, 2015).

With regard to cloud deployment, the OpenStack cloud platform described here is a private cloud, that is, a cloud that runs on the user's own hardware, i.e., within the user's own data center. Other deployment options are a public cloud or a hybrid/federated cloud. A public cloud deployment means running the cloud platform on a service provider's hardware. A hybrid, or federated, cloud deployment is a combination of public and private cloud deployments; the use case for such a cloud is to run confidential workloads on your own hardware, and workloads where scalability is the major factor on the public cloud components.

While some of the projects under the OpenStack header are optional and are used only if the use case requires them, there are some projects which must be included to build a functional cloud platform. These compulsory projects are Keystone (Identity Management), Nova (Computation Engine), Glance (Image Management), and one of nova-network or Neutron (Network Management). The choice between nova-network and Neutron is driven by requirements: some projects require the newer Neutron network management project in order to be functional, while other, legacy projects may require nova-network, which has a simpler usage model and can run on more limited hardware than Neutron. Other optional projects include Ironic (Bare-metal Management), Heat (Orchestration Engine), Horizon (Web GUI), Magnum (Containers as a Service), Ceilometer (Cloud Telemetry), Trove (Database as a Service), Sahara (Big Data), Swift (Object Storage), and Cinder (Block Storage). It is common practice to use at least one of Swift and Cinder to attach storage capabilities to the computation resources.

The Keystone identity service provides the authentication model of the cloud and also manages where the endpoints of each of the services are located, so that they can easily communicate with each other as required. The key concepts of the identification model are listed in Table 1.
Table 1: Key Concepts for the Identification Model

   Concept   Definition
   User      A user of the cloud system.
   Tenant    A project on the cloud; allows separation of allocated resources.
   Role      A mapping of a set of users to the operations that they can perform.
   Domain    Defines administration boundaries for identity management, i.e., allows the creation of users and projects within a particular domain.
   Group     A group of users that can have a role assigned to it to allow access management to be simplified.
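These concepts map directly onto commands in the Juno-era keystone client. The following is only an illustrative sketch (the tenant, user, and role names are hypothetical, and a password would not normally be passed on the command line):

   keystone tenant-create --name mission-apps --description "Mission application project"
   keystone user-create --name operator1 --tenant mission-apps --pass s3cret
   keystone role-create --name Member
   keystone user-role-add --user operator1 --role Member --tenant mission-apps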
The Nova Compute service allows users to launch virtual or physical instances according to their requirements, as defined by an image and a flavor. The image is the operating system image that the user requires. The flavor is the machine specification, covering factors such as the number of CPU cores required, the memory allocated, and the storage requirements. Each node allocated as a compute node runs the nova-compute service, which acts as OpenStack's interface to a hypervisor running on that node. The nova-compute service abstracts away the differences between different types of hypervisor (e.g., QEMU/KVM or VMware ESXi). On each node, the service is configured with a driver that communicates with the specific type of hypervisor running on that node. There are several options for the hypervisor driver, but only one hypervisor driver can be configured on a single node. Some of the more common options include Xen [2], QEMU/KVM [3], VMware ESXi [4], Linux Containers (LXC) [5], LXD [6], and Docker. Another option is the Ironic baremetal hypervisor driver: the nova-compute service communicates with the Ironic service through its API, which in turn uses the Ironic Conductor service to bring up a baremetal-provisioned machine with the required specifications.
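As an illustrative sketch (not taken verbatim from this deployment), the driver is selected per node in /etc/nova/nova.conf; in a Juno-era configuration a KVM node and a VMware node would differ roughly as follows, where the vCenter values are placeholders:

   # KVM/QEMU compute node (libvirt driver)
   [DEFAULT]
   compute_driver = libvirt.LibvirtDriver
   [libvirt]
   virt_type = kvm

   # VMware compute node (one nova-compute service per vCenter cluster)
   [DEFAULT]
   compute_driver = vmwareapi.VMwareVCDriver
   [vmware]
   host_ip = <vCenter Server address>
   host_username = <vCenter user>
   host_password = <vCenter password>
   cluster_name = <vCenter cluster managed by this node>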
Figure 1: Conceptual Architecture of how the different services of OpenStack interact.

Both nova-network and Neutron provide basic network capabilities to Virtual Machines (VMs) for connecting to each other and to the outside world. Neutron also provides some advanced networking features such as constructing multi-tiered networks and defining virtual networks, routers, and subnets. Neutron has a series of plugins that can further enhance its networking capabilities; for example, LBaaS (Load Balancer as a Service) is a plugin that can support automated scaling of resources within a cloud, with a load balancer (such as HAProxy [4]) used to decide which resource in the pool services each request.
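For instance, with the Juno-era LBaaS v1 plugin enabled, a simple HTTP pool could be built from the command line roughly as follows (a sketch only; the member addresses and the <subnet-id> are placeholders):

   neutron lb-pool-create --name web-pool --lb-method ROUND_ROBIN --protocol HTTP --subnet-id <subnet-id>
   neutron lb-member-create --address 10.0.0.11 --protocol-port 80 web-pool
   neutron lb-member-create --address 10.0.0.12 --protocol-port 80 web-pool
   neutron lb-vip-create --name web-vip --protocol HTTP --protocol-port 80 --subnet-id <subnet-id> web-pool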
3 Building Private Cloud with OpenStack for Submarine Mission Systems

3.1 Provided Hardware
For the deployment of the OpenStack Private Cloud laboratory, DST provided us with the following hardware:
• An IBM BladeCenter HS22 Type 7870 with four blades.
  o Each blade has two GPU Expansion Blades using Nvidia Tesla M2070 GPUs.
• Two IBM Networking OS Virtual Fabric 10Gb Switch Modules for IBM BladeCenter.
3.2 Deployed Software
Apart from the required OpenStack services, several other key software components are involved in the deployment of the private cloud laboratory. For baremetal provisioning and disk image backup/restoration, Clonezilla needs to be deployed. Additional hypervisors to be used within OpenStack may require additional software. In our deployment we used the KVM hypervisor and VMware hypervisors. The KVM hypervisor is the default for OpenStack and as such is usually installed by default. The VMware hypervisors, however, require an additional host for the ESXi master. The licensing of this software was provided by DST.
3.3 OpenStack Services
Figure 1 shows all of the different services and what they provide for the Juno release of the OpenStack private cloud. Not all services were required for our purposes; the services that we deployed for our cloud were:
• Keystone
• Nova (including the legacy networking service nova-network)
• Glance
• Horizon
• Cinder
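Once the controller is up, the deployed services can be confirmed from the command line. A minimal sketch, assuming admin credentials have been sourced (in DevStack, for example, via the openrc file):

   keystone service-list    # identity, compute, image, volume, ... should be listed
   nova service-list        # shows nova-compute, nova-network, etc. and the hosts they run on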
3.4 Deployment Layout
The planned deployment layout can be seen below in Figure 2. The four physical blade hosts are deployed as:
• Clonezilla Host (10.33.136.2)
• OpenStack Controller (10.33.136.3)
• ESXi Master (10.33.136.6)
• Compute Host (10.33.136.7)

The other hosts seen in Figure 2 are potential additional hosts. The figure also shows that each compute host can contain only one hypervisor type. This implies that, for any deployment, the minimum number of compute hosts is equal to the number of hypervisor types that the cloud is required to provide.
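A quick way to confirm which hypervisor type each compute host provides is the Juno nova client (a sketch, assuming admin credentials):

   nova hypervisor-list                  # one row per compute host
   nova hypervisor-show <hypervisor-id>  # the hypervisor_type field shows e.g. QEMU or VMware vCenter Server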
Figure 2: OpenStack deployment layout with networking setup.
3.5 BladeCenter Hardware Preparation
In order to prepare the BladeCenter for deployment, rolling firmware upgrades to the UEFI (BIOS) and Integrated Management Module (IMM) of all four blades were performed. The BladeCenter is accessed via the IBM Advanced Management Module (AMM). It may be necessary to configure the Java plugin in your web browser. The access credentials for the AMM are available on the wiki and are the same as those in the configuration originally provided by DST.
3.5.1 Java Plugin Setup
For a Fedora desktop, installing the Java plugin involves these additional steps:
1. Install the Java plugin in the web browser:
   sudo yum -y install icedtea-web
2. To allow AMM to upload images to the blade:
   sudo setsebool -P unconfined_mozilla_plugin_transition 0
3.5.2 IMM and Firmware Upgrades
Both IMM and UEFI firmware upgrades are performed through the Blade Tasks -> Firmware Update section of the AMM by specifying a .bin file that has been downloaded from the IBM support center web site. A UEFI firmware update may specify a prerequisite IMM version in its associated .txt documentation. That is the minimum IMM version for which the UEFI firmware will work, but using that IMM version may introduce other problems. Support files, including updated firmware, for the HS22 (Type 7870) can be downloaded from the IBM Support Center at [2]. You will need an IBM ID (site login) to access the page.

The preferred way to update the IMM and UEFI firmwares is:
1. Determine the current versions of the firmwares. These are listed in the AMM web GUI under Monitors -> Firmware VPD. Note that the IMM firmware is referred to as "Blade Sys Mgmt Processor", whereas the UEFI firmware is called "FW/BIOS".
2. Install the latest IMM firmware through the AMM web GUI's Blade Tasks -> Firmware Update section. This will automatically reboot the blade.
3. Finally, update the UEFI firmware through the Firmware Update section. This won't automatically reboot the blade; you must manually reboot the blade to complete the process.
3.5.3 Parts Explosion IBM's Help System has information on the HS22 (Type 7870) hardware [3].
3.5.4 Detected Hardware
Automatically detected hardware inventory is listed in the AMM GUI at cufflink-amm2.cs.adelaide.edu.au, Monitors -> Hardware VPD.

A complete summary of the BladeCenter hardware in tabular form is available at http://cufflink-amm2.cs.adelaide.edu.au/private/hwvpd_summary.php
3.5.5 Network Adapters
Each of the blades has an entry in the AMM under Monitors -> Hardware VPD. On-board NICs are listed under the blade itself. Each blade also has an Ethernet HSEC (High Speed Expansion Card) attached to Blade Expansion Module 2. Details of these NICs are tabulated at http://cufflink-amm2.cs.adelaide.edu.au/private/hwvpd_summary.php#hwvpd_mac

On-board NICs are 1Gbps fibre and those on the expansion card are 10Gbps fibre. The part number for the 10Gbps NICs is 46M6171, which is described on a non-IBM site as a Broadcom 10 Gigabit Gen 2 2-Port Ethernet CFFh expansion card (http://www.com-com.co.uk/IBM/parts/46M6171.ihtml). Searches for that part number restricted to the ibm.com domain turn up only a single forum post, but it is clear that this is the part. It appears to be the same as the part at [4]. It is a CFFh (Compact Form-Factor Horizontal) expansion card, installed according to the instructions at [5].
3.6 Deploying OpenStack on Virtual Hardware
When experimenting with new OpenStack configurations, particularly networking features, the use of virtual machines as the deployment hosts for OpenStack services is highly recommended. Virtual machines provide an idealised set of host and network hardware in which a new configuration can be debugged. Our experience with Neutron, for instance, was that multiple hardware and software configuration issues arose with very similar problem symptoms. Being able to eliminate some hardware (the physical switch, the host network interfaces) as a potential cause saved time in debugging and gave certainty that the OpenStack software configuration was correct. Time was saved despite the fact that, on the surface, extra work had to be performed in setting up a virtualised OpenStack installation in addition to the one on physical hardware. A virtual machine installation of OpenStack should be part of every OpenStack experimenter's toolkit. This report details two virtual machine based DevStack configurations that were helpful in prototyping and debugging the configuration of Neutron and other services, in the sections “DevStack All-In-One Single VM” and “DevStack Multi-Node VMs”.
3.7 Switch Module for IBM BladeCenter
The switch module in the racks is an IBM Virtual Fabric 10Gb Switch Module for IBM BladeCenter [6]. It has a web management GUI at http://cufflink-io2.cs.adelaide.edu.au. The best document to read to understand the Virtual Fabric Switch Module (VFSM) is the IBM Virtual Fabric 10Gb Switch Module for IBM BladeCenter Application Guide [7]. The installation guide is available from [8].
3.7.1 Ports
• Internal ports
  • 14 internal auto-negotiating ports: 1 Gb or 10 Gb to the server blades
  • Two internal full-duplex 100 Mbps ports connected to the management module
• External ports
  • Up to ten 10 Gb SFP+ ports (also designed to support 1 Gb SFP if required, flexibility of mixing 1 Gb/10 Gb)
  • One 10/100/1000 Mb copper RJ-45 used for management or data
  • An RS-232 mini-USB connector for serial port that provides an additional means to install software and configure the switch module
3.7.2 Firmware Upgrade
Instructions were derived from [9].
1. Log into the switch: telnet cufflink-io2.cs.adelaide.edu.au
2. List general system information: /info/sys/gen
3. Note the Hardware Part Number: Hardware Part Number: 46C7193
4. Also take note of which flash image is currently used to boot the switch: Software Version 6.8.1.0 (FLASH image1), active configuration.
5. Search the IBM support site for the current firmware update download page.
6. Download the .zip and .txt file from [10].
7. Install and enable a TFTP server (see the sketch after this list). Ensure that the firewall is opened for that service.
8. Unpack the zip file to /var/lib/tftpboot, where it will be served by the TFTP server:
   unzip ~/Desktop/ibm_fw_bcsw_24-10g-7.6.1.0_anyos_noarch.zip
9. Check connectivity to the TFTP server host by running the ping command on the switch: ping 129.127.2.69
10. Download the OS image to the flash slot (1 or 2) that is not currently in use:
    /boot/gtimg 2 129.127.2.69 GbESM-24-10G-7.6.1.0_OS.img
    • Enter 'y' to confirm the operation. The switch will verify that the image is correct before flashing it.
11. Download the boot image:
    /boot/gtimg boot 129.127.2.69 GbESM-24-10G-7.6.1.0_Boot.img
    • Enter 'y' to confirm the operation. The switch will verify that the image is correct before flashing it.
12. Configure the switch to boot from the new image: /boot/image image2
13. Save any unsaved configuration changes to flash: /save
    • Confirm the action by entering 'y'.
14. Reset the switch: /boot/reset
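For step 7, on the Fedora desktop used elsewhere in this guide, installing and opening up a TFTP server might look like the following sketch (the package, unit, and firewall service names should be verified against the installed release):

   sudo yum -y install tftp-server
   sudo systemctl enable tftp.socket
   sudo systemctl start tftp.socket                   # serves files from /var/lib/tftpboot
   sudo firewall-cmd --permanent --add-service=tftp
   sudo firewall-cmd --reload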
3.7.3 Industry Standard Command Line Interface (ISCLI)
This procedure is derived from the IBM Virtual Fabric 10Gb Switch Module ISCLI - Industry Standard CLI Command Reference [11].
By default, the VFSM boots into the IBM N/OS CLI. To access the ISCLI:
1. Log into the switch: telnet cufflink-io2.cs.adelaide.edu.au
2. Set the CLI mode after the next reboot to ISCLI: /boot/mode iscli

When in ISCLI mode, to switch back to the IBM N/OS CLI:
1. Switch to privileged EXEC: enable
   • The prompt will change from "Router>" to "Router#".
2. Switch to Global Configuration mode: configure terminal
   • The prompt will change to "Router(config)#".
3. Change the mode: boot cli-mode ibmnos-cli
4. Enter the following command to reset (reload) the switch: reload
3.7.4 VLANs
The VLAN features of the switch are described in the IBM Virtual Fabric 10Gb Switch Module for IBM BladeCenter Application Guide [12], Chapter 7. By default, all ports on a VFSM are set as untagged/access mode members of VLAN 1, and all ports are configured with PVID/Native VLAN = 1. At least, that is what the document states in one place; another part of the same document contradicts this, stating "By default, the N/OS software is configured so that tagging/trunk mode is disabled on all external ports, and enabled on all internal ports."

In overview, the VLAN features of the switch are:
• An untagged frame entering the switch will propagate to any ports on the PVID/Native VLAN of the port that it enters through.
• A frame that was already tagged before arriving at the switch will retain that VLAN ID as it traverses the switch. What happens to that tag when it leaves the switch depends on the mode of the port through which it exits.
• A tagged frame leaving the switch via an untagged/access mode member (port) has any VLAN ID (VID) stripped from the frame. An untagged frame leaving the switch through an untagged/access mode member stays untagged as it exits the switch.
• A tagged frame leaving the switch via a tagged/trunk mode member will have that tag preserved as it exits the switch. An untagged frame leaving the switch via a tagged/trunk mode member will have the PVID tag added to the frame.

The above discussion distinguishes handling of VLAN tags in frames based on the mode of the port through which they exit the switch. It is also possible to tag frames on entry into the switch. This is called ingress tagging. By default, ingress VLAN tagging/trunk mode is disabled on all ports. When enabled, frames are tagged with the PVID/Native VLAN ID of the ingress port, even if the frame already has a tag. That is, untagged frames entering the switch through that port are tagged with the PVID, and tagged frames entering the switch through that port are tagged a second time, with the new tag becoming the outer tag of the frame. If the frame leaves the switch via a tagging/trunk mode member, the outer tag is retained (the frame will leave the switch with two tags). If, on the other hand, the frame exits the switch via an untagged/access mode member, then the outer tag will be stripped, leaving only the tag that the frame had on entering the switch (or no tag if it was untagged). Thus, ingress VLAN tagging/trunk mode can be used to tunnel packets through a public domain without altering their original 802.1Q status.

The VFSM also supports protocol-based VLANs, where untagged/access mode frames are assigned a VLAN ID based on the frame type and Ethernet type of the frame, or, failing that, the PVID/Native VLAN is assigned to the frame. When a tagged/trunk mode frame arrives at the port, the VLAN ID in the frame's tag is used.
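As a purely illustrative sketch of how these features are driven from the IBM N/OS menu CLI (the VLAN number and port are invented, and the exact menu paths and option names should be checked against the Application Guide [12] for the installed firmware):

   /cfg/l2/vlan 136
        ena                  (enable VLAN 136)
        add INT3             (make internal port INT3 a member)
   /cfg/port INT3
        pvid 1               (untagged ingress frames stay on PVID/Native VLAN 1)
        tag ena              (tagged/trunk mode, so VLAN 136 frames exit with their tag)
   apply
   save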
3.7.5 Adding the Second Switch Module
This section details the steps taken to configure the second switch module added to the rack. When moving the switch into a different chassis, this procedure should be followed to configure the switch in its new location, bearing in mind that the AMM will be accessed on a different IP, and some details will differ according to the contents of the new chassis.
3.7.6 AMM Configuration
Log in to the AMM at http://cufflink-amm2.cs.adelaide.edu.au (129.127.9.81).
1. Navigate the web interface to I/O Module Tasks -> Configuration.
   • The first tab, "IPv6 Support", will show that I/O Module Bays 7 and 9 are populated.
   • I/O Module Bay ("slot") 7 is the first switch, providing 10Gb/s connectivity on the eth2 interface of the blades.
   • I/O Module Bay ("slot") 9 is the new, second switch, providing 10Gb/s connectivity on the eth3 interface of the blades.
2. Select the "Slot 9" tab to configure the second switch.
3. Under IPv4, New Static IP Configuration, enter the following settings:
   • Configuration status: Enabled
   • Configuration method: Static
   • IP address: 129.127.9.83
   • Subnet mask: 255.255.255.0
   • Gateway address: 129.127.9.99
4. Click Save, to the right of that section.
   • This changes the external IP address that the AMM uses to expose the switch management interface, but doesn't change the configuration of the switch itself.
   • This address is the same IP that the switch had when it was in the other chassis.
   • The IP address of the first switch is 129.127.9.82, with the same gateway and subnet mask.
4 Ubuntu 14.04 Desktop or Server Installation and Configuration
This section describes the common procedures and configuration of the installation of Ubuntu 14.04 on both physical and virtual machines.
4.1 Installation
Ubuntu 14.04 was installed from ubuntu-14.04.1-desktop-amd64.iso or ubuntu-14.04-server-amd64.iso with the following customisations:
• Add a user called “user” with the customary password.
• Time zone Australia/Adelaide.
All other options were left as defaults, including the language setting (US English). Some tools run into problems with non-default language/locale settings and the test coverage of non-American English is generally not as extensive.
4.2 Post-Installation

4.3 Networking
Networking is configured according to the role that the host will have, per any section referencing this one. However, VLANs, bridged interfaces, and bonded interfaces don't work "out of the box" on Ubuntu. In order to install the packages that support those features, it will be necessary to get a basic network configuration up and running, using a configuration similar to the following for /etc/network/interfaces (the IP address will change):
auto eth2
iface eth2 inet static
    address 10.33.136.2
    netmask 255.255.255.192
    gateway 10.33.136.62
    dns-nameservers 8.8.8.8 8.8.4.4
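The packages that provide those features can then be installed and, for example, a VLAN sub-interface added. The following is a sketch only; the VLAN ID 137, its subnet, and the address are invented for illustration:

   apt-get install vlan bridge-utils ifenslave

   # example 802.1Q sub-interface stanza for /etc/network/interfaces
   auto eth2.137
   iface eth2.137 inet static
       address 10.33.137.2
       netmask 255.255.255.192
       vlan-raw-device eth2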
4.4 Package Upgrades
Upgrade all packages to the latest versions:
apt-get update
apt-get upgrade
4.5 OpenSSH
1. In order to remotely manage the server, install OpenSSH. The service will be enabled and started automatically:
   apt-get install openssh-server
2. To enable root to ssh in directly, modify the pre-existing PermitRootLogin line in /etc/ssh/sshd_config to read:
   PermitRootLogin yes
3. Reload the ssh configuration:
   service ssh reload
4.6 General/Base Packages
For package searching using "apt-file search":
apt-get install apt-file
apt-file update
4.6.1 Additional Base Operating System Packages and NTP Configuration
1. Install additional packages as follows:
   apt-get install aptitude build-essential python-dev sudo traceroute \
       git ntp ntpdate openssh-server
2. Ensure that the ntp service is enabled at boot:
   update-rc.d ntp enable
3. Start the service:
   service ntp start
4. Test ntp:
   ntpdate -u pool.ntp.org
5 Clonezilla

5.1 Installation
Installing Clonezilla is mostly a matter of installing DRBL. The DRBL installation instructions (http://drbl.org/installation) have not been updated for Ubuntu 14.04 (the latest information is for 13.10), but those instructions will work for 14.04 with some minor adaptations. This procedure will install Clonezilla SE (Server Edition).
5.2 Base Operating System
Ubuntu 14.04 is installed per the procedure in Section 4. A static IP address is configured on the management network (10.33.136.0/26) on the eth2 interface, or, if the on-board 1Gb interfaces are disabled in the UEFI configuration, on eth0. In the instructions that follow, it is assumed that Clonezilla controls hosts on the management network on eth0.
5.3 Package Installation
1. Install the key file that authenticates the DRBL package repository:
   wget -q http://drbl.org/GPG-KEY-DRBL -O- | sudo apt-key add -
2. Append the following lines to /etc/apt/sources.list:
   deb http://archive.ubuntu.com/ubuntu trusty main restricted universe multiverse
   deb http://free.nchc.org.tw/drbl-core drbl stable
3. Update the local package information:
   sudo apt-get update
4. Install the DRBL package and dependencies that were apparently omitted from its spec file:
   sudo apt-get -y install drbl mkpxeinitrd-net clonezilla \
       isc-dhcp-server tftpd-hpa nfs-kernel-server
5.4 DRBL Server Interactive Configuration
Interactively configure the DRBL server as follows:
1. Run the configuration utility: drblsrv -i
2. Enter N to not download network installation boot images for various distributions.
3. Enter N to not send console output on the client to the serial port.
4. Enter N to not upgrade OS packages.
5. Enter 1 to use the locally installed kernel on the clients (3.13.0-32-generic x86_64).
5.5 DRBL Client Push Interactive Configuration The following procedure interactively configures how DRBL manages clients. However, note that if you have a pre-existing /etc/drbl/drblpush.conf file, then you re-deploy the DRBL configuration from that
15
using the command: drblpush -c /etc/drbl/drblpush.conf A full copy of /etc/drbl/drblpush.conf for the blade hardware is supplied later in this document. Interactively configure how DRBL manages clients: 1. Run the configuration utility: drblpush -i 2. For the NDS domain, enter localdomain. 3. For the NIS/YP domain, enter localdomain. 4. For the client hostname prefix, enter compute- (note the trailing '-'). • Hostnames offered to clients can be customised individually in /etc/drbl/client-ip-hostname, which is in /etc/hosts format. 5. The drblpush script will provide different options here depending on the configured interfaces on the system. • On a system with a single interface, it will ask you to confirm that you want to continue using eth0 for both the public internet access port and to manage the clients. • On the blade hardware, it assumes that the SOL port, usb0 is the public internet and that the clients are managed on eth0, which is only correct in the latter case. 6. When asked whether you want to collect the MAC addresses of the clients, enter N. 7. Enter Y to configure DRBL to offer the same IP address to clients on every boot. 8. Press Enter to accept the default filename in which to save MAC addresses (macadr-eth0.txt). 9. For the start of the client DHCP IPv4 address range (last octet), enter 7. 10. Enter Y to accept the configured range (10.33.136.7 onwards). 11. Press Enter to continue. 12. Enter 1 (DRBL Single System Image) mode for clients. In this mode thin, all clients will use the same shared /etc and /var directory image in tmpfs. • Option 0 here allows each client to access its own private /etc and /var served over NFS. • This setting only applies to the operation of thin clients. In the laboratory, we are using Clonezilla to read and write bitwise images of the hard drive of compute nodes, which then boot from the local disk, and don't run as thin clients. 13. Enter 1, for Clonezilla box mode, where /etc and /var are given to Clonezilla via tmpfs. 14. Enter /var/partimag as the directory to store saved images. 15. Enter N to not use local swap partitions in thin clients. 16. Enter 2 to boot clients in text mode. 17. Enter Y to set the root password for clients. 18. Enter the desired thin client root password and enter it again to confirm. 19. Enter N to not set a pxelinux password for clients. 20. Enter N to not set a client boot prompt. 21. Enter N to not use a graphic background for clients. 22. Enter Y to let thin client users access local devices. 23. Enter N to not set a public IP alias interface on clients. 24. Enter N to not set up the Clonezilla host as an xdmcp terminal server. 25. Enter N to not configure DRNL as a NAT server for the clients. • The clients will receive an IP address that can route to the internet directly, but they're unlikely to need internet access while running Clonezilla anyway. 26. Press Enter to acknowledge the message that the current host supports NFS over TCP. 27. Enter Y to deploy. 28. Edit /etc/drbl/drblpush.conf and insert the correct gateway in the [eth0] section:
gateway=10.33.136.62
The resultant /etc/drbl/drblpush.conf file is:
#Setup for general
[general]
domain=localdomain
nisdomain=localdomain
localswapfile=no
client_init=text
login_gdm_opt=
timed_login_time=
maxswapsize=
ocs_img_repo_dir=/var/partimag
total_client_no=0
create_account=
account_passwd_length=8
hostname=compute-
purge_client=no
client_autologin_passwd=
client_root_passwd=PASSW0RD
client_pxelinux_passwd=
set_client_system_select=no
use_graphic_pxelinux_menu=no
set_DBN_client_audio_plugdev=yes
open_thin_client_option=no
client_system_boot_timeout=
language=en_US.UTF-8
set_client_public_ip_opt=no
config_file=drblpush.conf
collect_mac=no
run_drbl_ocs_live_prep=yes
drbl_ocs_live_server=
clonezilla_mode=clonezilla_box_mode
live_client_branch=alternative
live_client_cpu_mode=i386
drbl_mode=drbl_ssi_mode
drbl_server_as_NAT_server=no
add_start_drbl_services_after_cfg=yes
continue_with_one_port=
#Setup for eth0
[eth0]
interface=eth0
mac=macadr-eth0.txt
ip_start=7
gateway=10.33.136.62
5.6 Collecting Client MAC Addresses Hosts are put under the control of DRBL or Clonezilla by adding their MAC addresses to a configuration file. You can collect MAC addresses manually by displaying the details of an interface with the ifconfig command. Alternatively, DRBL includes a stand-alone utility, drbl-collect-mac, to gather the MAC addresses of any clients that connect to the DHCP server. It will write these to a file named after the interface on which the DHCP server is listening, as configured by the INTERFACES= setting in /etc/default/isc-dhcp-server.
The MAC address file will typically be called /etc/drbl/macadr-eth0.txt. The format is simple: one MAC address per line, as 6 hex octets separated by colons. If you edit this file, do not add leading or trailing spaces or blank lines, because DRBL will otherwise insert garbage entries into /etc/dhcp/dhcpd.conf. Note that /etc/drbl/macadr-eth0.txt will be completely rewritten to contain only the MAC addresses of the clients that connected to the DHCP server when you run drbl-collect-mac, so it's a good idea to back up that file if you don't intend to reboot every client.
1. On the Clonezilla host, back up the previously collected MAC addresses:
cp /etc/drbl/macadr-eth0.txt /etc/drbl/macadr-eth0.`date +%Y%m%d-%H%M%S`
2. Run drbl-collect-mac:
drbl-collect-mac eth0
3. Start a client, configured to boot from PXE, and then enter Y at the drbl-collect-mac prompt to collect its MAC address.
4. Enter 1 to list the collected MAC addresses.
5. Enter q to quit collecting.
The collected MAC addresses should be in /etc/drbl/macadr-eth0.txt. It will look something like this:
52:54:00:2c:25:3b
52:54:00:b1:a6:a6
52:54:00:92:0a:d3
5.7 Configuring New Clients
After a client's MAC address has been added to /etc/drbl/macadr-eth0.txt, the NFS exports and DHCP server configuration must be updated to correctly boot the client with PXE. Assuming you have an /etc/drbl/drblpush.conf configuration file, you can reconfigure clients whose MAC addresses have been added using the following command:
yes | drblpush -c /etc/drbl/drblpush.conf
This will reconfigure new clients according to the previously saved settings. Clients will be assigned IP addresses in the order that their MAC addresses are listed in /etc/drbl/macadr-eth0.txt. Given that ip_start in /etc/drbl/drblpush.conf is 7, the first MAC address will correspond to 10.33.136.7. Subsequent MAC addresses in that file will be mapped to 10.33.136.8, .9, and so on.
5.8 Fix Subnet Masks
As noted at [13], DRBL doesn't understand subnet masks: it is hard-coded to assume 24-bit class C subnets. You must therefore fix /etc/dhcp/dhcpd.conf every time that you run drblpush, which will probably only be when adding new MAC addresses to /etc/drbl/macadr-eth0.txt. (A scripted version of this fix is sketched after the procedure.)
1. Open /etc/dhcp/dhcpd.conf in an editor.
2. Near the top of the file, change the subnet-mask option to the correct subnet:
option subnet-mask 255.255.255.192;
3. At the bottom of the file, change the 10.33.136.0 subnet accordingly:
subnet 10.33.136.0 netmask 255.255.255.192 {
    option subnet-mask 255.255.255.192;
    option routers 10.33.136.62;
    next-server 10.33.136.2;
    host compute-10-33-136-7 {
        hardware ethernet 00:10:18:ab:00:04;
        fixed-address 10.33.136.7;
        option host-name "compute-10-33-136-7";
    }
}
4. Also, uncomment any 'option host-name "";' lines in the subnet section. This will ensure that clients use the hostname provided by the DHCP server.
5. Restart the DHCP server:
service isc-dhcp-server restart
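Because this fix is needed after every drblpush run, it can be scripted. The following helper is only an illustrative sketch (it is not part of DRBL, and it assumes the 255.255.255.192 subnet used in this deployment); the 'option host-name' check in step 4 is still manual:
#!/bin/sh
# Illustrative sketch: rewrite DRBL's hard-coded /24 netmask after running drblpush.
CONF=/etc/dhcp/dhcpd.conf
# Replace every 255.255.255.0 (option subnet-mask and the subnet declaration) with the real mask.
sed -i 's/255\.255\.255\.0/255.255.255.192/g' "$CONF"
service isc-dhcp-server restart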
5.8.1 Fix Clonezilla Disk Detection
On some hardware, Clonezilla is unable to detect the local hard drives and gives up on trying to save or restore disks or partitions. This problem affects the IBM HS22 blades. It is probably caused by slow initialisation of the disks by the LSI RAID controller, since the fix we discovered is to add extra delay when Clonezilla is reading the partition table during cloning. The procedure to fix disk detection is as follows:
1. Edit /usr/share/drbl/sbin/ocs-functions.
2. Find the function called gen_proc_partitions_map_file() and insert the following lines at the start of the function:
# When the system first starts up, /proc/partitions shows an entry for sr0 only.
# After a short delay, it shows sda, sda1, sda2, etc and sdb. Wait for more than just sr0.
# There are two lines of header data in /proc/partitions before the first line of data.
while [ "$(wc -l < /proc/partitions)" -lt 4 ]; do
    echo 'Waiting for disks to initialize...'
    sleep 5
done
When Clonezilla boots a host to be cloned using PXE, it mounts /usr using NFS, and that includes the Clonezilla source code edited above. Clonezilla tries to find the disk or partitions to save or restore in /proc/partitions. As described in the comments added to the code, when Clonezilla fails to find the local disks and partitions on the blades, /proc/partitions contains only a header line, a blank line and an entry for sr0. The expected entries for sda, sda1, etc. are missing, but appear a few seconds later, when the hardware is fully initialised. The code simply waits until /proc/partitions contains at least 4 lines.
5.9 Verifying the Configuration
1. Check that all services are running (a scripted check is sketched after this list):
service isc-dhcp-server status
service tftpd-hpa status
service nfs-kernel-server status
2. Check for errors in /var/log/syslog.
3. Check that /etc/exports has entries for each client with a MAC address listed in /etc/drbl/macadr-eth0.txt:
# Generated by DRBL at 19:36:06 2015/01/29
/tftpboot/node_root 10.33.136.7(ro,async,no_root_squash,no_subtree_check)
/usr 10.33.136.7(ro,async,no_root_squash,no_subtree_check)
/home 10.33.136.7(rw,sync,no_root_squash,no_subtree_check)
/var/spool/mail 10.33.136.7(rw,sync,root_squash,no_subtree_check)
/opt 10.33.136.7(ro,async,no_root_squash,no_subtree_check)
/var/partimag 10.33.136.7(rw,sync,no_root_squash,no_subtree_check)
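These checks can be combined into a small script. The following is only an illustrative sketch (it is not part of DRBL) that warns if a service does not appear to be running or if a client IP known to the DHCP server has no NFS export:
#!/bin/sh
# Illustrative verification sketch for the Clonezilla/DRBL host.
for svc in isc-dhcp-server tftpd-hpa nfs-kernel-server; do
    if ! service "$svc" status 2>/dev/null | grep -q running; then
        echo "WARNING: $svc does not appear to be running"
    fi
done
# Every fixed address handed out by the DHCP server should also appear in /etc/exports.
for ip in $(awk '/fixed-address/ {gsub(";",""); print $2}' /etc/dhcp/dhcpd.conf); do
    grep -q "$ip" /etc/exports || echo "WARNING: no NFS exports for client $ip"
done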
5.10 Client UEFI Settings
The IBM HS22 (Type 7870) blades will fail to boot by PXE unless they are configured to prefer legacy PXE mode over UEFI PXE support. To configure the blades correctly:
1. Reboot the blade.
2. Press F1 to enter the system configuration utility.
3. Select "System Settings" and press Enter.
4. Select "Network" and press Enter.
5. Select "Network Boot Configuration" and press Enter.
6. Select the relevant network adapter, as identified by its MAC address, and press Enter. (The 1Gb ethernet ports will contain the words "Onboard PFA" in their name, whereas the 10Gb ports that we are using instead will contain "Slot 2 Dev PFA" in the name.)
7. Set the PXE Mode to "Legacy Support" and press Enter.
8. Select Save Changes and press Enter.
9. Press Esc five times and then press Y to save the changes.
5.11 DRBL Commands 5.11.1 Restarting Services DRBL can restart all services with the following command: drbl-all-service restart
5.11.2 Controlling DRBL Interactively
The dcs command brings up a text-mode menu system that can control all clients, or a subset selected by their MAC or IP addresses. Selected clients can be rebooted, booted as thin clients, or have an image saved from or restored to them using Clonezilla. Various configuration options for DRBL are also available through these menus.
5.11.3 DRBL Command Line Where dcs changes the state of a client, it will output a command line for drbl-ocs that can be run later to achieve the same result as the interactive menus.
5.11.4 Saving An Image
Use the following procedure to save an image of a node using Clonezilla (a non-interactive equivalent is sketched after the procedure):
1. Prepare the node exactly how you would like it. Be particularly mindful of requirements on user accounts, DHCP configuration and the hostname, as described in the section "Standard Image Requirements".
2. If the node is an OpenStack compute node, log in as stack@controller and run /opt/stack/devstack/remove-node.sh to remove instances and disable the nova-compute service on the node, e.g.:
/opt/stack/devstack/remove-node.sh compute-10-33-136-7
3. Configure the DRBL/Clonezilla dcs utility to save an image of the node, as follows:
1. Log in as root on the Clonezilla host:
ssh root@clonezilla
2. Run dcs.
3. From the menu, select "Part Select Client(s) by IP or MAC address" and press Enter.
4. Select "by_IP_addr_list" and press Enter.
5. Mark the node to save by pressing the Space key, then press Enter.
6. Select the mode "clonezilla-start" and press Enter.
7. Select "Beginner" to use defaults for most options, then press Enter.
8. Select "save-disk", then press Enter. 9. Select "Now_in_server" to enter the image name now, rather than when Clonezilla boots the node, then press Enter. 10. Input the base name of the image file in the /var/partimag directory and press Enter. 11. Input sda as the disk to save and press Enter. 12. Select "Skip checking/repairing source file system" and press Enter. 13. Select "Yes, check the saved image" and press Enter. 14. Select "-p poweroff" as the post-imaging command. The node will shut down when imaging is complete. 15. Accept the default image chunk size of "1000000" (in units of MB, i.e. 1TB total) and press Enter. 16. The dcs command will then reconfigure DRBL and quit. 4. Start the IBM Advanced Management Module (AMM) remote control GUI to view the console of the node. 5. Remotely reboot the node: ssh root@ reboot 6. When the UEFI boot screen appears, press F1 to enter the configuration menus and configure "PXE Network" as the first entry in the boot order, as follows: 1. Select "Start Options" and press Enter. 2. Choose the option to boot from the "PXE Network" and exit the UEFI configuration utility. 7. The system will continue booting. Clonezilla will start by PXE boot and will save the image to the Clonezilla host in the /var/partimag directory. 8. When Clonezilla has finished, the node will power off. 9. Once the node has powered off, use the AMM remote control facility to power the node back on. For a baremetal-provisioned node, always ensure that the boot order is set such that "PXE Network" is the last option. Otherwise, Clonezilla will boot when you actually intend to boot the operating system on the hard drive.
5.12 Baremetal Provisioning with Clonezilla
Clonezilla can be interactively configured to restore an image onto a hard drive or partition using the dcs command, in much the same way that it is used to save an image. When dcs quits, it saves a script in /tmp that contains the command used to install the image on the node. That command will be something like:
drbl-ocs -b -g auto -e1 auto -e2 -r -x -j2 -p reboot -h "$IP" -l en_US.UTF-8 \
    startdisk restore "$IMAGE" sda
Here, $IP is the IP address of the node and $IMAGE is the directory under the /var/partimag directory containing the image. Note that the command contains the "-p reboot" option, which causes Clonezilla to reboot the node as its final action after imaging. The standard options listed in the command above are preserved in a shell script on the Clonezilla host: /root/provision-node.sh. Typical usage would be:
/root/provision-node.sh 10.33.136.7 gold-devstack-juno-ubuntu-4-nics
The provisioning mechanism relies on the following configuration of the UEFI boot manager:
• There is an entry in the boot order to boot each possible operating system contained in images that might be assigned to the node. That is, if the node could be assigned CoreOS or Ubuntu, then the boot order will contain entries to boot both CoreOS and Ubuntu.
• All entries in the boot order that boot an operating system image precede the "PXE Network" entry. Typically, the "PXE Network" entry would appear last in the boot order.
The way that baremetal provisioning using Clonezilla works is as follows (a sketch of the scripted flow appears after the list):
1. The provision-node.sh script is run on the Clonezilla host to configure DRBL to restore an image to the specified node.
2. An automated process logs into the node as root and erases the MBR/GPT with the command:
dd if=/dev/zero of=/dev/sda bs=1024 count=64; reboot
3. The node reboots, the UEFI boot manager discovers the lack of a GPT, and all operating system boot options fail. The boot manager falls back on the "PXE Network" boot option.
4. Clonezilla boots over PXE and installs the image, then reboots the node again.
5. This time the UEFI boot manager finds a valid GPT and boots the newly installed operating system.
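Put together, the scripted flow looks roughly like the following sketch. The real provision-node.sh is kept in version control (see the next section), and in practice steps 1 and 2 are split between that script and the OpenStack Development GUI, so the combined script below is illustrative only; the SSH options are assumptions:
#!/bin/sh
# Illustrative sketch: provision a node with Clonezilla. Usage: <node-ip> <image-name>
IP="$1"
IMAGE="$2"
# 1. Configure DRBL/Clonezilla to restore the image to this node on its next PXE boot.
drbl-ocs -b -g auto -e1 auto -e2 -r -x -j2 -p reboot -h "$IP" -l en_US.UTF-8 \
    startdisk restore "$IMAGE" sda
# 2. Erase the MBR/GPT so that every local boot entry fails and UEFI falls back to PXE.
ssh -o 'StrictHostKeyChecking no' "root@$IP" \
    'dd if=/dev/zero of=/dev/sda bs=1024 count=64; reboot'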
5.12.1 Saved Configuration
Crucial host operating system configuration files for the Clonezilla host, as well as the configuration of the Clonezilla software, are saved in version control under openstack_project/Clonezilla/. The files are:
• /root/provision-node.sh - The script that configures Clonezilla to roll out a specified image to the host at a particular IP address. This script is called by SSH as the root user from the OpenStack Development GUI.
• /etc/hostname, /etc/hosts, /etc/network/interfaces - Hostname and static IP address.
• /etc/drbl/drblpush.conf - DRBL configuration.
• /etc/drbl/macadr-eth0.txt - MAC addresses of DST-provided blades.
5.13 Standard Image Requirements
5.13.1 Standardised User Accounts
Standard cloud images for OpenStack have no common convention for the name of the unprivileged user account. They typically also do not support password authentication and instead rely on cloud-init to inject an SSH key. So that we can access baremetal and virtual instances with minimal concern for the underlying operating system, we standardise on a login account and SSH key authentication, as follows (a command sketch for preparing a baremetal node appears after this list):
• For both baremetal and virtual machine instances, SSH login is under the "user" account. Login is authenticated using the "provisioning.pub" public key.
• On baremetal systems, the "user" account must be added manually, prior to taking a Clonezilla image of the node.
• On virtual machines, the "user" account is injected using cloud-init.
• On baremetal systems, the root account has its standard password and is also configured to accept "provisioning.pub" for remote login. SSH login as root is explicitly enabled if it is not allowed by default.
• On baremetal systems running nova-compute, the DevStack "stack" user is also configured with its standard password and the "provisioning.pub" key in its authorized_keys file.
• On baremetal systems, the "user" account is configured to allow passwordless sudo to root by creating a file /etc/sudoers.d/user with the following contents:
user ALL=(ALL:ALL) NOPASSWD: ALL
• On virtual machines, cloud-init is used to inject the same configuration for the "user" account.
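On a baremetal node, the account setup described above can be performed with commands along the following lines before the Clonezilla image is taken. This is a sketch only; it assumes the provisioning public key has been copied to /root/provisioning.pub:
# Create the standard unprivileged account and authorise the provisioning key.
useradd -m -s /bin/bash user
install -d -m 700 -o user -g user /home/user/.ssh
cat /root/provisioning.pub >> /home/user/.ssh/authorized_keys
chown user:user /home/user/.ssh/authorized_keys
chmod 600 /home/user/.ssh/authorized_keys
# Allow passwordless sudo for "user".
echo 'user ALL=(ALL:ALL) NOPASSWD: ALL' > /etc/sudoers.d/user
chmod 440 /etc/sudoers.d/user
# Also accept the provisioning key for root logins.
install -d -m 700 /root/.ssh
cat /root/provisioning.pub >> /root/.ssh/authorized_keys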
5.13.2 Hypervisor Hosts
Nodes that run the OpenStack nova-compute (hypervisor) service must meet the following additional requirements:
• The node must have a unique host name. Since all compute nodes except the controller are installed by cloning with Clonezilla, this means that these baremetal nodes must use the hostname provided by the DHCP server running on the Clonezilla host. The hostname will be of the form compute-A-B-C-D, where A.B.C.D is the IP address of the DHCP-managed interface of the node.
• All OpenStack services must start automatically on boot, and the nova-compute service must be re-enabled
with the equivalent of the following command:
nova service-enable "$(hostname)" nova-compute
glance image-create \
    --name "$COMPONENT_HOST_VMDK_NAME" \
    --file "$COMPONENT_HOST_VMDK" \
    --progress \
    --container-format bare \
    --disk-format vmdk \
    --property image_type="snapshot" \
    --property hypervisor_type="vmware" \
    --property vmware_disktype="streamOptimized" \
    --property vmware_image_version="1" \
    --property vmware_adaptertype="ide" \
    --is-public true
The script also imports a saved snapshot of a KVM instance with a non-default kernel, on which NVIDIA CUDA 7 has been installed and some CUDA samples have been compiled, since those were quite time-consuming activities.
8.1.13 /opt/stack/devstack/remove-node.sh
This script disables the OpenStack nova-compute service on a host that is specified by its hostname. This is the reason why care must be taken in setting the hostnames of compute nodes running OpenStack services; the OpenStack commands use the hostname to select services to disable. OpenStack doesn't provide an API to remove a compute node from the database, but a compute node must somehow be disabled if the physical host is reprovisioned for some other purpose, e.g. running a Linux system rather than a hypervisor. So the approach taken by the OpenStack Development GUI is to
run remove-node.sh over an SSH connection, to disable the nova-compute service before the physical host is reprovisioned. The script also deletes any virtual machine instances under the control of the compute node. OpenStack recognises that when none of a Host Aggregate's member hosts has an enabled nova-compute service, the Host Aggregate and the corresponding Availability Zone are unavailable. This is used in the OpenStack Development GUI to detect when a compute node has been removed by baremetal provisioning. The relationship between compute nodes, Host Aggregates and Availability Zones is discussed in further detail in the section "OpenStack Development GUI".
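The actual script is kept in the source repository; a sketch of the kind of logic it implements, reconstructed from the description above using Juno-era nova client commands, is:
#!/bin/sh
# Illustrative sketch: remove-node.sh <hostname>, e.g. remove-node.sh compute-10-33-136-7
HOST="$1"
. /opt/stack/devstack/openrc admin admin
# Delete any instances still scheduled on this compute node.
for id in $(nova list --all-tenants --host "$HOST" | \
        grep -Eo '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}'); do
    nova delete "$id"
done
# Disable the nova-compute service so the scheduler no longer uses the host.
nova service-disable "$HOST" nova-compute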
8.1.14 /etc/rc.local
As the Controller also acts as a PCI passthrough-capable KVM compute node, PCI devices must be configured at startup according to the section "PCI Passthrough". Configuration was added to /etc/rc.local to detach the GPU devices from the physical host, so that they can be allocated to virtual machines, and to disable the secondary functions of the two GPUs (the .1, audio function).
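For illustration, the detach step can be done with virsh from /etc/rc.local roughly as follows. The PCI addresses below are placeholders (they must match the GPU functions reported by lspci on the controller), and taking the audio functions offline via the sysfs remove interface is just one possible approach:
# Example rc.local fragment; PCI addresses are placeholders.
# Detach the GPUs from the host so KVM can pass them through to guests.
virsh nodedev-detach pci_0000_0f_00_0
virsh nodedev-detach pci_0000_28_00_0
# Take the GPUs' secondary (.1, audio) functions offline.
echo 1 > /sys/bus/pci/devices/0000:0f:00.1/remove
echo 1 > /sys/bus/pci/devices/0000:28:00.1/remove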
8.1.15 Saved Configuration for Compute Nodes
8.1.16 /opt/stack/devstack/local.conf
The local.conf files on the KVM and VMware compute nodes are, for the most part, the same. The local.conf file for the VMware hypervisor is shown below:
[[local|localrc]]
#RECLONE=yes
# Branches.
#KEYSTONE_BRANCH=stable/juno
#NOVA_BRANCH=stable/juno
#NEUTRON_BRANCH=stable/juno
#SWIFT_BRANCH=stable/juno
#GLANCE_BRANCH=stable/juno
#CINDER_BRANCH=stable/juno
# NOTE: Heat breaks stack.sh on stable/juno.
#HEAT_BRANCH=master
#TROVE_BRANCH=stable/juno
#HORIZON_BRANCH=stable/juno
MULTI_HOST=1
HOST_IP=10.33.136.4
SERVICE_HOST=controller
FLAT_INTERFACE=eth0
#FIXED_RANGE=10.0.0.0/24
#FIXED_NETWORK_SIZE=254
FIXED_RANGE=10.33.136.0/26
FIXED_NETWORK_SIZE=46
FLOATING_RANGE=10.33.136.48/28
#ENABLE_FILE_INJECTION=True
DATABASE_TYPE=mysql
MYSQL_HOST=$SERVICE_HOST
RABBIT_HOST=$SERVICE_HOST
GLANCE_HOSTPORT=$SERVICE_HOST:9292
CINDER_SERVICE_HOST=$SERVICE_HOST
CINDER_ENABLED_BACKENDS=lvm:default
# Services
ENABLED_SERVICES=n-cpu,n-net,n-api,c-vol
# VNC
NOVA_VNC_ENABLED=True
VNCSERVER_PROXYCLIENT_ADDRESS=$HOST_IP
VNCSERVER_LISTEN=$HOST_IP
NOVNCPROXY_URL="http://$SERVICE_HOST:6080/vnc_auto.html"
# Credentials
DATABASE_PASSWORD=Passw0rd
MYSQL_PASSWORD=Passw0rd
ADMIN_PASSWORD=Passw0rd
SERVICE_PASSWORD=Passw0rd
SERVICE_TOKEN=Passw0rd
RABBIT_PASSWORD=Passw0rd
# Enable Logging
LOGFILE=/opt/stack/logs/stack.sh.log
LOGDAYS=1
VERBOSE=True
LOG_COLOR=False
SCREEN_LOGDIR=/opt/stack/logs
# VMware
VIRT_DRIVER=vsphere
#CINDER_DRIVER=vsphere
VMWAREAPI_IP=10.33.136.5
VMWAREAPI_USER=[email protected]
VMWAREAPI_PASSWORD='P455w0rd!'
VMWAREAPI_CLUSTER=OpenStack

[[post-config|$NOVA_CONF]]
[DEFAULT]
# No longer doing: auto_assign_floating_ip = True
pci_passthrough_whitelist=[{ \\"vendor_id\\": \\"10de\\", \\"product_id\\": \\"06d2\\" }]
# If we don't specify a datastore regex, Compute will use the first datastore returned by the vSphere API.
# Do not repeat the [[post-config|$NOVA_CONF]] header. It will hide what follows.
[vmware]
datastore_regex = NFSDatastore.*

# Experimental, not working.
#[[post-config|$CINDER_CONF]]
#
#[DEFAULT]
#default_volume_type = vsphere
#enabled_backends = vsphere
#[vsphere]
#volume_driver = cinder.volume.drivers.vmware.vmdk.VMwareVcVmdkDriver
#vmware_host_ip = $VMWAREAPI_IP
#vmware_host_username = $VMWAREAPI_USER
#vmware_host_password = $VMWAREAPI_PASSWORD
#volume_backend_name = vsphere
Some notable features of this configuration are as follows:
• The FLAT_INTERFACE (the management/private network) is eth0 for the vmware-nova-compute virtual machine, but it is eth2 for the physical blade that runs the KVM compute node. (That configuration is not shown, but is provided in the source code repository.)
• When using VIRT_DRIVER=vsphere, it is necessary to specify CINDER_ENABLED_BACKENDS=lvm:default. The name of the Cinder backing file is derived from this setting, and the default file name is different for VMware from what it is for KVM. We want the file name to be the same on both hypervisor types to simplify the code in restack.sh, which re-enables a compute node after a reboot.
• Although VMware supports PCI passthrough to virtual machines created through the vSphere Web Client (see the section "VMware"), these capabilities are not exposed through OpenStack. Nevertheless, pci_passthrough_whitelist is set the same as it is on the KVM hypervisor (mainly to verify the lack of functionality).
• The VMWAREAPI_USER and corresponding VMWAREAPI_PASSWORD settings are the login credentials for vCenter Server, as used in the vSphere Web Client.
• The cluster name is specified by VMWAREAPI_CLUSTER.
• stack.sh lacks a setting for the name of the datastore. The nova-compute vsphere driver uses the VMware API to select the datastore for virtual machines and temporary files, and that will return the first datastore on a given ESXi host. If the host has a local datastore, the result is that virtual machines will be placed on it. We therefore explicitly set a regular expression naming the datastore to use in the [vmware] section of /etc/nova/nova.conf.
8.1.17 /etc/rc.local
On compute nodes that are started by baremetal provisioning, the standard system-wide post-initialisation script, /etc/rc.local, has custom contents to do the following:
• Run restack.sh to re-enable OpenStack services, since with DevStack they don't start up automatically on reboot.
• Run re-enable.sh to re-enable the OpenStack nova-compute service for the compute node, signifying that it is now available for use.
• Set up PCI passthrough of the GPU devices.
8.1.18 /opt/stack/devstack/restack.sh
This custom script appears on any compute node that comes and goes by baremetal provisioning. It restores operation of the Cinder backing file (/opt/stack/data/stack-volumes-default-backing-file) and the OpenStack services, each within their own screen session. The script is necessary to correct a deficiency in the standard DevStack rejoin-stack.sh script, which is not amenable to automated (non-interactive) execution due to screen's handling of pseudo terminals. The script also sets the current IP address of the compute node within /etc/nova/nova.conf and, when Neutron is in use, the Neutron ML2 configuration file, since stack.sh inserts the compute node's IP address into these files. The IP address changes when the compute node is installed on a different (or additional) physical host by baremetal provisioning because it is allocated by DHCP. The script is run from /etc/rc.local when the system boots, with the following line:
ssh -i /opt/stack/.ssh/provisioning -o 'StrictHostKeyChecking no' stack@localhost "devstack/restack.sh" >>"$LOG" 2>&1 &
8.1.19 /opt/stack/devstack/re-enable.sh
This custom script re-enables the nova-compute service, which was disabled prior to the host being shut down as part of the baremetal provisioning process.
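The core of such a script is a single service-enable call keyed on the local hostname, for example (a sketch only; the actual script is kept in the repository):
#!/bin/sh
# Illustrative sketch: re-enable this host's nova-compute service after provisioning.
. /opt/stack/devstack/openrc admin admin
nova service-enable "$(hostname)" nova-compute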
8.1.20 /etc/dhcp/dhclient-exit-hooks.d/hostname This custom script sets the hostname to that supplied by the DHCP server (running on the Clonezilla host). This is required for correct operation of remove-node.sh, restack.sh and re-enable.sh.
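dhclient exit hooks receive the details of the DHCP lease in environment variables such as $new_host_name, so a minimal version of this hook could look like the sketch below (consistent with the description above, but not necessarily the exact file used):
# /etc/dhcp/dhclient-exit-hooks.d/hostname (sketch)
case "$reason" in
    BOUND|RENEW|REBIND|REBOOT)
        if [ -n "$new_host_name" ]; then
            hostname "$new_host_name"
            echo "$new_host_name" > /etc/hostname
        fi
        ;;
esac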
8.1.21 /etc/network/interfaces
The host network interface configuration differs significantly between the KVM compute nodes and the VMware compute node virtual machine.
In the case of KVM compute nodes, as with the controller, promiscuous mode must be enabled for nova-network to intercept traffic destined for the IP addresses of virtual machine instances. Since multiple KVM compute nodes can be created by baremetal provisioning, they have IP addresses assigned by DHCP. The relevant configuration is:
auto eth2
iface eth2 inet dhcp
    up ip link set $IFACE promisc on
As always with nova-network, the IP address of the physical NIC is moved to a bridge, br100.
In the case of the VMware compute node (of which there is only ever one), we explicitly create the br100 bridge, set a static IP address on it, set it into promiscuous mode and add the eth0 interface of the virtual machine. The compute node on VMware must be configured in this way for instances to be able to reach the internet. It is not sufficient to simply allow stack.sh to configure br100 automatically and move the IP address of the network interface to it. Note that if br100 on the controller is configured in this way, it results in ICMP traffic (pings) being duplicated.
auto eth0
iface eth0 inet manual

auto br100
iface br100 inet static
    bridge_ports eth0
    bridge_stp off
    bridge_maxwait 0
    bridge_fd 0
    address 10.33.136.4
    netmask 255.255.255.192
    gateway 10.33.136.62
    dns-nameservers 8.8.8.8
    up ip link set $IFACE promisc on
8.2 DevStack All-In-One Single VM
8.2.1 Overview
The following procedure describes how to prepare a VM running a single-node "all-in-one" DevStack setup that includes Ironic (baremetal provisioning) as the hypervisor. If you don't want Ironic, then the relevant services can be removed from local.conf. If you want Ironic and the KVM hypervisor as well, then follow the instructions in the section "DevStack Multi-Node VMs" instead. OpenStack requires at least one node
per hypervisor type, where Ironic is counted as one type.
8.2.2 VM Preparation
Configure a KVM virtual machine with the following specs:
• At least 8GB RAM. The VM will need RAM to start instances in.
• One NIC: virtio, NAT
• Ubuntu 14.04 Server:
  • User: user/PASSW0RD
  • Set the password for root to the same.
• Post installation:
apt-get update && apt-get -y upgrade && apt-get -y install git
reboot
• Perform the OpenSSH configuration in the section "Ubuntu 14.04 Desktop or Server Installation and Configuration", for convenient shell access to the VM.
8.2.3 "stack" User Setup 1. Log into the VM as root. 2. Do an initial DevStack checkout as root: git clone https://github.com/openstack-dev/devstack 3. Check out the Juno branch: cd devstack && git checkout stable/juno 4. Create a user called stack: tools/create-stack-user.sh • Note that the script sets up stack's home directory as /opt/stack. 5. Set the usual password for that user, for SSH access: passwd stack
8.2.4 DevStack Checkout As User "stack" 1. Log in as stack. 2. Clone DevStack into stack's home directory: git clone https://github.com/openstack-dev/devstack 3. Check out the Juno branch: cd devstack && git checkout stable/juno
8.2.5 local.conf localrc Customisation
The stack.sh script can be tailored by environment variables, which can be overridden by creating /opt/stack/devstack/local.conf and adding a section called localrc as follows:
[[local|localrc]]
VARIABLE=value
The format of local.conf is explained in greater detail at https://github.com/openstack-dev/devstack#local-configuration. Create /opt/stack/devstack/local.conf as follows:
[[local|localrc]]
ADMIN_PASSWORD=Passw0rd
DATABASE_PASSWORD=Passw0rd
MYSQL_PASSWORD=Passw0rd
RABBIT_PASSWORD=Passw0rd
SERVICE_PASSWORD=Passw0rd
SERVICE_TOKEN=T0ken
# Enable Ironic API and Conductor.
enable_service ironic
enable_service ir-api
enable_service ir-cond
# Ironic requires Neutron, so enable that and disable nova-network.
ENABLED_SERVICES=rabbit,mysql,key
ENABLED_SERVICES+=,n-api,n-crt,n-obj,n-cpu,n-cond,n-sch,n-novnc,n-cauth
ENABLED_SERVICES+=,neutron,q-svc,q-agt,q-dhcp,q-l3,q-meta,q-lbaas
ENABLED_SERVICES+=,g-api,g-reg
ENABLED_SERVICES+=,cinder,c-api,c-vol,c-sch,c-bak
ENABLED_SERVICES+=,ironic,ir-api,ir-cond
ENABLED_SERVICES+=,heat,h-api,h-api-cfn,h-api-cw,h-eng
ENABLED_SERVICES+=,horizon
# Create 3 extra VMs to pose as Ironic's baremetal nodes.
IRONIC_VM_COUNT=3
IRONIC_VM_SSH_PORT=22
IRONIC_BAREMETAL_BASIC_OPS=True
# Bare minimums:
IRONIC_VM_SPECS_RAM=1024
IRONIC_VM_SPECS_DISK=10
# Size of the ephemeral partition in GB. Use 0 for no ephemeral partition.
IRONIC_VM_EPHEMERAL_DISK=0
VIRT_DRIVER=ironic
# By default, DevStack creates a 10.0.0.0/24 network for instances.
# If this overlaps with the hosts network, adjust with the following.
NETWORK_GATEWAY=10.1.0.1
FIXED_RANGE=10.1.0.0/24
FIXED_NETWORK_SIZE=256
# Log all output to files.
LOGFILE=$HOME/devstack.log
SCREEN_LOGDIR=$HOME/logs
IRONIC_VM_LOG_DIR=$HOME/ironic-bm-logs
8.2.6 Installation 1. Run the installation script: cd ~stack/devstack ./stack.sh • This will typically take 1.5 to 2 hours to run, depending on the machine.
8.2.7 Horizon Testing If the stack.sh script runs to successful completion, the output will look something like this:
Horizon is now available at http://192.168.122.28/
Keystone is serving at http://192.168.122.28:5000/v2.0/
Examples on using novaclient command line is in exercise.sh
The default users are: admin and demo
The password: Passw0rd
This is your host ip: 192.168.122.28
1. Browse to the provided IP address, e.g. http://192.168.122.28, and log in as admin/Passw0rd.
2. Verify that:
• the System | Hypervisors tab lists the 3 VMs pre-allocated as bare-metal nodes.
• the System | Flavors tab shows baremetal as a possible instance type.
• the System | Images tab lists the Ironic initramfs and kernel images, ir-deploy-pxe_ssh.*.
3. Sign out of Horizon and sign back in as demo/Passw0rd.
4. Under Project | Compute | Images, click Launch, next to the cirros-0.3.2-x86_64-uec image (a minimal Linux system used for testing).
5. In the Launch Instance dialogue, set an Instance Name, leave the Flavor as m1.tiny and click Launch.
6. Once the instance is in the "Running" Power State, note the IP of the instance from the Instances table in the Horizon GUI.
7. SSH into the new instance from the VM that ran stack.sh:
ssh cirros@<instance IP>
• The default password for the cirros account is "cubswin:)".
8. Launch another instance, this time using the "baremetal" flavour, and again, verify that you can SSH in.
8.2.8 Command Line Testing
1. Log in as user stack on the controller (the machine that ran stack.sh).
2. Set up environment variables:
source ~/devstack/openrc
• It's worth familiarising yourself with what gets set, e.g. set | grep OS_
• In the absence of command line arguments to that script, OS_USERNAME will be set to demo.
3. Set up an SSH key pair:
ssh-keygen
• Press Enter to use the default answer for all questions.
4. Add the public key under the name "default" in Nova:
nova keypair-add default --pub-key ~/.ssh/id_rsa.pub
5. List available flavours:
nova flavor-list
• Note that one of the flavours will be "baremetal".
6. List images:
nova image-list
7. Get the ID of the default image:
image=$(nova image-list | egrep "$DEFAULT_IMAGE_NAME"'[^-]' | awk '{ print $2 }')
8. Start a new instance called "testing", using the "default" key:
nova boot --flavor baremetal --image $image --key-name default testing
9. Check on the status of that instance:
nova list
10. When its Power State is "Running", SSH in:
ssh cirros@<instance IP>
• Note that, due to the use of the SSH keypair, a password was not requested.
11. To view the state of baremetal nodes, use the admin user:
source ~/devstack/openrc admin admin
12. Then list the nodes:
stack@ubuntu:~/devstack$ ironic node-list
+--------------------------------------+--------------------------------------+-------------+--------------------+-------------+
| UUID                                 | Instance UUID                        | Power State | Provisioning State | Maintenance |
+--------------------------------------+--------------------------------------+-------------+--------------------+-------------+
| 45810ead-1167-4699-a47b-1efad350412b | b92830ec-dfac-4a25-b7e9-88b6be4453c5 | power on    | active             | False       |
| 28220f63-4204-4776-bffa-7dc26769e8db | 3ce61e9c-d6ad-4af0-95e8-87f52c3013a2 | power on    | active             | False       |
| 0d2ab465-14aa-4a35-a1cd-81dec1f56c8d | None                                 | power off   | None               | False       |
+--------------------------------------+--------------------------------------+-------------+--------------------+-------------+
stack@ubuntu:~/devstack$
8.3 DevStack Multi-Node VMs 8.3.1 Overview This section describes the procedure to set up a multi-node DevStack installation in two virtual machines, featuring both the Ironic (baremetal) and KVM hypervisors. OpenStack requires a minimum of one node per distinct hypervisor type, where Ironic counts as a type. The first VM acts as a combined OpenStack Controller node and OpenStack Networking node. It includes Horizon, Keystone, Glance, Cinder, Heat, Neutron (OpenStack Networking), Ironic (baremetal provisioning) and supporting services. Networking is provided by Neutron rather than the simpler, legacy nova-network service because Ironic requires Neutron.
8.3.2 References
Though far from an exhaustive list, the following links were particularly helpful in debugging this configuration:
• http://pshchelo.bitbucket.org/devstack-with-neutron.html
• https://ask.openstack.org/en/question/30502/can-ping-vm-instance-but-cant-ssh-ssh-command-halts-with-no-output/
• https://ask.openstack.org/en/question/12499/forcing-mtu-to-1400-via-etcneutrondnsmasq-neutronconf-per-daniels/
• http://getcloudify.org/2013/12/23/setting_up_devstack_havana_on_your_local_network.html
• https://gist.github.com/tmartinx/9177697
• https://github.com/osrg/ryu/wiki/RYU-Openstack-Havana-environment-HOWTO
• https://ask.openstack.org/en/question/49985/not-able-to-ssh-vms-on-compute-node-instance-console-is-also-not-available-on-horizon/
8.3.3 VM Preparation
Configure a KVM virtual machine for the Controller with the following specs:
• At least 8GB RAM. The VM will need RAM to start baremetal instances in.
• More than the default amount of disk - say 30GB. The Controller runs Glance and may need quite a lot of disk space for images.
• 2 NICs:
  • First NIC: virtio, NAT
  • Second NIC: virtio, host only, no IPv4/IPv6 configuration.
• Ubuntu 14.04 Server:
  • User: user/PASSW0RD
  • Set the password for root to the same.
• Post installation:
apt-get update && apt-get -y upgrade && apt-get -y install git
reboot
• Perform the OpenSSH configuration in the section "Ubuntu 14.04 Desktop or Server Installation and Configuration", for convenient shell access to the VM.
• Set the host name in /etc/hostname and /etc/hosts to "controller".
Configure a second KVM virtual machine for the Compute node:
• Specifications the same as the first, but 6GB of RAM will be sufficient to run a few m1.small instances in. You may find that a full-blown image like Fedora or Ubuntu won't boot in an m1.tiny or m1.nano flavoured instance.
• Set the host name in /etc/hostname and /etc/hosts to "compute".
To create the two network interfaces, you will need to create Virtual Networks through the virt-manager GUI:
1. In the main window of virt-manager ("Virtual Machine Manager"), right click on the domain "localhost (QEMU)" and select Details from the context menu.
2. Open the Virtual Networks tab and click "+" in the bottom left corner to define a new network. Follow the prompts to set up one NAT network (e.g. called "MgmtNetwork") and one host-only network with no IP (e.g. "DataNetwork"), as required by the NICs detailed above.
3. Open the configuration of each virtual machine, select each NIC from the list of hardware devices and select the "Network source:" to attach to the NIC, e.g. "Virtual network 'MgmtNetwork': NAT".
8.3.4 "stack" User Setup The DevStack installation script, stack.sh, must be run as a user called "stack" with the ability to sudo without a password. DevStack includes a create-stack-user.sh script to set up the stack user. To run that script, do the following: 1. Log into the VM as root. 2. Do an initial DevStack checkout as root: git clone https://github.com/openstack-dev/devstack 3. Check out the Juno branch: cd devstack && git checkout stable/juno 4. Create a user called stack: tools/create-stack-user.sh • Note that the script sets up stack's home directory as /opt/stack. 5. Set the usual password for that user, for SSH access: passwd stack The stack user must be set up on both the controller and the compute node.
8.3.5 DevStack Checkout As User "stack" On both the controller and the compute node, check out the DevStack sources as the stack user: 1. Log in as stack. 2. Clone DevStack into stack's home directory:
git clone https://github.com/openstack-dev/devstack 3. Check out the Juno branch: cd devstack && git checkout stable/juno
8.3.6 File Format
The DevStack installation script, stack.sh, is driven by a configuration file called local.conf, created at the root of the DevStack source tree (in the devstack/ directory). A fully functional, if basic, single-node DevStack installation, with all services enabled, will result from an empty local.conf file. However, if you want to prevent stack.sh from prompting for passwords, then you will want to specify at least those details in local.conf. Note that MySQL installation will fail if you specify a password which is literally "password".
The local.conf file is divided into sections by a header of the form:
[[phase|filename]]
[Section]
where:
• phase is the phase of the installation process, such as local or post-config.
• filename is the name of the configuration file to modify - usually referred to by a variable, e.g. $Q_DHCP_CONF_FILE for the Neutron (Quantum) DHCP configuration. Configuration filenames in local.conf headers do not begin with a /, with the sole exception of /$Q_PLUGIN_CONF_FILE.
• [Section] is the section header of the settings to be modified within that configuration file.
More information is available at https://github.com/openstack-dev/devstack#local-configuration. For a multi-node installation, you will run stack.sh on each node with a different local.conf on each in order to control which services will run. On all nodes, you will want local.conf to include "MULTI_HOST=1", set the HOST_IP variable to be the host's currently configured primary IP address (e.g. the address of eth0), and on any host except the controller, set SERVICE_HOST to be the IP address of the controller.
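As a short illustration of the format (the values here are examples only), a local.conf that sets a password during the localrc phase and then points the Neutron DHCP agent at a custom dnsmasq configuration file during the post-config phase looks like this:
[[local|localrc]]
ADMIN_PASSWORD=Passw0rd

[[post-config|$Q_DHCP_CONF_FILE]]
[DEFAULT]
dnsmasq_config_file = /etc/neutron/dnsmasq-neutron.conf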
8.3.7 Networks
The local.conf files that follow assume the network conditions described in this section. These IP addresses can be changed to suit the specifics of your installation. Note, in particular, that there is very little reference to the physical network that hosts the controller and compute nodes, apart from the values of HOST_IP and SERVICE_HOST in local.conf. The stack.sh script picks up the gateway, routes and subnet mask from the pre-existing host configuration. (For this reason, it is important that the network configuration on the controller, particularly, is good, since it will determine the configuration of Neutron.)
The stack.sh script creates two networks, visible in the Horizon web interface as "public" and "private", with subnets "public-subnet" and "private-subnet", respectively.
Starting with the private network, this is the "fixed network" on which instances communicate with each other. It is described by the following variables in local.conf:
NETWORK_GATEWAY=10.1.0.1
FIXED_RANGE=10.1.0.0/24
FIXED_NETWORK_SIZE=256
The public network is the range of addresses that OpenStack allocates as "floating IPs". Floating IPs can be reserved by a tenant (project) and allocated or reallocated dynamically to instances. They're not visible to the instance itself: the instance's NIC will never show the floating IP if you run "ifconfig". You can't attach an instance's virtual NIC to the public network - the instance will fail to start - irrespective of
whether the instance has a NIC on the private network or not. Instead, floating IPs are "NAT"ed or "masqueraded" by Neutron. The public network is described in local.conf by the following variables:
PUBLIC_NETWORK_GATEWAY=192.168.200.1
FLOATING_RANGE=192.168.200.0/24
Q_FLOATING_ALLOCATION_POOL=start=192.168.200.20,end=192.168.200.100
The Q_FLOATING_ALLOCATION_POOL setting allows you to restrict the range of IP addresses that will be handed out as floating IPs.
We have seen example local.conf files online that purport to configure the public subnet to be the same as the physical network of the Controller and Compute nodes, but have never had any success with these settings. There are also some settings in the Neutron configuration scripts called by stack.sh that appear to be of interest. For future reference, the relevant settings are:
Q_USE_PROVIDERNET_FOR_PUBLIC=True
PUBLIC_PHYSICAL_NETWORK=public
OVS_BRIDGE_MAPPINGS=public:br-ex
In the local.conf files that follow, the public subnet (192.168.200.0/24) is completely distinct from the subnet hosting the Controller and Compute nodes (192.168.100.0/24). However, the Controller, in its capacity as the Network node, forwards traffic between instances and an arbitrary external address. Therefore, to access the instances from another host, it suffices to set up a static route (on that other host) to either the public or private subnet with the Controller address as the gateway.
8.3.8 Controller local.conf
The Controller's /opt/stack/devstack/local.conf is listed below. There are a few points to note:
• Neutron is configured to use the Modular Layer 2 (ML2) plugin, which is the default.
• The Q_ML2_TENANT_NETWORK_TYPE setting configures ML2 to use Virtual Extensible LANs (VXLANs) (http://en.wikipedia.org/wiki/Virtual_Extensible_LAN) for the tenant networks. These are an alternative to VLANs with a less-restricted range of IDs. An alternative would be to use GRE (setting "gre").
• The encapsulation of tenant traffic (whether GRE or VXLAN) causes packet sizes to exceed the default MTU of 1500 on the instances, which causes an SSH connection into an instance to hang at the key exchange phase. This local.conf sets up the Neutron DHCP configuration to override the instance MTU setting to 1400, to avoid that problem.
• LOGDAYS is set to 1 to limit the number of log files retained to one day's worth.
• LOG_COLOR is set to False to keep colour escape sequences out of the logs.
[[local|localrc]]
MULTI_HOST=1
HOST_IP=192.168.100.110
# https://wiki.openstack.org/wiki/Neutron/ML2
FLAT_INTERFACE=eth0
Q_PLUGIN=ml2
Q_ML2_TENANT_NETWORK_TYPE=vxlan
# The default private network for the instances.
NETWORK_GATEWAY=10.1.0.1
FIXED_RANGE=10.1.0.0/24
FIXED_NETWORK_SIZE=256
# Configure the public network.
# Will fail with message about EXT_GW_IP if you don't specify PUBLIC_NETWORK_GATEWAY.
PUBLIC_NETWORK_GATEWAY=192.168.200.1
FLOATING_RANGE=192.168.200.0/24
Q_FLOATING_ALLOCATION_POOL=start=192.168.200.20,end=192.168.200.100
# Services
enable_service ironic
enable_service ir-api
enable_service ir-cond
enable_service rabbit mysql key
enable_service n-api n-crt n-obj n-cpu n-cond n-sch n-novnc n-cauth
enable_service g-api g-reg
enable_service cinder c-api c-vol c-sch c-bak
enable_service horizon
enable_service tempest
disable_service n-net
enable_service neutron
enable_service q-svc
enable_service q-agt
enable_service q-dhcp
enable_service q-l3
enable_service q-meta
# Credentials
DATABASE_PASSWORD=Passw0rd
MYSQL_PASSWORD=Passw0rd
ADMIN_PASSWORD=Passw0rd
SERVICE_PASSWORD=Passw0rd
SERVICE_TOKEN=Passw0rd
RABBIT_PASSWORD=Passw0rd
# Create 3 extra VMs to act as Ironic's baremetal nodes.
IRONIC_VM_COUNT=3
IRONIC_VM_SSH_PORT=22
IRONIC_BAREMETAL_BASIC_OPS=True
# Bare minimums:
IRONIC_VM_SPECS_RAM=1024
IRONIC_VM_SPECS_DISK=10
# Size of the ephemeral partition in GB. Use 0 for no ephemeral partition.
IRONIC_VM_EPHEMERAL_DISK=0
VIRT_DRIVER=ironic
# Enable Logging
LOGFILE=/opt/stack/logs/stack.sh.log
LOGDAYS=1
VERBOSE=True
LOG_COLOR=False
SCREEN_LOGDIR=/opt/stack/logs
[[post-config|/$Q_PLUGIN_CONF_FILE]]
# TBD
# Set the MTU to 1400 on instances to prevent SSH hang on key exchange.
# Requires that you set up /etc/neutron/dnsmasq-neutron.conf.
# https://ask.openstack.org/en/question/30502/can-ping-vm-instance-but-cant-ssh-ssh-command-halts-with-no-output/
# https://ask.openstack.org/en/question/12499/forcing-mtu-to-1400-via-etcneutrondnsmasq-neutronconf-per-daniels/
[[post-config|$Q_DHCP_CONF_FILE]]
[DEFAULT]
dnsmasq_config_file = /etc/neutron/dnsmasq-neutron.conf
8.3.9 Compute Node local.conf
[[local|localrc]]
MULTI_HOST=1
HOST_IP=192.168.100.111
SERVICE_HOST=192.168.100.110
FLAT_INTERFACE=eth0
Q_PLUGIN=ml2
Q_ML2_TENANT_NETWORK_TYPE=vxlan
# TBD: Is it necessary to even specify these network ranges on the Compute node?
# The default private network for the instances.
NETWORK_GATEWAY=10.1.0.1
FIXED_RANGE=10.1.0.0/24
FIXED_NETWORK_SIZE=256
FLOATING_RANGE=192.168.200.0/24
Q_FLOATING_ALLOCATION_POOL=start=192.168.200.20,end=192.168.200.100
DATABASE_TYPE=mysql
MYSQL_HOST=$SERVICE_HOST
RABBIT_HOST=$SERVICE_HOST
GLANCE_HOSTPORT=$SERVICE_HOST:9292
Q_HOST=$SERVICE_HOST
# Services
ENABLED_SERVICES=n-cpu,rabbit,neutron,q-agt #,q-l3
# Credentials
DATABASE_PASSWORD=Passw0rd
MYSQL_PASSWORD=Passw0rd
ADMIN_PASSWORD=Passw0rd
SERVICE_PASSWORD=Passw0rd
SERVICE_TOKEN=Passw0rd
RABBIT_PASSWORD=Passw0rd
# Enable Logging
LOGFILE=/opt/stack/logs/stack.sh.log
LOGDAYS=1
VERBOSE=True
LOG_COLOR=False
SCREEN_LOGDIR=/opt/stack/logs
8.3.10 Controller Network Configuration
1. On the Controller only, configure /etc/network/interfaces to read as follows:
# The loopback network interface
auto lo
iface lo inet loopback
# The primary network interface
auto eth0
iface eth0 inet static
    address 192.168.100.110
    gateway 192.168.100.1
    netmask 255.255.255.0
    dns-nameservers 8.8.8.8 8.8.4.4
auto eth1
iface eth1 inet manual
    up ifconfig $IFACE 0.0.0.0 up
    up ip link set $IFACE promisc on
    down ip link set $IFACE promisc off
    down ifconfig $IFACE down
• Reboot the Controller so that these settings take effect.
• On the Compute node, it is sufficient to simply set a static IP on eth0 (192.168.100.111) and the same gateway and subnet mask as on the Controller.
8.3.11 local.sh File
1. Log in to the Controller as user stack.
2. Check whether bridge br-ex already exists:
sudo ovs-vsctl show | grep br-ex
3. If it doesn't exist, add it:
sudo ovs-vsctl add-br br-ex
4. Create ~/devstack/local.sh with the following contents:
sudo ovs-vsctl --no-wait -- --may-exist add-port br-ex eth1
5. Make the script executable:
chmod +x ~/devstack/local.sh
6. This script should run automatically at the end of stack.sh, but run it now, just to be sure:
cd ~/devstack && ./local.sh
8.3.12 Instance Internet Forwarding
1. On both the Controller and the Compute node, enable IP forwarding and allow the VMs to access the internet:
sudo su
echo 1 > /proc/sys/net/ipv4/ip_forward
echo 1 > /proc/sys/net/ipv4/conf/eth0/proxy_arp
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
2. Also, on both nodes, edit /etc/sysctl.conf and add the following lines:
net.ipv4.conf.eth0.proxy_arp = 1
net.ipv4.ip_forward = 1
8.3.13 Running stack.sh
1. On both the Controller and the Compute nodes, log on as user stack and run the following commands:
cd devstack && ./stack.sh
8.3.14 Access Rules
In order to be able to ping instances and SSH into them, it is necessary to open the firewall to allow this traffic by defining some access rules:
1. Log in to the Controller as user stack.
2. Source the admin user's credentials:
cd devstack && . openrc admin demo
3. Add an access rule to the "default" secgroup to allow pings:
nova secgroup-add-rule default icmp -1 -1 0.0.0.0/0
4. Add an access rule to the "default" secgroup to allow SSH:
nova secgroup-add-rule default tcp 22 22 0.0.0.0/0
5. Log in to the Horizon web interface on the Controller (http://192.168.100.110) as admin (password: Passw0rd), and verify that these two new access rules are listed, by clicking "Manage Rules" for the "default" secgroup under Project -> Compute -> Access & Security -> Security Groups.
8.3.15 SSH Key
Some cloud images, such as CentOS and Ubuntu, don't specify a password for their default user. Instead, they require that you inject an SSH key into the instance when you create it. You then log into the instance as follows, e.g. in the case of Ubuntu:
ssh -i ~/.ssh/key ubuntu@<instance IP>
If only one keypair is registered with the project, then instances will automatically use that when they are created in Horizon, without any manual interaction by the user.
1. Log on to the Controller as user stack.
2. Create a passwordless SSH key:
ssh-keygen -N "" -f ~/.ssh/key
3. Create a new keypair in OpenStack under the name "key":
nova keypair-add --pub-key ~/.ssh/key.pub key
4. Log into Horizon (http://192.168.100.110) as user admin (password: Passw0rd) and verify that the key is listed under Project -> Compute -> Access & Security -> Key Pairs.
8.3.16 Access Rules and SSH Key Script
The following script configures access rules and the keypair in one step:
. /opt/stack/devstack/openrc admin demo
# Open the firewall (secgroup "default") for pings and SSH.
nova secgroup-add-rule default icmp -1 -1 0.0.0.0/0
nova secgroup-add-rule default tcp 22 22 0.0.0.0/0
# Generate an RSA keypair with no password for instance SSH login.
if [ ! -f ~/.ssh/key ]; then
    ssh-keygen -N "" -f ~/.ssh/key
fi
# Register the public key with nova under the name "key" (doesn't have to be the same filename).
# The first key registered will be injected into instances by default.
if ! nova keypair-show key >&/dev/null; then
    nova keypair-add --pub-key ~/.ssh/key.pub key
fi
Note that the script uses the admin user, but sets the tenant (project) to demo. This is because the demo tenant is preconfigured with a router between the public and private networks, and therefore we will use the demo tenant for testing. Beware that access rules added to the demo tenant won't be present in the admin tenant, and vice versa.
8.3.17 Additional Images
If you intend to use CirrOS, then you will need to install the latest, 0.3.3 image of that. In addition, you might want to install images for Ubuntu and CoreOS, among others. This script does that:
. /opt/stack/devstack/openrc admin admin
UBUNTU_URL=http://uec-images.ubuntu.com/releases/14.04/release/ubuntu-14.04-server-cloudimg-amd64-disk1.img
CIRROS_URL=http://download.cirros-cloud.net/0.3.3/cirros-0.3.3-x86_64-disk.img
glance image-create --name "Ubuntu 14.04 - 64-bit" --location $UBUNTU_URL \
    --disk-format qcow2 --container-format bare --is-public true
glance image-create --name "CirrOS 0.3.3 - 64-bit" --location $CIRROS_URL \
    --container-format bare --disk-format qcow2 --is-public true
#-----------------------------------------------------------------------------
COREOS_BASE_URL=http://alpha.release.core-os.net/amd64-usr/current
COREOS_IMAGE_BASENAME=coreos_production_openstack_image.img
COREOS_VERSION=$(curl $COREOS_BASE_URL/version.txt 2>/dev/null | grep COREOS_VERSION_ID | cut -d= -f2)
mkdir -p ~/coreos && cd ~/coreos
wget $COREOS_BASE_URL/$COREOS_IMAGE_BASENAME.bz2
bunzip2 --force $COREOS_IMAGE_BASENAME.bz2
glance image-create --name "CoreOS $COREOS_VERSION" --file $COREOS_IMAGE_BASENAME \
    --container-format ovf --disk-format qcow2 --is-public True
8.3.18 Test Instance
8.3.19 Launching
Create an instance to verify correct operation, as follows:
1. Log into Horizon (http://192.168.100.110) as user admin (password: Passw0rd).
2. In the top left of the main screen, select the tenant (project) as "demo".
3. Under Project -> Network -> Network Topology, note that there are two networks depicted, public and private, with router1 between them. This network structure is created automatically by stack.sh. The router is not present under the admin tenant.
4. Under Project -> Compute, click Images.
5. Next to the image named "CirrOS 0.3.3 - 64-bit", click Launch.
6. On the Details tab, enter the Instance Name "test".
7. Click the Access & Security tab and note that the Key Pair has automatically been set to "key".
8. Click on the Networking tab and note that the private network has already been selected for NIC:1 of the instance.
9. Click Launch on the Launch Instance popup, to start the instance.
10. While the instance is spawning, select Associate Floating IP in the Actions column.
11. Choose a pre-allocated Floating IP. If none is supplied, click "+" to allocate one, and then click Allocate IP in the Allocate Floating IP popup.
12. Click Associate in the Manage Floating IP Associations popup, to associate the Floating IP with the instance. If
the "Port to be associated" is not set, it means you are working under the wrong tenant and there is no router between the public and private networks. 13. If all is correct, you will see that the instance has two IP addresses: one each on the private and public networks. You may need to click "Instances" to refresh the instance table.
8.3.20 Testing
1. Log on to the Controller as user stack.
2. Ping the fixed network gateway:
ping 10.1.0.1
3. Ping the instance's fixed network address:
ping 10.1.0.5
• If this fails, check that you have the right access rules for the tenant where you created the instance (demo).
4. Ping the instance's floating IP:
ping 192.168.200.22
5. SSH in on the private network:
ssh -i ~/.ssh/key cirros@10.1.0.5
• If you omit the SSH key, you can use the password "cubswin:)" for CirrOS.
6. SSH in on the floating IP:
ssh -i ~/.ssh/key cirros@192.168.200.22
8.3.21 External Access
You can access any address on the public network by setting up a static route on the host where you require access. The route uses the Controller as the gateway to the public network:
sudo route add -net 192.168.200.0 netmask 255.255.255.0 gw 192.168.100.110
You can then access the floating IP of the instance (it won't work for the fixed/private IP, though):
ping 192.168.200.22
ssh -i /path/to/key cirros@192.168.200.22
8.3.22 Neutron Network Configuration Although the final OpenStack configuration delivered to DST uses nova-network instead of Neutron for networking, it is useful to document the problems encountered configuring Neutron, in order to save time during any future work that may require its use. The configuration problems with Neutron manifested as either an inability to ping or SSH into instances, or extremely slow network traffic when connecting to an instance via SSH (on the order of several minutes to log in and 15 seconds for each keystroke to be echoed during an interactive login). There were, in fact, multiple layers of configuration issues with the same or similar symptoms. After all of the configuration problems were solved in a test environment comprising VirtualBox virtual machines, and the configuration transferred to the actual blade hardware, the same symptoms reappeared, with a solution specific to the hardware.
8.3.23 Access Rules Out of the box, OpenStack (including DevStack) does not permit instances to be pinged or accept network traffic. They are protected by a firewall, and specific exceptions must be made in the form of access rules. DevStack automatically creates one security group called "default". We configured this group using the custom shell script, access.sh, to allow ICMP pings and SSH into instances. The script reads, in part:
nova secgroup-add-rule default icmp -1 -1 0.0.0.0/0
nova secgroup-add-rule default tcp 22 22 0.0.0.0/0
8.3.24 IP Forwarding
IP forwarding must be enabled on the fixed/private interface, and an additional iptables firewall rule for that interface must be added to the controller (the network node) and all compute nodes. IP forwarding was enabled in /etc/sysctl.conf:
net.ipv4.ip_forward=1
net.ipv4.conf.eth2.proxy_arp=1
The iptables rule was necessary to allow instances to do NAT to access the internet. It was added to /etc/network/interfaces so that it would be applied automatically when the system boots:
auto eth2
iface eth2 inet static
    # ... other configuration elided ...
    post-up iptables -t nat -A POSTROUTING -o $IFACE -j MASQUERADE
8.3.25 Instance MTUs
When the tenant network encapsulates traffic in a protocol like GRE or VXLAN, the extra overhead can cause the packet size to exceed the MTU on the controller (network node). The resulting packet fragmentation causes applications that send large messages (such as SSH key exchange) to hang. This issue can be solved by configuring the DHCP agent to inject a smaller MTU into instances. This MTU needs to be at least 100 bytes smaller than that of the network node (the controller). The DHCP agent's configuration file, /etc/neutron/dhcp_agent.ini, should contain:
[DEFAULT]
dnsmasq_config_file = /etc/neutron/dnsmasq-neutron.conf
The dnsmasq-neutron.conf file is not created by DevStack. You must create it manually, as follows:
mkdir -p /etc/neutron
echo dhcp-option-force=26,1400 >> /etc/neutron/dnsmasq-neutron.conf
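After an instance has been (re)launched and renewed its DHCP lease, the injected MTU can be confirmed from inside the instance; this assumes the instance's first interface is named eth0:
ip link show eth0 | grep mtu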
8.3.26 Instance Operating System DevStack automatically sets up several images for creating instances. One of these is CirrOS, which is a minimal and very lightweight Linux distribution that would seem to be ideal for basic testing of OpenStack. Unfortunately, DevStack installs CirrOS 0.3.2, which does not honour the MTU setting suggested to it by the DHCP server. For the aforementioned MTU setting to work, it is necessary to install a later version of CirrOS (0.3.3) or use a different operating system that doesn't have that problem, e.g. Ubuntu.
8.3.27 Resolving Internet Hosts
For instances to be able to resolve DNS names correctly, it was necessary to override the DNS nameserver used by instances. The default setting configured by DevStack does not work. The following line was added to access.sh:
neutron subnet-update provider_net --dns-nameservers list=true 8.8.8.8 8.8.4.4
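The updated nameservers can be verified with the neutron client; the subnet name provider_net follows the command above and may differ in other deployments:
neutron subnet-show provider_net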
8.3.28 Disable Transmission Checksumming
The procedures above are sufficient to get functioning network connections to instances in a virtual test environment. However, on the DST-provided HS22 blades, a problem with the same symptoms as the MTU issue (slow SSH access) manifests. The MTU of the physical switch (IBM Virtual Fabric) was ruled out as a possible cause; it defaults to jumbo frames (9000 bytes) and doesn't have a setting to disable that. Various Large Segment Offload (http://en.wikipedia.org/wiki/Large_segment_offload) and Large Receive Offload (http://en.wikipedia.org/wiki/Large_receive_offload) technologies, such as TSO, GSO and GRO, were investigated as a possible cause and disabled using the ethtool utility, to no avail. If the problem recurs on different hardware, these may be a profitable avenue of investigation.
In the end, the problem turned out to be a feature of the network interface driver called Transmission Checksumming (TX), which was seeing the encapsulated instance traffic as garbled. The solution was to disable this feature on the management/fixed/private interface of both the network node and the compute nodes, and on the public interface of the network node. This can be done in /etc/network/interfaces as follows:
auto eth2
iface eth2 inet static
    # ... other configuration elided ...
    post-up ethtool -K $IFACE tx off
auto eth3
iface eth3 inet manual
    # ... other configuration elided ...
    post-up ethtool -K $IFACE tx off
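Whether the offload feature is actually disabled can be checked with ethtool once the interfaces are up; the interface names below follow the configuration above:
sudo ethtool -k eth2 | grep -i checksum
sudo ethtool -k eth3 | grep -i checksum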
9 OpenStack Development GUI
9.1 Installation
The GUI has been tested on Fedora 20 Linux and should work on any contemporary Linux system. The GUI is installed as an Eclipse plugin and has been tested with Eclipse Luna. The installation procedure that follows includes instructions on how to install Eclipse Luna. Eclipse requires a Java 1.7 or Java 1.8 runtime environment to run. The recommended installation procedure is as follows:
1. Download Eclipse Luna SR2 (eclipse-java-luna-SR2-linux-gtk-x86_64.tar.gz) from http://eclipse.org/downloads/packages/eclipse-ide-java-developers/lunasr2.
2. Unpack the tarball under your home directory:
tar xvzf ~/Downloads/eclipse-java-luna-SR2-linux-gtk-x86_64.tar.gz -C ~
• Other installation locations are possible, but note that you must have write permission to the directory.
3. Copy the plugin JAR from the delivered software to the Eclipse dropins/ directory:
cp openstack_project/osgui/products/osgui_1.0.0.a.jar ~/eclipse/dropins
4. Copy the SSH key files used to log in to instances for various automation tasks to the user's SSH configuration directory:
cp openstack_project/osgui/products/keys/* ~/.ssh/
5. Copy support files used by the component mechanism to a directory called ".osd" under the user's home directory:
mkdir ~/.osd; cp -r openstack_project/osgui/products/components/ ~/.osd/
6. You may need to interactively configure the use of a JavaSE 1.8 JRE, as follows:
sudo alternatives --config java
• Select the preferred JRE by number and press Enter.
9.2 Configuration
The OpenStack Development GUI must be configured on first use, as follows:
1. Start Eclipse: ~/eclipse/eclipse
• You may need to pass the -clean option on the Eclipse command line when upgrading plugin versions.
2. Choose a Workspace directory in which project files will be stored. The default (~/.workspace) is a reasonable choice.
3. Once Eclipse has fully started, close the Welcome project.
4. To open the OpenStack Development GUI views, from the Eclipse main menu select Window | Show View | Other..., in the list of views, expand the OSGUI folder, shift-click to select all three views - "Components", "Server Details" and "Server List" - and then click OK.
5. Drag the views around the Eclipse window with the mouse and close unwanted views so that the OpenStack Development GUI views are laid out as desired.
9.3 OpenDDS Modeling In order to define dummy component inputs and outputs in terms of datatypes described in an OpenDDS model in XMI format, it is necessary to install the OpenDDS Modeling plugin into Eclipse. More information about OpenDDS Modeling is available at http://www.opendds.org/modeling.html.
9.3.1 OpenDDS Modeling Installation To install the OpenDDS Modeling SDK: 1. In the Eclipse main menu, select Help | Install New Software... 2. To the right of "Work with:", click Add... 3. Enter the name "OpenDDS Modeling" and the Location "http://www.opendds.org/modeling/eclipse", then click OK. 4. Check the box next to OpenDDS Modeling SDK in the list of plugins. Click Next. 5. Click Next again, then select "I accept the terms of the license agreement", then click Finish. 6. Click OK on the Security Warning and finally click Yes to restart Eclipse.
9.3.2 OpenDDS Modeling Project Creation In order to create an OpenDDS Model diagram, it is necessary to create a project as follows: 1. Select File | New | Other..., select OpenDDS Models -> OpenDDS Diagram from the tree and click Next. 2. Enter a Model Name and click Finish. The delivered software provides an example OpenDDS Modeling XMI file at openstack_project/osgui/src/osd/dds/Test.opendds. The corresponding visual representation (diagram) for this model is openstack_project/osgui/src/osd/dds/Test.opendds_diagram.
9.4 Component Hosts Software and dummy components have specific expectations of the host that will run them: • The host must be running Ubuntu 14.04. • It must have sufficient RAM and disk space to install non-trivial software stacks. • It must have a recent OpenJDK JDK package installed, as well as a variety of development tools and OpenSplice DDS.
• It must have the customary "user" account, capable of passwordless sudo and SSH access via the provisioning key. • It must have supporting scripts installed under /opt/components/archive/.
9.5 Firewall Requirements
DDS domain participants send a UDP packet to multicast address 239.255.0.1, port 7400 or 7401, in order to discover each other. DDS messages are sent by UDP on ports in the 74xx range, e.g. 7411 or 7413. For DDS participants on baremetal hosts to establish DDS communications with instances, it is sufficient to add an Ingress OpenStack access rule on UDP port range 7400 to 7450, with the following commands, run as user stack on the OpenStack controller:
. devstack/openrc admin demo
nova secgroup-add-rule default udp 7400 7450 0.0.0.0/0
9.6 Component Host Image Preparation
9.6.1 Software Installation
An image of a standard component host configuration is prepared by setting up an instance with the requisite software packages and then taking a snapshot. Subsequently, new component host instances are started by launching the snapshot. The procedure to prepare the snapshot is as follows:
1. Launch a new Ubuntu 14.04 instance with the "m1.small" flavour using the OSD GUI, so that the "user" account is created automatically.
2. Create an archive of the component support infrastructure by running a script under the OSD source tree:
openstack_project/osd/src/osd/component/archive/prepare.sh
• This will create ~/.osd/components/preparation.tar.gz.
3. Clear out any old known hosts associated with the new instance's IP: ssh-keygen -R 10.33.136.7
4. Copy the archive to the instance:
scp -i ~/.ssh/provisioning \
~/.osd/components/preparation.tar.gz user@10.33.136.7:/tmp
5. SSH in as user: ssh -i ~/.ssh/provisioning user@10.33.136.7
6. Become root: sudo su
7. Unpack the archive to the root directory: tar xvzf /tmp/preparation.tar.gz -C /
8. If updating a host that has already had the component infrastructure installed:
rm /opt/components/archive/installed
9. Run the installation script as root: /opt/components/archive/install.sh
• This may take up to 15 minutes to install all of the software and update the system, if the packages have not been installed before.
Note: If preparing a baremetal component host, see the section on Baremetal Host Configuration below for additional procedures.
9.6.2 Testing
In the OSD GUI, check that the newly prepared instance is listed when you choose a host to install components on. Launch a test component and verify that it can be started, paused and stopped.
Finally, uninstall all components running on the instance prior to imaging, as follows:
1. SSH in as user: ssh -i ~/.ssh/provisioning user@10.33.136.7
2. Uninstall all components:
for C in $(cd /opt/components/archive/run; ls); do /opt/components/archive/component.sh uninstall $C; done
9.6.3 Snapshot Creation
1. Log into an OpenStack host, e.g. the controller: ssh stack@controller
2. Authenticate as admin in the demo project: . devstack/openrc admin demo
3. List available images: glance image-list
4. If desired, delete the old component host image: glance image-delete ubuntu-component-host
• This may cause problems (the instance may shut down and refuse to start again) if the image is referenced by a running instance. It is probably best to wait a few minutes before moving to the next step.
5. List running instances: nova list
6. Take a snapshot of the instance that was prepared as a component host:
nova image-create component-host-preparation ubuntu-component-host
• Here, "component-host-preparation" is the instance name and "ubuntu-component-host" is the name of the snapshot that will be created.
• The progress of image creation can be observed by doing: watch glance image-list
7. If desired, download the snapshot as a local image file:
glance image-download ubuntu-component-host --file component-host.img
9.6.4 Baremetal Host Configuration
The software installation procedure for components on baremetal is the same as that for instances detailed in the Software Installation section, above.
In order for components running on baremetal hosts to communicate by DDS with components running on instances, they need to be in the same broadcast domain, which means that the baremetal host needs an interface on the same subnet as the public interface of the instances. When using OpenStack Networking (Neutron), the public interface on the host must be assigned an IP address on the public network. This is done by adding the following line to /etc/rc.local on baremetal component hosts:
/opt/baremetal/public_interface.sh
The public_interface.sh script assigns eth3 an IP address computed automatically from the IP address that DHCP assigned to eth2. When using OpenStack legacy networking (nova-network) in a single-interface configuration, public_interface.sh does not need to be run.
The DDS configuration also needs to be adjusted to ensure that the correct physical interface is used for DDS traffic. When using Neutron, that will be eth3. When using nova-network in a single-interface configuration, it will be eth2. The file /opt/baremetal/ospl.xml contains the OpenSplice configuration for the nova-network case. The relevant settings are described in the "OpenSplice DDS Version 6.x Deployment Guide". The procedure for updating the DDS configuration is as follows:
1. Ensure that OpenSplice 6.4 has been installed under /opt/OpenSplice-6.4, as described above.
2. As user "root", copy a custom OpenSplice configuration into place:
cp /opt/baremetal/ospl.xml \
/opt/OpenSplice-6.4/HDE/x86_64.linux/etc/config/ospl.xml
Note: ospl.xml and public_interface.sh are found in the source code repository under /openstack_project/osgui/src/osd/component/archive/preparation/opt/baremetal/.
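The contents of public_interface.sh are not reproduced in this report. Purely as an illustration of the approach described above, a script of this kind might derive eth3's address on the public network from the last octet of the address that DHCP assigned to eth2; the subnet and the mapping below are assumptions, not the delivered script:
#!/bin/sh
# Illustrative sketch only: give eth3 an address on the public subnet
# derived from the last octet of the DHCP-assigned address on eth2.
ETH2_IP=$(ip -4 addr show eth2 | awk '/inet /{print $2}' | cut -d/ -f1)
LAST_OCTET=${ETH2_IP##*.}
ip addr add 192.168.200.${LAST_OCTET}/24 dev eth3
ip link set eth3 up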
10 PCI Passthrough PCI passthrough provides the highest possible performance for accessing a GPU from within a virtual machine, and as such is considered to be the ideal. DST indicated a strong interest particularly in PCI passthrough over the other options. It requires hardware support on the host and there are limitations on hypervisor type when used in an OpenStack environment (discussed below). For completeness, the other commonly used option to access a GPU from within a VM is known as API remoting, where the GPGPU API is split into front-end and back-end components. The front-end runs in the VM and emulates the GPGPU API. It communicates requests and results with the back-end, via shared memory or a network interface. Bandwidth between the front-end and back-end can be a significant determining factor in the efficacy of this technique. Related research into API remoting and PCI passthrough for various hypervisors can be found at [21].
10.1 Installation Procedures The general procedure for configuring PCI passthrough in OpenStack is described at [22]. The procedures documented below reiterate that procedure tailored to the specific hardware provided by DST, and contain some additional required steps that are not mentioned in the official documentation. At the time of writing, OpenStack supports PCI passthrough for Xen and KVM hypervisors, but not for VMware vCenter.
10.2 Host Preparation To do passthrough of PCI devices to an instance, the host's IOMMU must be enabled. The feature is called "Intel VT-d" or "AMD-Vi", for Intel and AMD chipsets, respectively. On the HS22 blades, VT-d is enabled through the UEFI configuration utility: 1. Reboot the host. 2. Press F1 to enter the UEFI configuration utility. 3. Navigate to System Settings -> Processors -> VT-d and select Enable, then press Esc, Esc, Esc and Y to save the updated settings.
10.3 KVM Host Configuration
To prepare an OpenStack compute node running the KVM hypervisor to do PCI passthrough, you first follow the procedures to enable PCI passthrough in KVM, as described at [23]. For an Ubuntu 14.04 host on an HS22 blade, the following procedure will suffice:
1. Enable the IOMMU by passing the "intel_iommu=on" parameter to the kernel, as follows:
   1. Edit /etc/default/grub and add "intel_iommu=on" to the "GRUB_CMDLINE_LINUX=" setting.
   2. Run update-grub and then reboot the host.
   3. After rebooting, check that the expected parameter value is in /proc/cmdline: cat /proc/cmdline
   4. Look for the message "Intel-IOMMU: enabled" in the output of dmesg: dmesg | grep -iE 'dma|iommu'
2. Unbind the GPU devices from the host using the pci_stub module. Also, although not mentioned in the KVM documentation, you must disable the .1 function of the device - an audio device that will prevent OpenStack from allocating the GPU to an instance. To do this, add the following to /etc/rc.local and reboot the host again:
# Detach the GPU from the host.
/sbin/modprobe pci_stub
echo "10de 06d2" > /sys/bus/pci/drivers/pci-stub/new_id
echo '0000:1b:00.0' > '/sys/bus/pci/devices/0000:1b:00.0/driver/unbind'
echo '0000:1c:00.0' > '/sys/bus/pci/devices/0000:1c:00.0/driver/unbind'
echo '0000:1b:00.0' > /sys/bus/pci/drivers/pci-stub/bind
echo '0000:1c:00.0' > /sys/bus/pci/drivers/pci-stub/bind
# Hide the .1 (audio) function from OpenStack.
echo 1 > '/sys/bus/pci/devices/0000:1b:00.1/remove'
echo 1 > '/sys/bus/pci/devices/0000:1c:00.1/remove'
The Bus/Device/Function (BDF) numbers for the GPUs used above can be seen in the output of lspci -nn | grep -i nvidia:
1b:00.0 3D controller [0302]: NVIDIA Corporation GF100GL [Tesla M2070] [10de:06d2] (rev a3)
1c:00.0 3D controller [0302]: NVIDIA Corporation GF100GL [Tesla M2070] [10de:06d2] (rev a3)
As can be seen in the output above, the PCI vendor and product IDs of the GPUs are also listed; the vendor ID of NVIDIA is 10de and the product ID for the Tesla M2070 is 06d2. Note that these identifiers are always hexadecimal numbers.
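After the reboot it is worth confirming that the pci-stub driver has claimed both GPU functions and that the audio functions have been removed; a minimal check, using the BDF numbers shown above:
lspci -nnk -s 1b:00.0 | grep -i 'driver in use'
lspci -nnk -s 1c:00.0 | grep -i 'driver in use'
ls /sys/bus/pci/drivers/pci-stub/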
10.4 OpenStack Nova Compute Configuration
OpenStack's PCI passthrough functionality is enabled by configuring the pci_alias and pci_passthrough_whitelist settings (described at [24]) in the [DEFAULT] section of /etc/nova/nova.conf. Also, various filters must be enabled in the same section. Different settings must be configured on different hosts, depending on the services that are running.
On the controller, /etc/nova/nova.conf must contain:
[DEFAULT]
pci_alias = { "name": "gpu", "vendor_id": "10de", "product_id": "06d2" }
This setting ascribes an arbitrary name to a specific combination of vendor and product identifiers corresponding to the PCI device to be passed through to instances. In order to spawn an instance with an attached PCI device, a flavor will be defined that references this alias and hence the specific required hardware.
On the scheduler node (the node that runs nova-scheduler), which will typically be the controller also, /etc/nova/nova.conf must list PciPassthroughFilter among the scheduling filters that select where to place an instance:
[DEFAULT]
# ...
scheduler_available_filters = nova.scheduler.filters.all_filters
scheduler_available_filters = nova.scheduler.filters.pci_passthrough_filter.PciPassthroughFilter
scheduler_default_filters = RamFilter,ComputeFilter,AvailabilityZoneFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,PciPassthroughFilter
On any compute node (running nova-compute), /etc/nova/nova.conf must contain:
[DEFAULT]
pci_passthrough_whitelist = [{ "vendor_id": "10de", "product_id": "06d2" }]
This tells the compute node that any device with the specified PCI vendor and product ID is available for passthrough to an instance.
When configuring these settings using DevStack, note that you must escape the quotes around strings in JSON syntax. Because of the way DevStack evaluates configuration file insertions, you must use doubled backslashes for this. On the Controller, the required snippet of local.conf would look like this:
[[post-config|$NOVA_CONF]]
[DEFAULT]
pci_passthrough_whitelist=[{ \\"vendor_id\\": \\"10de\\", \\"product_id\\": \\"06d2\\" }]
pci_alias={ \\"name\\": \\"gpu\\", \\"vendor_id\\": \\"10de\\", \\"product_id\\": \\"06d2\\" }
scheduler_driver=nova.scheduler.filter_scheduler.FilterScheduler
scheduler_available_filters=nova.scheduler.filters.all_filters
scheduler_available_filters=nova.scheduler.filters.pci_passthrough_filter.PciPassthroughFilter
scheduler_default_filters=RamFilter,ComputeFilter,AvailabilityZoneFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,PciPassthroughFilter
10.5 OpenStack Flavor Creation
OpenStack flavors represent specific combinations of hardware resources, and this concept extends to passed-through PCI devices. The previously defined "alias" is used to reference a specific PCI device by its vendor and product IDs. (You can have more than one alias if there are several device types.) Since each blade has two Tesla M2070 GPUs, we define one flavor to request that a single GPU be attached to an instance, and a second flavor for attaching both GPUs:
# Create a flavor based on m1.small, with 1 GPU.
nova flavor-create m1.small.1-gpu 10 2048 20 1
nova flavor-key m1.small.1-gpu set "pci_passthrough:alias"="gpu:1"
# Create a flavor based on m1.small, with 2 GPUs.
nova flavor-create m1.small.2-gpu 11 2048 20 1
nova flavor-key m1.small.2-gpu set "pci_passthrough:alias"="gpu:2"
Testing confirmed that a given compute node could run one m1.small.2-gpu instance or two m1.small.1-gpu instances.
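To exercise a flavor end to end, an instance can be booted against it and then inspected; the image name below is only a placeholder for whatever Ubuntu image has been registered in Glance:
. devstack/openrc admin demo
nova boot --flavor m1.small.1-gpu --image ubuntu-14.04 --key-name key gpu-test
nova show gpu-test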
10.6 Instance Kernel
OpenStack cloud images feature a stripped-down kernel that doesn't include the "drm" driver module. The lack of this module prevents the NVIDIA driver module from loading in the instance. Therefore, when running a Linux instance with a passed-through NVIDIA GPU, it is necessary to install a full kernel. For Ubuntu 14.04, the kernel is installed as follows:
sudo apt-get install -y linux-generic
sudo reboot
10.7 Testing
PCI passthrough devices in instances were tested using NVIDIA CUDA 7 sample programs and Roy Longbottom's GPGPU benchmarks. PCI passthrough instances could not be started on VMware (as expected). On KVM compute nodes, instances with passed-through GPUs could be spawned and the GPUs were correctly listed in the output of lspci. The NVIDIA driver modules (nvidia and nvidia_uvm) could be loaded, but none of the CUDA samples or benchmarks would run. The output of the deviceQuery sample was:
cudaGetDeviceCount returned 10 -> invalid device ordinal
[deviceQuery] test results... FAILED
We tried fully powering off the blade to reset the GPUs (which helped for baremetal GPU access) and then starting the instance from the previously-saved snapshot to see if it would have any effect on the availability of the passed-through GPU, but it did not.
An internet search of related work eventually revealed two particularly relevant articles. Firstly, at [25], the author eventually managed to get PCI passthrough working with an IBM HS22 blade and an NVIDIA Tesla M2070-Q. However, that required updated firmware from NVIDIA. It doesn't appear to be possible to download firmware for the M2070 on the NVIDIA site. Nor is there any firmware for the GPU Expansion Blades available on IBM's support site. Secondly, at [26], there is a statement that NVIDIA only (officially) supports PCI passthrough on Quadro (-Q devices such as the M2070-Q) and GRID devices. Given this fact and the results of testing, we did not continue trying to get the Tesla M2070 GPUs to work in KVM instances.
We did re-test PCI passthrough on VMware guests started through vSphere Web Client (rather than OpenStack) and had the same results; the PCI devices appeared to be available and were listed by lspci but the NVIDIA drivers could not make use of them.
10.8 GPGPU Benchmarks
10.8.1 Ubuntu 14.04 Installation
This procedure is informed by [27]:
1. Verify the presence of a CUDA-compatible GPU: lspci | grep -i nvidia
2. Download the "network installer" package, cuda-repo-ubuntu1404_7.0-28_amd64.deb (10K bytes). This is a repository definition.
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_7.0-28_amd64.deb -O cuda-repo-ubuntu1404_7.0-28_amd64.deb
3. Install the package: sudo dpkg -i ./cuda-repo-ubuntu1404_7.0-28_amd64.deb
4. Update the package list and install the cuda package: sudo apt-get update; sudo apt-get install -y cuda
5. Create /etc/profile.d/cuda.sh with the following contents:
export PATH=/usr/local/cuda-7.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-7.0/lib64:$LD_LIBRARY_PATH
6. Reboot the system. Installation of CUDA blacklists the nouveau driver, and rebooting will unload it.
sudo reboot
10.8.2 Initial Testing 1. Check the CUDA driver version: cat /proc/driver/nvidia/version 2. Using the major version number from the output of the preceding command, list more information about the nvidia module: /sbin/modinfo nvidia_346 3. Become root: sudo su 4. Install and compile sample applications: cuda-install-samples-7.0.sh /opt cd /opt/NVIDIA_CUDA-7.0_Samples/ make -j 16 5. Run the deviceQuery sample application: cd /opt/NVIDIA_CUDA-7.0_Samples/bin/x86_64/linux/release; ./deviceQuery 6. Run the bandwidthTest sample: ./bandwidthTest
10.9 Roy Longbottom's CUDA Mflops Benchmark
Roy Longbottom's Linux CUDA Mflops benchmark is described at [28] and [29].
1. Download the sources: wget http://www.roylongbottom.org.uk/linux_cuda_mflops.tar.gz -O linux_cuda_mflops.tar.gz
2. Extract: tar xvzf linux_cuda_mflops.tar.gz
3. Change to the 64-bit binaries directory: cd linux_cuda_mflops/bin_64
4. If you plan on running the benchmarks as root, set LD_LIBRARY_PATH to include the current directory: export LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH
5. Run the single-precision benchmark, and save the results file: ./cudamflops64SP; mv CudaLog.txt baremetal_64_sp.txt
6. Run the double-precision benchmark, and save the results file: ./cudamflops64DP; mv CudaLog.txt baremetal_64_dp.txt
10.10 NVIDIA bandwidthTest Sample 1. Use the NVIDIA bandwidthTest sample program to measure bandwidth in transfers between the device and the host: cd /opt/NVIDIA_CUDA-7.0_Samples/bin/x86_64/linux/release ./bandwidthTest > baremetal_bandwidth.txt
11 Hypervisor Suitability Comparison
11.1 Phoronix Test Suite Installation
On any machine under test, Phoronix Test Suite was installed and configured as follows:
1. Install the package: apt-get install phoronix-test-suite
2. Run a phoronix-test-suite command, e.g.: phoronix-test-suite list-tests
• This will trigger an interactive acceptance of the license terms, which would otherwise stall scripted runs. Enter y to accept the license and n (twice) to decline sharing the results.
3. Configure batch invocation of benchmarks by running the following command and answering the questions as shown below:
phoronix-test-suite batch-setup
These are the default configuration options for when running the Phoronix Test Suite in a batch mode (i.e. running phoronix-test-suite batch-benchmark universe). Running in a batch mode is designed to be as autonomous as possible, except for where you'd like any end-user interaction.
Save test results when in batch mode (Y/n): y
Open the web browser automatically when in batch mode (y/N): n
Auto upload the results to OpenBenchmarking.org (Y/n): n
Prompt for test identifier (Y/n): n
Prompt for test description (Y/n): n
Prompt for saved results file-name (Y/n): n
Run all test options (Y/n): y
Batch settings saved.
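Once batch mode has been configured, individual benchmarks can be run unattended; the test name below is only an example of the syntax, not a prescribed workload:
phoronix-test-suite batch-benchmark pts/compress-7zip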
11.2 Cyclictest Installation
Cyclictest is installed on Ubuntu from the rt-tests package:
sudo apt-get install -y rt-tests
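The exact cyclictest invocation used for the comparison is not reproduced in this report; as an indicative example only, a short latency measurement with a handful of real-time threads looks like this:
sudo cyclictest -t 4 -p 80 -n -i 1000 -l 100000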
11.2.1 Iperf Installation
The iperf benchmark and scripts to run the client and server were installed as follows:
1. Install the iperf package: sudo apt-get install -y iperf
2. Copy the custom client-side benchmark runner scripts to the VM or baremetal host under test, e.g.:
scp -i ~/.ssh/provisioning \
openstack_project/Benchmarks/System/iperf_???.sh user@<host-under-test>:
3. Copy the custom server-side benchmark runner scripts to the Clonezilla host:
scp -i ~/.ssh/provisioning \
openstack_project/Benchmarks/System/iperf_*_server.sh \
user@<clonezilla-host>:
In order to enable TCP and UDP communication between the iperf server on the Clonezilla host and virtual machine instances, two additional access (firewall) rules were configured within OpenStack Horizon, for the demo tenant. These were added to the default security group. The rules were:
• Ingress, port 5001/TCP, address 0.0.0.0/0 (i.e. any). • Ingress, port 5001/UDP, address 0.0.0.0/0 (i.e. any).
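If preferred, the same two rules can be added from the command line with the nova client used elsewhere in this guide, rather than through Horizon:
. devstack/openrc admin demo
nova secgroup-add-rule default tcp 5001 5001 0.0.0.0/0
nova secgroup-add-rule default udp 5001 5001 0.0.0.0/0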
11.2.2 Pre-Benchmark Tuning
Initial benchmark results showed that with default settings on baremetal, the physical machine was unable to achieve even 10% of the capacity of the 10Gb/s interface in UDP. The UDP bandwidth was about 810Mb/s. Above 2Gb/s, the error rate gradually increased to tens of percentage points such that the overall bandwidth was even less. Virtual machine results reflected this limitation of the host by showing essentially identical performance. In order to get more meaningful results, both the host and the VMs were tuned in the following ways, as suggested at http://datatag.web.cern.ch/datatag/howto/tcp.html:
• Receive and send buffer sizes were increased by about a factor of 10:
sysctl -w net.core.rmem_max=26214400
sysctl -w net.core.wmem_max=26214400
• The transmit queue size for the network interface was increased:
ifconfig eth2 txqueuelen 2000
• The receive buffer size and UDP packet size were increased accordingly by adding the following arguments to the iperf command line:
--window 25M --len 32768
Of these two arguments, the more important by far is the --len setting, which sends much larger UDP packets than the default (1470 bytes).
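The custom runner scripts are not reproduced here. Purely as an illustration, a manual UDP run incorporating the arguments above would look something like the following, with the server on the Clonezilla host and the client on the host under test (the target bandwidth and duration are arbitrary, and <clonezilla-host> is a placeholder):
# On the Clonezilla host (server side):
iperf -s -u --window 25M --len 32768
# On the host under test (client side):
iperf -c <clonezilla-host> -u -b 5000M --window 25M --len 32768 -t 30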
Appendix A: Troubleshooting
A.1: Hardware
A.1.1: HS22 BladeCenter FW/BIOS ROM Corruption
One of the blades showed the following status in the AMM: "FW/BIOS, firmware progress (ABR Status) FW/BIOS ROM corruption" [30]. The suggested resolution is to download new firmware appropriate to the blade, HS22 (Type 7870), from [31].
A.1.2: UEFI Hang
HS22 (Type 7870) may hang indefinitely with the message "UEFI platform initialization", as described at [32]. The suggested workaround is to remove the blade from the chassis, remove its battery for 30 seconds and then put everything back into place. It is also recommended to upgrade the IMM to the latest version available for that blade type, from [33]. Note that the listed prerequisite IMM version will be the earliest version known to work with the firmware, but that can cause problems. You should download the most recent IMM version and install that as the prerequisite.
A.1.3: IO Modules (IOMs) Not Detected
If IOMs are not detected even though they have been installed, there is likely to be a fault in either the blade or the IOM. In this case the recommended action is to replace that blade.
A.2: DevStack
A.2.1: Stack.sh Common Issues
Retry
If the stack.sh script fails and you want to retry after making configuration changes, you may want to run unstack.sh, which resides in the same directory as stack.sh; it stops any OpenStack services that have been started so far.
Oslo.db version requirements incorrect
If stack.sh fails with an error message about oslo.db version requirements then, per the advice at https://bugs.launchpad.net/oslo.db/+bug/1402312, do:
sudo pip uninstall oslo.db
sudo pip install oslo.db
./unstack.sh
./stack.sh
Horizon Unauthorised If a Horizon login fails with the message "Unauthorised at /project/" and a page full of stack traces, then clear cookies and other browser data since the last successful Horizon login and try again.
A.2.2: Networking
MTU Fix
As previously noted, when the tenant network encapsulates traffic in a protocol like GRE or VXLAN, the extra overhead can cause the packet size to exceed the MTU on the Controller. The resulting packet fragmentation causes applications that send large messages (such as SSH key exchange) to hang. The local.conf file on the Controller includes a reference to /etc/neutron/dnsmasq-neutron.conf that injects a smaller MTU into instances than that which exists on the Network node (the Controller). This file must be created, just one time, as follows:
1. Log on to the Controller as user stack.
2. Create /etc/neutron/dnsmasq-neutron.conf:
sudo su
mkdir -p /etc/neutron
echo dhcp-option-force=26,1400 >> /etc/neutron/dnsmasq-neutron.conf
3. Verify the contents of the file: cat /etc/neutron/dnsmasq-neutron.conf
Note, however, that CirrOS 0.3.2 (the default CirrOS image in OpenStack Juno) does not honour the MTU setting suggested to it by the DHCP server. A later section describes the procedure for loading the image for CirrOS 0.3.3, which does not have that problem. Alternatively, you can use a different operating system for the instance, e.g. Ubuntu, which also does not have the problem.
A.3: GPGPU
A.3.1: Benchmarking
deviceQuery program hangs
If the deviceQuery sample program hangs, the GPUs most likely need to be fully powered off for a while to fully reset their state. Just rebooting the host is not enough. So, fully power off the host system for a minute, then start it and test it again.
Prior to running the deviceQuery sample program:
• The output of lsmod | grep nvidia will show just a kernel module called nvidia, in use by a module named drm.
• There will be only a single GPU-related device: /dev/nvidiactl
After successfully running the deviceQuery sample program (which will show both Tesla M2070 GPUs):
• The output of ls -la /dev/nvidia* will be:
crw-rw-rw- 1 root root 195,   0 Apr 14 16:55 /dev/nvidia0
crw-rw-rw- 1 root root 195,   1 Apr 14 16:55 /dev/nvidia1
crw-rw-rw- 1 root root 195, 255 Apr 14 16:54 /dev/nvidiactl
crw-rw-rw- 1 root root 249,   0 Apr 14 16:55 /dev/nvidia-uvm
• There will be two loaded modules with nvidia in their name (lsmod | grep nvidia):
nvidia_uvm            67139  0
nvidia              8370760  1 nvidia_uvm
drm                  302817  1 nvidia
• There will be an information pseudo-file for each GPU that did not exist before (note: ':' must be escaped in the filename):
cat /proc/driver/nvidia/gpus/0000\:1b\:00.0/information
Appendix B: Glossary
Abbreviation    Meaning
KVM             Kernel-based Virtual Machine
AMM             Advanced Management Module (for BladeCenter)
UEFI            Unified Extensible Firmware Interface
BIOS            Basic Input/Output System
IMM             Integrated Management Module (for BladeCenter)
VPD             Vital Product Data
NIC             Network Interface Card
CFFh            Combination Form Factor horizontal
SFP             Small Form-factor Pluggable transceiver
TFTP            Trivial File Transfer Protocol
ISCLI           Industry Standard Command Line Interface
VFSM            Virtual Fabric Switch Module
CLI             Command Line Interface
VLAN            Virtual Local Area Network
PVID            Port VLAN ID
DRBL            Diskless Remote Boot in Linux
MAC             Media Access Control (address)
IP              Internet Protocol (address)
DHCP            Dynamic Host Configuration Protocol
SOL             Serial Over LAN
RAID            Redundant Array of Inexpensive/Independent Disks
PXE             Preboot eXecution Environment
SSH             Secure Shell
DNS             Domain Name Service
DRS             Distributed Resource Scheduler
ML2             Modular Layer 2
GPGPU           General-Purpose computing on Graphics Processing Units
QEMU            Quick Emulator
LXC             Linux Container
LBaaS           Load Balancer as a Service
API             Application Programming Interface
Acknowledgement
This research was performed under contract to the Defence Science and Technology (DST) Group Maritime Division, Australia. David Silver worked as a contract researcher and programmer on this project. Ben Ramsey was a research assistant (and programmer) on this project. Ali Babar was the lead researcher on this project.
Appendix C: References
[1] OpenStack. (2015, July) Releases - OpenStack. [Online]. http://wiki.openstack.org/releases
[2] Linux Foundation. (2015) The Xen Project, the powerful open source industry standard for virtualization. [Online]. http://www.xenproject.org
[3] Fabrice Bellard. (2015) QEMU. [Online]. http://wiki.qemu.org/Main_Page
[4] VMware. (2015) vSphere ESXi Bare-Metal Hypervisor | United States. [Online]. https://www.vmware.com/products/esxi-and-esx/overview
[5] Canonical Ltd. (2015) Linux Containers - LXC - Introduction. [Online]. http://www.linuxcontainers.org/lxc/introduction
[6] Canonical Ltd. (2015) Linux Containers - LXD - Introduction. [Online]. http://www.linuxcontainers.org/lxd/introduction
[7] HAProxy. (2015) HAProxy - The Reliable, High Performance TCP/HTTP Load Balancer. [Online]. http://www.haproxy.org
[8] IBM. (2015, May) IBM Support. [Online]. http://www-933.ibm.com/support/fixcentral/systemx/selectFixes?parent=BladeCenter%2BHS22&product=ibm/systemx/7870&&platform=All&function=all
[9] IBM. (2013, Jan) IBM Help System. [Online]. https://publib.boulder.ibm.com/infocenter/bladectr/documentation/index.jsp?topic=/com.ibm.bladecenter.hs22.doc/dw1iv_r_parts_listing_HS22.html
[10] IBM. (2013, June) IBM Redbooks. [Online]. http://www.redbooks.ibm.com/abstracts/tips0728.html
[11] IBM. (2013, Dec) IBM Help System. [Online]. https://publib.boulder.ibm.com/infocenter/bladectr/documentation/index.jsp?topic=/com.ibm.bladecenter.hs22.doc/dw1iu_t_installing_the_expansion_unit.html
[12] IBM. (2014, April) IBM Redbooks. [Online]. http://www.redbooks.ibm.com/abstracts/tips0708.html
[13] IBM. (2013, Dec) IBM Docs. [Online]. http://public.dhe.ibm.com/systems/support/system_x_pdf/vfsm_ag_7_8.pdf
[14] IBM. (2014, Jan) IBM BladeCenter Virtual Fabric 10Gb Ethernet Switch Module documentation. [Online]. https://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5091835
[15] IBM. (2013, Jan) IBM Virtual Fabric 10Gb ESM Ethernet Switch for IBM. [Online]. https://delivery04.dhe.ibm.com/sar/CMA/XSA/03kg0/0/ibm_fw_bcsw_24-10g-7.6.1.0_anyos_noarch.txt
[16] IBM. (2013, Jan) IBM Support: Fix Central. [Online]. http://www-933.ibm.com/support/fixcentral/systemx/downloadFixes?parent=BladeCenter%2BH%2BChassis&product=ibm/systemx/8852&&platform=NONE&function=fixId&fixids=ibm_fw_bcsw_24-10g-7.6.1.0_anyos_noarch&includeRequisites=1&includeSupersedes=0&downloadMethod=http
[17] IBM. (2013, Dec) IBM Support. [Online]. http://public.dhe.ibm.com/systems/support/system_x_pdf/vfsm_is_7_8.pdf
[18] IBM. (2013, Dec) IBM Support. [Online]. http://public.dhe.ibm.com/systems/support/system_x_pdf/vfsm_ag_7_8.pdf
[19] DRBL. DRBL - FAQ/Q&A. [Online]. http://drbl.org/faq/fineprint.php?fullmode=1&path=./2_System#50_class_B_clients.faq
[20] VMware. my VMware. [Online]. https://my.vmware.com/group/vmware/evalcenter?p=free-esxi5
[21] VMware. (2005) vSphere Documentation Center. [Online]. https://pubs.vmware.com/vsphere-50/index.jsp?topic=%2Fcom.vmware.vsphere.upgrade.doc_50%2FGUID-17862A54-C1D4-47A9-88AA-2A1A32602BC6.html
[22] VMware. (2007) Virtual Networking Concepts. [Online]. http://www.vmware.com/files/pdf/virtual_networking_concepts.pdf
[23] VMware. (2014) vSphere Networking. [Online]. http://pubs.vmware.com/vsphere-55/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-552-networking-guide.pdf
[24] OpenStack. (2015) VMware vSphere - OpenStack Configuration Reference Kilo. [Online]. http://docs.openstack.org/kilo/config-reference/content/vmware.html
[25] VMware. (2015) VMware vSphere 6.0 Documentation Center. [Online]. http://pubs.vmware.com/vsphere-60/index.jsp?topic=%2Fcom.vmware.vcli.ref.doc%2Fvcli-right.html
[26] OpenStack. (2015) stack.sh. [Online]. http://docs.openstack.org/developer/devstack/stack.sh.html
[27] John Paul Walters et al., "GPU Passthrough Performance: A Comparison of KVM, Xen, VMWare ESXi, and LXC for CUDA and OpenCL Applications," IEEE 7th International Conference on Cloud Computing, 2014.
[28] OpenStack. (2015) PCI Passthrough - OpenStack. [Online]. https://wiki.openstack.org/wiki/Pci_passthrough
[29] Red Hat. (2015, May) How to assign devices with VT-d in KVM. [Online]. http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM
[30] OpenStack. (2014) nova.conf - configuration options - OpenStack Configuration Reference. [Online]. http://docs.openstack.org/juno/config-reference/content/list-of-compute-config-options.html
[31] Alexander Ervik Johnsen. (2012, Mar) How to get IBM HS 22 with GPU Expansion Blade to work with XenServer 6.x GPU Passthrough. [Online]. http://www.ervik.as/citrix/xenserver/3692-how-to-get-ibm-hs-22-with-gpu-expansion-blade-to-work-with-xenserver-6x-gpu-passthrough
[32] Matt Bach. (2014) Multi-headed VMWare Gaming Setup. [Online]. https://www.pugetsystems.com/labs/articles/Multi-headed-VMWare-Gaming-Setup-564/
[33] NVIDIA. (2015, Mar) CUDA Getting Started Guide for Linux. [Online]. http://developer.download.nvidia.com/compute/cuda/7_0/Prod/doc/CUDA_Getting_Started_Linux.pdf
[34] Roy Longbottom. (2015, Jan) Linux CUDA GPU Parallel Computing Benchmarks. [Online]. http://www.roylongbottom.org.uk/linux_cuda_mflops.htm
[35] Roy Longbottom. (2014, Oct) CUDA GPU Double Precision Benchmark. [Online]. http://www.roylongbottom.org.uk/cuda2.htm
[36] IBM. (2011) Blade hangs on boot and "FW/BIOS, firmware progress (ABR Status) FW/BIOS ROM corruption" message in AMM - IBM BladeCenter HS22, HS22V. [Online]. https://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5086680&brandind=5000020
[37] IBM. (2015) IBM Support: Fix Central. [Online]. http://www-933.ibm.com/support/fixcentral/systemx/selectFixes?parent=BladeCenter%2BHS22&product=ibm/systemx/7870&&platform=All&function=all
[38] IBM. (2010) Hangs at "UEFI platform initialization" after backflash - IBM BladeCenter HS22. [Online]. https://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5084256&brandind=5000020
[39] IBM. (2015) Select fixes BladeCenter HS22, 7870 (All platforms). [Online]. http://www-933.ibm.com/support/fixcentral/systemx/selectFixes?parent=BladeCenter%2BHS22&product=ibm/systemx/7870&&platform=All&function=all
[40] OpenStack. (2015, April) OpenStack Documentation. [Online]. http://docs.openstack.org/juno/install-guide/install/apt/content/ch_overview.html#architecture_conceptual-architecture
[41] NVIDIA. (2015, March) NVIDIA CUDA Getting Started Guide for Linux: Installation and Verification on Linux Systems. [Online]. http://developer.download.nvidia.com/compute/cuda/7_0/Prod/doc/CUDA_Getting_Started_Linux.pdf
[42] NVIDIA. (2015, Mar) CUDA Getting Started Guide for Linux. [Online]. http://developer.download.nvidia.com/compute/cuda/7_0/Prod/doc/CUDA_Getting_Started_Linux.pdf
[43] IBM. IBM Support: Fix Central. [Online]. http://www-933.ibm.com/support/fixcentral/systemx/groupView?query.productGroup=ibm%2FBladeCenter
[44] IBM. (2015) IBM Fix Central. [Online]. http://www-933.ibm.com/support/fixcentral/systemx/groupView?query.productGroup=ibm%2FBladeCenter
[45] John Paul Walters et al., "GPU Passthrough Performance: A Comparison of KVM, Xen, VMWare ESXi, and LXC for CUDA and OpenCL Applications," in IEEE Cloud, 2014.