Installing and configuring a simple grid using PBS

1) Introduction
2) Installing and configuring Linux
   a. Installing Linux
      i. Installation Type
      ii. Network Devices
      iii. Firewall
      iv. Time zone
      v. Root password
      vi. Date and Time: Network Time Protocol
      vii. System User
   b. Setting up accounts
   c. Configuring Linux
      i. Three machines running Linux Debian
      ii. Updating the sources list file
      iii. Updating the system
      iv. A working Yum configuration
      v. Time synchronization
3) Deploying Torque (Open PBS)
   a. Deploying RSH
   b. Downloading and Installing Torque
   c. Configuring and Deploying Torque (PBS)
   d. Starting PBS
   e. Testing PBS
4) Deploying The PostgreSQL Relational Database
   a. First
   b. Initialization
   c. Starting the database
5) Fixing Java and ANT
   a. Removing default Java
   b. Java installation
   c. Installing Ant
6) Installing the Globus Toolkit
   a. Installing Software Pre-requisites
      i. Installing "openssl"
      ii. Installing "dpkg" command
      iii. Installing "zlib" command
      iv. Installing "gcc" command
      v. Installing "g++" command
      vi. Installing "GNU tar" command
      vii. Installing "GNU Make" command
      viii. Installing "GNU sed" command
      ix. Installing "sudo" command
      x. Installing "XML Parser"
   b. Creating the "globus" user
   c. Configuring Paths
   d. Configuring the Toolkit
   e. Building and installing the Toolkit
   f. Set environment variables
   g. Creating a Certificate Authority
   h. Obtaining a Host Certificate on nodeB
   i. Making a Copy for the Container
   j. Creating the grid-mapfile
   k. Configuring the RFT Service
   l. Configuring sudo
   m. Starting the Container
   n. Starting globus-gridftp-server
   o. Repeat for nodeA
   p. Obtaining Credentials for Generic User
   q. Testing the Grid Services on nodeB
   r. Obtaining host credentials for nodeA
   s. Testing globus-gridftp-server on nodeA
   t. Testing file staging
   u. Completing Deployment on nodeC
7) Connecting Globus Gram WS and Torque (Open PBS)
   a. Building the WS GRAM PBS jobmanager
   b. Testing the GRAM WS PBS jobmanager
8) Installing PBS execution hosts (Linux)
   a. Introducing Torque to the Worker Nodes
   b. Installing Torque on the Worker Nodes
   c. Testing the cluster
Appendix B: Installing and configuring a simple grid

1) Introduction

This tutorial shows how to create a simple grid consisting of:
o a grid client (nodeA)
o two "clusters" of Torque (PBS) job-managed machines, represented by nodeB and nodeC
[Figure 1: Proposed grid architecture. nodeA, nodeB, and nodeC are connected through the Internet; PBS worker nodes sit behind the cluster head nodes nodeB and nodeC.]
This tutorial requires a rudimentary knowledge of Linux, Linux system administration, networking, and basic programming principles.
Now let's build the grid!
2) Installing and configuring Linux

a. Installing Linux

This tutorial uses the Debian Linux operating system, which can be downloaded from http://cdimage.debian.org/debian-cd/5.0.3/i386/iso-cd/. Create your Debian installation CDs from the downloaded iso images.
They are big downloads, so you should verify the checksums of the images before burning the CDs. During the installation choose the following configuration:

i. Installation Type:

Choose the installation type: Server.
ii. Network Devices:

This is where your installation may vary. While the initial part of each hostname should be the same, the domain, IP address, netmask, gateway, and DNS will most likely be different depending on the network parameters of your institution. The parameters for machines nodeA, nodeB, and nodeC used to create this tutorial are listed below. If you do not understand these parameters, or why yours may be different, the rest of this tutorial will probably be very difficult. Note: in this tutorial nodeA, nodeB, and nodeC are the actual host names of the machines.
nodeA
  IP: 172.16.1.1
  Netmask: 255.255.0.0
  Hostname: nodeA.isim.tn
  Gateway: 172.16.0.1
  Primary DNS: 193.95.66.10

nodeB
  IP: 172.16.2.1
  Netmask: 255.255.0.0
  Hostname: nodeB.isim.tn
  Gateway: 172.16.0.1
  Primary DNS: 193.95.66.10

nodeC
  IP: 172.16.3.1
  Netmask: 255.255.0.0
  Hostname: nodeC.isim.tn
  Gateway: 172.16.0.1
  Primary DNS: 193.95.66.10

iii. Firewall:

No firewall. Disable SELinux.

iv. Time zone:

Set the time zone to your proper time zone.
v. Root password:

admin

WARNING! If you haven't figured it out already, we are instructing you to set up a very insecure network. You may of course choose a more secure root password.

vi. Date and Time: Network Time Protocol:

Enable Network Time Protocol. Keep the defaults: 0.pool.ntp.org, 1.pool.ntp.org, 2.pool.ntp.org.
vii. System User:

You can create a user named "sysuser" with the password "sysuser". This step isn't really necessary, as this account will not be used much, but it is always nice to have a miscellaneous user account available.
b. Setting up accounts

As root (note: from this point on, all of the following commands must be run as root, the system administrator, unless another user is indicated):

In order to install and configure Globus you can't use the root account; you need to create a normal user account. Do this on:
nodeA : because it will represent a client machine running job submission and other tools, so it needs to have Globus installed
nodeB and nodeC : because they will represent Linux cluster head nodes running various Globus web services, including WS GRAM and a GridFTP server

Edit the file /etc/group and add the line:

globus:x:501:
If you want, you can choose a group ID number other than 501 and/or a name other than 'globus'.

command

/usr/sbin/useradd -c "Globus User" -g 501 -m -u 501 globus
passwd globus
Next create a generic user account. This account will be the one used to exercise the grid services and tools. This should be done on:

nodeA : because it will represent a user somewhere on a network exercising and using grid services and grid tools
nodeB and nodeC : because it will be helpful for testing the Torque (OpenPBS) installation
command

/usr/sbin/useradd -c "Haitham User" -g 100 -m -u 101 haitham
This will create user 'haitham' in group 'users' along with the home directory /home/haitham. The group 'users' has group ID 100 on most Linux systems. Set the passwords for the "globus" and "haitham" accounts to 'globususer' and 'haithamuser' (or any other passwords you choose) by running the following two commands as root on all three systems.

command

passwd globus
passwd haitham

c. Configuring Linux
The default installation instructions above should leave you with these starting assumptions:

i. Three machines running Linux Debian:

For instructions on how to do this please see the Linux Debian web site: http://www.debian.org/
ii. Updating the sources list file

As root, update the "/etc/apt/sources.list" file (adapted from http://www.mayin.org/ajayshah/computing/debian-principles.html). This operation must be done on all three machines.

/etc/apt/sources.list:

#deb cdrom:[Debian GNU/Linux 5.0.3 _Lenny_ 20090905-08:23]/ lenny main - Official i386 CD Binary-1

deb http://ftp.fr.debian.org/debian/ lenny main
deb-src http://ftp.fr.debian.org/debian/ lenny main

deb http://security.debian.org/ lenny/updates main
deb-src http://security.debian.org/ lenny/updates main

deb http://volatile.debian.org/debian-volatile lenny/volatile main
deb-src http://volatile.debian.org/debian-volatile lenny/volatile main
iii. Updating the system

command

apt-get update

iv. A working Yum configuration:
This is so that administrators are able to install, update, and remove packages using Yum. Verify that Yum is installed. (Throughout this tutorial, "command" introduces the command that must be run, and "verifying" shows the verification command and the result you should see.)

command

yum --version

verifying

If Yum was not installed during the installation of Debian, instructions for retrieving and installing Yum can be found on this web site: http://www.phy.duke.edu/~rgb/General/yum_article/yum_article/node10.html

command

yum -y update

verifying

Note that this first update will be a bit time consuming, but upon completion one should have a fresh install of the latest packages.
v. Time synchronization

Each of the machines should be running some type of time synchronization software so that the system times on the machines are accurate and consistent to within a few minutes. If ntpd (the Network Time Protocol daemon) is to be used, the following /etc/ntp.conf file will probably work; however, if you followed the instructions in this tutorial for installing Linux there will already be an ntp.conf file that more than likely contains this very information.

command

cat /etc/ntp.conf

verifying

server 0.pool.ntp.org
server 1.pool.ntp.org
server 2.pool.ntp.org
driftfile /var/lib/ntp/drift
Also keep in mind that the machines need to have the network configured so that those time servers can be reached. You can test by doing this:

command

host 0.pool.ntp.org

verifying

0.pool.ntp.org has address 80.35.31.228
0.pool.ntp.org has address 80.163.145.206
0.pool.ntp.org has address 142.58.206.202
0.pool.ntp.org has address 199.103.21.233
0.pool.ntp.org has address 202.234.64.222
0.pool.ntp.org has address 213.210.55.91
0.pool.ntp.org has address 213.239.193.168
0.pool.ntp.org has address 216.27.160.99
0.pool.ntp.org has address 220.164.192.140
0.pool.ntp.org has address 62.152.126.5
0.pool.ntp.org has address 62.193.225.80
0.pool.ntp.org has address 64.172.230.138
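If ntpd is installed and running, you can also ask the daemon directly which servers it is tracking. This is an optional sanity check, sketched here; the peer list, delay, and offset values will differ on your machines:

command

ntpq -p

Each of the pool servers configured above should appear as a row in the output.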
Each of the machines should be configured so that all hostnames resolve correctly to IP addresses and vice versa. Accomplish this by editing the file /etc/hosts on each of the machines (nodeA, nodeB, nodeC). When the /etc/hosts file is properly edited, the cat command should yield output similar to this:

command

cat /etc/hosts

verifying

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1   localhost.localdomain   localhost
172.16.1.1  nodeA.isim.tn           nodeA
172.16.2.1  nodeB.isim.tn           nodeB
172.16.3.1  nodeC.isim.tn           nodeC

Note in particular the format of the entries above:

IP address    FULLY QUALIFIED DOMAIN NAME    alias

A common mistake is to put the alias before the FQDN. That is wrong: the FQDN should be first so that a lookup on the IP address returns the FQDN and not the short alias.

For the purposes of this tutorial we will use the following names (short aliases) for the nodes: nodeA, nodeB, nodeC.
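You can sanity-check name resolution through the system resolver with 'getent', which (unlike direct DNS queries) consults /etc/hosts. This optional check is sketched for nodeB; repeat it for the other hosts, and expect output similar to the following for both the forward and reverse lookups:

command

getent hosts nodeB.isim.tn
getent hosts 172.16.2.1

verifying

172.16.2.1      nodeB.isim.tn nodeB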
3) Deploying Torque (Open PBS)

Now it's time to deploy Torque (Open PBS, or just PBS) on nodeB and nodeC as a "remote" batch system; later we will configure Globus WS GRAM so that jobs can be submitted into PBS via Globus from "the grid".

a. Deploying RSH
By default PBS will want to use the rsh and rcp tools to copy input and output files around, even if jobs are only running on this single-node "cluster". So the first step is to get rsh and similar tools installed. They most likely will not have been installed, since to some extent they are a security risk, but for the purposes of this tutorial there is nothing to worry about.

As root : Start by making sure that xinetd is installed:

command

rpm -qa | grep xinetd
verifying
xinetd-2.3.13-6
Also make sure that it is configured to start up at boot time:

command

/sbin/chkconfig --list | grep xinetd

verifying

xinetd  0:off  1:off  2:off  3:on  4:on  5:on  6:off
xinetd based services:
If xinetd is not available it can be installed by doing: command
apt-get install xinetd
Next install the two necessary rsh packages: command
apt-get install rsh-server apt-get install rsh
By default the rsh and rlogin services will not be enabled. To enable them, edit the files /etc/xinetd.d/rsh and /etc/xinetd.d/rlogin and change 'disable' to 'no' (a sample of the edited file is shown after the restart command below). Then do:
command
/etc/init.d/xinetd restart
to restart xinetd (or 'start' if it is not running).
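For reference, after the edit each of those files should look roughly like the sketch below (shown for /etc/xinetd.d/rsh; the defaults shipped with your distribution may differ slightly, and the important line is "disable = no"):

service shell
{
        disable         = no
        socket_type     = stream
        wait            = no
        user            = root
        log_on_success  += USERID
        log_on_failure  += USERID
        server          = /usr/sbin/in.rshd
}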
Next we have to configure the machine to allow access via rsh and rlogin from only itself. To do this, edit the file /etc/hosts.equiv and add a single line with the IP address for nodeB, then make sure the IP address was entered correctly with the following command. Note that this file may not yet exist, or may simply be empty.
command
cat /etc/hosts.equiv
verifying
172.16.2.1
Next test to make sure that rsh works for user 'haitham':

command

/usr/bin/rsh nodeB /usr/bin/whoami

verifying

haitham

b. Downloading and Installing Torque
Download the source code from http://www.clusterresources.com/downloads/torque/torque-2.0.0p7.tar.gz. For this demo the Torque tar.gz archive was downloaded using the sysuser account into the sysuser home directory (/home/sysuser), then unpacked, configured, built, and installed as root but from the sysuser home directory. A typical way of doing this is to use the 'wget' command:

As sysuser :

command

wget http://clusterresources.com/downloads/torque/torque-2.0.0p7.tar.gz
As root : Execute the following commands to unpack, configure, build, and install Torque (Open PBS). These commands produce output, but it is not shown here.

command

tar -zxf torque-2.0.0p7.tar.gz
cd torque-2.0.0p7
./configure --prefix=/opt/pbs
make
make install
make packages
./torque-package-clients-linux-i686.sh --install --destdir /opt/pbs
./torque-package-mom-linux-i686.sh --install --destdir /opt/pbs
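Before configuring anything, it can be reassuring to confirm that the binaries landed under the --prefix chosen above. This quick listing is not part of the original procedure, just a sanity check:

command

ls /opt/pbs/bin /opt/pbs/sbin

You should see client tools such as qsub, qstat, and qmgr under bin, and pbs_server, pbs_mom, and pbs_sched under sbin.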
c. Configuring and Deploying Torque (PBS)
As root : Run the following command to begin the initial configuration of the PBS server:
command
/opt/pbs/sbin/pbs_server -t create
/opt/pbs/bin/qmgr
When qmgr is run it will start a "prompt session" that expects some commands to be entered. Proceed with the commands below so your session looks like the following. Note that the hostname in the first command given to qmgr should be in all lowercase, even if technically the host name has an uppercase letter in it.

command

Qmgr: set server operators = [email protected]
Qmgr: create queue batch
Qmgr: set queue batch queue_type = Execution
Qmgr: set queue batch started = True
Qmgr: set queue batch enabled = True
Qmgr: set queue batch max_queuable=3
Qmgr: set server default_queue = batch
Qmgr: set server resources_default.nodes = 1
Qmgr: set server scheduling = True
Qmgr: quit
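To double-check what was just entered, qmgr can also print the server configuration non-interactively. This is an optional verification; the output should mirror the settings above:

command

/opt/pbs/bin/qmgr -c 'print server'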
Next we need to configure PBS to understand which nodes in the "cluster" are available to be used. Create the file /usr/spool/PBS/server_priv/nodes:

command

touch /usr/spool/PBS/server_priv/nodes

Then with vi or a similar editor add the following line:

nodeB.isim.tn np=2

The 'np=2' is because the machine used while developing this tutorial has two processors. If your computer has only one, then leave off the 'np=' option, or set it appropriately (e.g., np=2 for two processors).
A cat command on the file should yield a similar resulting output: command
cat /usr/spool/PBS/server_priv/nodes
verifying
nodeB.isim.tn np=2
Next we need to configure PBS so that it understands which node in the "cluster" is acting as the server. Create the file /usr/spool/PBS/mom_priv/config and edit it so that it looks similar to this:

command

cat /usr/spool/PBS/mom_priv/config

verifying

$pbsserver nodeB.isim.tn
$logevent 255

d. Starting PBS
As root : Run the following three commands to start all of the necessary PBS components: command
/opt/pbs/sbin/pbs_mom
/opt/pbs/bin/qterm -t quick
/opt/pbs/sbin/pbs_server
After a few seconds you should be able to query to see which nodes are part of the PBS "cluster":

command
/opt/pbs/bin/pbsnodes -a
verifying
nodeB.isim.tn
  state = free
  np = 2
  ntype = cluster
  status = opsys=linux,uname=Linux nodeB.isim.tn 2.6.11-1.1369_FC4smp #1 SMP Thu Jun 2 23:08:39 EDT 2005 i686, sessions=1894, nsessions=1, nusers=1, idletime=7528, totmem=3065064kb, availmem=3015308kb, physmem=1033456kb, ncpus=2, loadave=0.00, netload=127730552, state=free, jobs=? 0, rectime=1139525328
At this point PBS is ready to accept jobs, but it will not actually run the jobs because there is no scheduler to schedule the jobs.
For this tutorial we will use the default simple scheduler. See the Torque documentation to learn about other schedulers. To start up the default simple scheduler run the following:

command

/opt/pbs/sbin/pbs_sched

e. Testing PBS
As haitham : Now any user on nodeB should be able to directly submit a job to be run via PBS and should be able to query the state of the job. To test, run the following command to submit a job that will sleep for one minute and then print out the current time and date:

command
echo "sleep 60;date" | /opt/pbs/bin/qsub
verifying
4.nodeB.isim.tn
Note that '4' might be a different integer, depending on exactly how many times you run this command. It is a simple job ID. To query for the status of the sleep job run: command
/opt/pbs/bin/qstat
verifying

Job id           Name    User     Time Use  S  Queue
---------------  ------  -------  --------  -  -----
4.nodeb          STDIN   haitham  0         R  batch
After the job completes, user haitham should have the following files in the home directory:

command
ls
verifying
STDIN.e4  STDIN.o4
The "STDIN.o4" file will contain the standard out from the job. It should be the time and date: command
cat STDIN.o4
verifying
Tue Feb 9 17:09:53 CST 2010
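Beyond piping simple commands into qsub, jobs are more commonly submitted as scripts containing PBS directives. The following is a minimal sketch, not part of the original procedure; the job name, output file names, and the script name testjob.sh are illustrative, while the queue and node count match the qmgr configuration above:

#!/bin/bash
#PBS -N testjob        # job name
#PBS -q batch          # the queue created earlier
#PBS -l nodes=1        # matches resources_default.nodes
#PBS -o testjob.out    # file for standard output
#PBS -e testjob.err    # file for standard error

sleep 60
date

Submit it with '/opt/pbs/bin/qsub testjob.sh' and monitor it with qstat exactly as above.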
Note: Throughout this tutorial there are numerous instances of environment variables being set for the root and other users. Experienced Linux users may choose to set environment variables in ".rc" or similar files that are automatically executed upon login. Don't forget that any entries you should type appear in red, and any action you should take appears in black.
4) Deploying The PostgreSQL Relational Database

a. First
As root : We will want to run the Globus Reliable File Transfer (RFT) service on nodes B and C since they are the head nodes for our "clusters". RFT requires a relational database backend in order to preserve state across machine shutdowns. Depending on the details of your Debian installation the PostgreSQL database may already be installed. You can check using the 'rpm' command as shown: command
rpm -qa | grep postgres
verifying
postgresql-8.0.3-1
postgresql-server-8.0.3-1
postgresql-libs-8.0.3-1
You will need to have all three of those packages installed. If they are not installed you can use 'apt-get' to install the packages. Once you are confident the packages are installed you want to make sure that a 'postgres' user account is available: command
grep postgres /etc/passwd
verifying
postgres:x:26:26:PostgreSQL Server:/var/lib/pgsql:/bin/bash
If a 'postgres' user account is not available for some reason please create the account now using the 'useradd' command.
We do not want postgres to start upon boot right now, and we do not want to use the /etc/init.d/postgresql script; it is too general and doesn't suit our needs. Use the 'chkconfig' command to make sure that postgres will not be started automatically:

command
/sbin/chkconfig --list | grep postgres
verifying
postgresql 0:off 1:off 2:off 3:off 4:off 5:off 6:off
If the output shows 'on' for any of the runlevels 0 through 6, use the chkconfig command to turn that runlevel 'off'.

b. Initialization
Next we need to initialize the PostgreSQL database. We will use a non-standard location for the PostgreSQL database files in order to prevent any conflicts with previously used databases. As root : Begin by creating a directory that is owned by user postgres and in group postgres:

command
mkdir -p /opt/pgsql/data
chown -R postgres /opt/pgsql
chgrp -R postgres /opt/pgsql
Next become the 'postgres' user: As postgres : command
su - postgres
Initialize the database by running the following command, being sure to use the -D option to point to the location where the new database files will be stored:
command
/usr/bin/initdb -D /opt/pgsql/data
verifying
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale en_US.UTF-8.
The default database encoding has accordingly been set to UNICODE.
fixing permissions on existing directory /opt/pgsql/data ... ok
creating directory /opt/pgsql/data/global ... ok
creating directory /opt/pgsql/data/pg_xlog ... ok
creating directory /opt/pgsql/data/pg_xlog/archive_status ... ok
creating directory /opt/pgsql/data/pg_clog ... ok
creating directory /opt/pgsql/data/pg_subtrans ... ok
creating directory /opt/pgsql/data/base ... ok
creating directory /opt/pgsql/data/base/1 ... ok
creating directory /opt/pgsql/data/pg_tblspc ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 1000
creating configuration files ... ok
creating template1 database in /opt/pgsql/data/base/1 ... ok
initializing pg_shadow ... ok
enabling unlimited row size for system tables ... ok
initializing pg_depend ... ok
creating system views ... ok
loading pg_description ... ok
creating conversions ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
vacuuming database template1 ... ok
copying template1 to template0 ... ok
WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the -A option the next time you run initdb.
Success. You can now start the database server using:
    /usr/bin/postmaster -D /opt/pgsql/data
or
    /usr/bin/pg_ctl -D /opt/pgsql/data -l logfile start

The database is now initialized.

c. Starting the database
As user postgres start the database. It is important that you use both the -i and the -D flags.
You should redirect stdout and stderr to a file for logging purposes. Start the database in the background: command
/usr/bin/postmaster -i -D /opt/pgsql/data > /opt/pgsql/logfile 2>&1 &
ps auwwwwx | grep post
verifying
root      5544  0.0  0.1   4420  1152 pts/0  S  11:19  0:00 su - postgres
postgres  5545  0.2  0.1   4384  1440 pts/0  S  11:19  0:00 -bash
postgres  5602  0.0  0.2  19476  3056 pts/0  S  11:22  0:00 /usr/bin/postmaster -i -D /opt/pgsql/data
postgres  5603  0.0  0.1   9392  2052 pts/0  S  11:22  0:00 postgres: logger process
postgres  5605  0.0  0.3  19476  3116 pts/0  S  11:22  0:00 postgres: writer process
postgres  5606  0.0  0.2  10392  2088 pts/0  S  11:22  0:00 postgres: stats buffer process
postgres  5607  0.0  0.2   9568  2208 pts/0  S  11:22  0:00 postgres: stats collector process
You should see the preceding or similar processes running after starting the database. There is more configuration to be done, but it will be done later, after installing the Globus Toolkit.
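Because the -i flag enables TCP/IP connections (which RFT will need later), you can optionally confirm that Postgres is listening on its default port, 5432. This check is a sketch and the exact netstat output format varies:

command

netstat -ltn | grep 5432

verifying

tcp        0      0 0.0.0.0:5432          0.0.0.0:*             LISTEN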
Remember that Postgres needs to be deployed in this way on both nodes B and C. It is not necessary on nodeA.
5) Fixing Java and ANT

Java and ANT need to be fixed on all three machines, because the Globus Toolkit will be installed on all three machines.

a. Removing default Java
The default java installed for Debian must be removed. It may not work with the Globus toolkit and it is easier and less troublesome to remove it entirely. First see if the default java is installed: command
which java
verifying
/usr/bin/java
That file is usually a symlink: command
ls -alh /usr/bin/java
verifying
lrwxrwxrwx 1 root root 22 Oct 7 16:06 /usr/bin/java -> /etc/alternatives/java
That link is actually a link to the actual binary: command
ls -alh /etc/alternatives/java
verifying
lrwxrwxrwx 1 root root 35 Oct 7 16:06 /etc/alternatives/java -> /usr/lib/jvm/jre-1.4.2-gcj/bin/java
You can use 'rpm' to query and see which package actually owns the binary: command
rpm -qf /usr/lib/jvm/jre-1.4.2-gcj/bin/java
verifying
java-1.4.2-gcj-compat-1.4.2.0-40jpp_31rh.FC4.2
With the name of the binary package known, use apt-get to remove it. Note that the package name given to the remove command should be the entire package name. This will remove a fair number of packages, but that is OK.

command

apt-get remove java-1.4.2-gcj-compat-1.4.2.0-40jpp_31rh

verifying

Package                       Arch    Version             Repository  Size
Removing:
 java-1.4.2-gcj-compat        i386    1.4.2.0-40jpp_31rh  installed   1.7 k
Removing for dependencies:
 ant                          i386    1.6.2-3jpp_8fc      installed   7.6 M
 ant-apache-bcel              i386    1.6.2-3jpp_8fc      installed   5.6 k
 ant-apache-log4j             i386    1.6.2-3jpp_8fc      installed   8.6 k
 ant-apache-oro               i386    1.6.2-3jpp_8fc      installed   3.0 k
 ant-apache-regexp            i386    1.6.2-3jpp_8fc      installed   65 k
 ant-apache-resolver          i386    1.6.2-3jpp_8fc      installed   3.7 k
 ant-commons-logging          i386    1.6.2-3jpp_8fc      installed   4.0 k
 ant-javamail                 i386    1.6.2-3jpp_8fc      installed   6.7 k
 ant-jdepend                  i386    1.6.2-3jpp_8fc      installed   38 k
 ant-jmf                      i386    1.6.2-3jpp_8fc      installed   376
 ant-junit                    i386    1.6.2-3jpp_8fc      installed   115 k
 ant-nodeps                   i386    1.6.2-3jpp_8fc      installed   395 k
 ant-scripts                  i386    1.6.2-3jpp_8fc      installed   12 k
 ant-swing                    i386    1.6.2-3jpp_8fc      installed   3.3 k
 ant-trax                     i386    1.6.2-3jpp_8fc      installed   122 k
 avalon-logkit                noarch  1.2-2jpp_4fc        installed   81 k
 java-1.4.2-gcj-compat-devel  i386    1.4.2.0-40jpp_31rh  installed   2.2 k
 java-1.4.2-gcj-compat-src    i386    1.4.2.0-40jpp_31rh  installed   0.0
 jessie                       noarch  1.0.0-8             installed   391 k
 ldapjdk                      noarch  4.17-1jpp_2fc       installed   439 k
 struts11                     i386    1.1-1jpp_7fc        installed   5.9 k

Transaction Summary
===========================
Install   0 Package(s)
Update    0 Package(s)
Remove   23 Package(s)
Total download size: 0
Is this ok [y/N]: y
...
Removed: java-1.4.2-gcj-compat.i386 0:1.4.2.0-40jpp_31rh
Dependency Removed:
 ant.i386 0:1.6.2-3jpp_8fc
 ant-antlr.i386 0:1.6.2-3jpp_8fc
 ant-apache-bcel.i386 0:1.6.2-3jpp_8fc
 ant-apache-log4j.i386 0:1.6.2-3jpp_8fc
 ant-apache-oro.i386 0:1.6.2-3jpp_8fc
 ant-apache-regexp.i386 0:1.6.2-3jpp_8fc
 ant-apache-resolver.i386 0:1.6.2-3jpp_8fc
 ant-commons-logging.i386 0:1.6.2-3jpp_8fc
 ant-javamail.i386 0:1.6.2-3jpp_8fc
 ant-jdepend.i386 0:1.6.2-3jpp_8fc
 ant-jmf.i386 0:1.6.2-3jpp_8fc
 ant-junit.i386 0:1.6.2-3jpp_8fc
 ant-nodeps.i386 0:1.6.2-3jpp_8fc
 ant-scripts.i386 0:1.6.2-3jpp_8fc
 ant-swing.i386 0:1.6.2-3jpp_8fc
 ant-trax.i386 0:1.6.2-3jpp_8fc
 avalon-logkit.noarch 0:1.2-2jpp_4fc
 java-1.4.2-gcj-compat-devel.i386 0:1.4.2.0-40jpp_31rh
 java-1.4.2-gcj-compat-src.i386 0:1.4.2.0-40jpp_31rh
 jessie.noarch 0:1.0.0-8
 ldapjdk.noarch 0:4.17-1jpp_2fc
 struts11.i386 0:1.1-1jpp_7fc
Complete!
Now check to make sure that the default java is gone: command
which java
verifying
/usr/bin/which: no java in(/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/ local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin:/root/bin)
command
ls -l /usr/bin/java
verifying
ls: /usr/bin/java: No such file or directory

b. Java installation
Note that there are a lot of choices on what to download. What you want is the JDK 6.0 or a newer version. You do not need the entire J2EE suite of tools.
command
apt-get install java6-sun
verifying
java -version
java version "1.6.0"
gij (GNU libgcj) version 4.3.2
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

which java
/usr/lib/jvm/java-6-sun-1.6.0.16/bin/java
which javac
/usr/lib/jvm/java-6-sun-1.6.0.16/bin/javac
configuring
export JAVA_HOME=/usr/lib/jvm/java-6-sun-1.6.0.16
export PATH=$JAVA_HOME/bin:$PATH
The configuring command will be used in the section "Configuring Paths" below.
c. Installing Ant
Ant is a 'make-like' tool for Java development. It is required to compile the source of some components of the Globus Toolkit. When the default Java was removed, more than likely Ant was removed as well, and that is desired.
command
apt-get install ant
verifying
ls /usr/share/ant/
bin  docs  etc  INSTALL  KEYS  lib  LICENSE  LICENSE.dom  LICENSE.sax  LICENSE.xerces  NOTICE  README  TODO  welcome.html  WHATSNEW

which ant
/usr/share/ant/

ant -version
Apache Ant version 1.6.5 compiled on June 2 2005

configuring

export ANT_HOME=/usr/share/ant/
export PATH=$ANT_HOME/bin:$PATH

The configuring command will be used in the section "Configuring Paths" below.
6) Installing the Globus Toolkit

Now let's install and configure the Globus Toolkit [69] [70] [71] [72]. It should be installed as user 'globus' and not as root. The toolkit should be deployed on:

nodeA : since that node will serve as the 'client' machine; it will be able to send jobs to the other nodes
nodeB and nodeC : since those nodes will host the Globus WS GRAM services that front-end the PBS batch system, along with other Globus grid services

a. Installing Software Pre-requisites
First of all, we need to install and configure, on all the machines, the software prerequisites that Globus needs to work correctly.

i. Installing "openssl"

command

apt-get install openssl

verifying

openssl version
OpenSSL 0.9.8g 19 Oct 2007

ii. Installing "dpkg" command
command
apt-get install dpkg
verifying
dpkg --version

iii. Installing "zlib" command
command
apt-get install zlib-bin zlib1g zlib1g-dev
verifying
dpkg --list | grep zlib
ii  zlib-bin    1:1.2.3.3.dfsg-12  compression library - sample programs
ii  zlib1g      1:1.2.3.3.dfsg-12  compression library - runtime
ii  zlib1g-dev  1:1.2.3.3.dfsg-12  compression library - development

iv. Installing "gcc" command
command
apt-get install gcc
verifying
gcc --version

v. Installing "g++" command
command
apt-get install g++
verifying
g++ --version

vi. Installing "GNU tar" command
command
apt-get install tar
verifying
tar --version
tar (GNU tar) 1.20
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by John Gilmore and Jay Fenlason.

vii. Installing "GNU Make" command
command
apt-get install make
verifying
make --version
GNU Make 3.81
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
This program built for i486-pc-linux-gnu

viii. Installing "GNU sed" command
command
apt-get install sed
verifying
sed --version
GNU sed version 4.1.5
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE,
to the extent permitted by law.

ix. Installing "sudo" command
command
apt-get install sudo
verifying
sudo -V

x. Installing "XML Parser"
command
apt-get install libxml-parser-perl
verifying
locate XML/Parser.pm
/usr/lib/perl5/XML/Parser.pm

b. Creating the "globus" user
Create a user named “globus”. This non-privileged user will be used to perform administrative tasks such as starting and stopping the container, deploying services, etc. Create the target directory as root, then chown it to the globus user. command
useradd globus
mkdir /usr/local/globus-4.2.1.1
chown globus:globus /usr/local/globus-4.2.1.1
verifying
ls -all /usr/local/globus-4.2.1.1
We assume that the toolkit is being installed to "/usr/local/globus-4.2.1.1", but you can replace this with any other directory you wish to install to.
c. Configuring Paths

As globus : Before starting the installation, the Globus Toolkit needs ANT_HOME, JAVA_HOME, and GLOBUS_LOCATION defined, and the Java virtual machine and the Java compiler must be on your PATH.
If you exit the globus account, the export commands will need to be run again upon re-entry. The instructions below show how it is done from the command line; however, if you are logging off and on you might want to set this up via a profile file so it is automatically executed.

command

export ANT_HOME=/usr/share/ant/
export PATH=$ANT_HOME/bin:$PATH
export JAVA_HOME=/usr/lib/jvm/java-6-sun-1.6.0.16
export PATH=$JAVA_HOME/bin:$PATH
export GLOBUS_LOCATION=/usr/local/globus-4.2.1.1
Note: Throughout this tutorial there are numerous instances of environment variables being set for the root and other users. Experienced Linux users may choose to set environment variables in ".bashrc" or similar files that are automatically executed upon login, as in the sketch below.
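For example, the exports could be appended to the globus user's ~/.bashrc so that they take effect on every login. This is a sketch using the paths from this tutorial; adjust them if your installation differs:

# appended to /home/globus/.bashrc (sketch; paths per this tutorial)
export ANT_HOME=/usr/share/ant/
export JAVA_HOME=/usr/lib/jvm/java-6-sun-1.6.0.16
export GLOBUS_LOCATION=/usr/local/globus-4.2.1.1
export PATH=$JAVA_HOME/bin:$ANT_HOME/bin:$PATH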
d. Configuring the Toolkit

command

tar zxvf gt4.2.1-x86_deb_4.0-installer.tar.gz
cd gt4.2.1-all-source-installer
./configure --prefix=$GLOBUS_LOCATION
If you need to enable a scheduler, use command line arguments to "./configure" for a more custom install. Here are the lines to enable features which are disabled by default:

Optional Features:
  --enable-prewsmds       Build pre-webservices MDS. Default is disabled.
  --enable-wsgram-condor  Build GRAM Condor scheduler interface. Default is disabled.
  --enable-wsgram-lsf     Build GRAM LSF scheduler interface. Default is disabled.
  --enable-wsgram-pbs     Build GRAM PBS scheduler interface. Default is disabled.
  --enable-i18n           Enable internationalization. Default is disabled.
  --enable-drs            Enable Data Replication Service. Default is disabled.
[...]
Optional Packages:
[...]
  --with-iodbc=dir        Use the iodbc library in dir /usr/lib/libiodbc.so.2.
                          Required for RLS builds.
  --with-gsiopensshargs="args"
                          Arguments to pass to the build of GSI-OpenSSH, like
                          --with-tcp-wrappers

For a full list of options, run "./configure --help" as the globus user.

e. Building and installing the Toolkit
Now it's time to start the installation.

command

make
make install

f. Set environment variables
As root : In order for the system to know the location of the Globus Toolkit commands just installed, source the globus-user-env.sh script.

command

export ANT_HOME=/usr/share/ant/
export JAVA_HOME=/usr/lib/jvm/java-6-sun-1.6.0.16
export GLOBUS_LOCATION=/usr/local/globus-4.2.1.1
. $GLOBUS_LOCATION/etc/globus-user-env.sh

g. Creating a Certificate Authority
As root : In this next section we create a certificate authority. Note that it is only necessary to create a certificate authority on one machine in your 'grid'. For the purposes of this tutorial we will use nodeB, though any of the Globus installations could be used, including a host that is not directly part of your grid.
Now run the 'setup-simple-ca' command to begin the setup process. This is a short, menu-driven script; the input that needs to be typed in is shown in red. Note that you will be prompted twice for a password or pass phrase. Please be sure to remember this pass phrase.
command
$GLOBUS_LOCATION/setup/globus/setup-simple-ca
verifying
WARNING: GPT_LOCATION not set, assuming:
    GPT_LOCATION=/usr/local/globus-4.2.1.1

Certificate Authority Setup

This script will setup a Certificate Authority for signing Globus users certificates. It will also generate a simple CA package that can be distributed to the users of the CA.

The CA information about the certificates it distributes will be kept in:
    /home/globus/.globus/simpleCA/

The unique subject name for this CA is:
    cn=Globus Simple CA, ou=simpleCA-nodeB.isim.tn, ou=GlobusTest, o=Grid

Do you want to keep this as the CA subject (y/n) [y]: n

Enter a unique subject name for this CA:
    cn=,ou=ConsortiumTutorial,ou=GlobusTest,o=Grid
(Fill in the cn= field with the name of your organization.)

Enter the email of the CA (this is the email where certificate requests will be sent to be signed by the CA):
(Enter your e-mail address, or simply a dummy e-mail address, as this is not important.)

The CA certificate has an expiration date. Keep in mind that once the CA certificate has expired, all the certificates signed by that CA become invalid. A CA should regenerate the CA certificate and start re-issuing ca-setup packages before the actual CA certificate expires. This can be done by re-running this setup script.

Enter the number of DAYS the CA certificate should last before it expires. [default: 5 years (1825 days)]:
(Hit Return to accept the default.)

Enter PEM pass phrase:
Verifying - Enter PEM pass phrase:
(Enter your choice of PEM pass phrase; the pass phrase "GlobusTutorial" (without the quotes) was used in the compilation of this tutorial.)

creating CA config package...done.

A self-signed certificate has been generated for the Certificate Authority with the subject:

    /O=Grid/OU=GlobusTest/OU=ConsortiumTutorial/CN=

If this is invalid, rerun this script:
    /usr/local/globus-4.2.1.1/setup/globus/setup-simple-ca
and enter the appropriate fields.

-------------------------------------------------------------------
The private key of the CA is stored in /home/globus/.globus/simpleCA//private/cakey.pem
The public CA certificate is stored in /home/globus/.globus/simpleCA//cacert.pem
The distribution package built for this CA is stored in /home/globus/.globus/simpleCA//globus_simple_ca_768c46a5_setup-0.19.tar.gz

This file must be distributed to any host wishing to request certificates from this CA.

CA setup complete.

The following commands will now be run to setup the security configuration files for this CA:

$GLOBUS_LOCATION/sbin/gpt-build /home/globus/.globus/simpleCA//globus_simple_ca_768c46a5_setup-0.19.tar.gz
$GLOBUS_LOCATION/sbin/gpt-postinstall
-------------------------------------------------------------------
setup-ssl-utils: Configuring ssl-utils package
Running setup-ssl-utils-sh-scripts...

Note: To complete setup of the GSI software you need to run the following script as root to configure your security configuration directory:

/usr/local/globus-4.2.1.1/setup/globus_simple_ca_f1f2d5e6_setup/setup-gsi

For further information on using the setup-gsi script, use the -help option. The -default option sets this security configuration to be the default, and -nonroot can be used on systems where root access is not available.

setup-ssl-utils: Complete
In the output above you may see that a unique hash number for the CA was created. In your case the hash number will be different, since your organization name will be different. As indicated by the output above, we next need to run the 'setup-gsi' command. We will run it with the '-default' flag so that the CA we just created becomes the default certificate authority for certificates created on this node. We also use the '-nonroot' flag in order to keep the entire configuration under the directory $GLOBUS_LOCATION. Despite what the output of the setup-simple-ca command above suggests, this does not have to be run as root.

command

/usr/local/globus-4.2.1.1/setup/globus_simple_ca_f1f2d5e6_setup/setup-gsi -default -nonroot

verifying

setup-gsi: Configuring GSI security
Making trusted certs directory: /usr/local/globus-4.2.1.1/share/certificates/
mkdir /usr/local/globus-4.2.1.1/share/certificates/
Installing /usr/local/globus-4.2.1.1/share/certificates//grid-security.conf.f1f2d5e6...
Running grid-security-config...
Installing Globus CA certificate into trusted CA certificate directory...
Installing Globus CA signing policy into trusted CA certificate directory...
setup-gsi: Complete
Now the CA just created is installed and is the default for requesting certificates on nodeB.

h. Obtaining a Host Certificate on nodeB
Next we want to request a host certificate for nodeB. After making the request we will sign the certificate using the CA on nodeB. After the request is signed we will install the certificate in the proper place on nodeB.
To request a host certificate begin by setting up the environment properly: As root : command
export GLOBUS_LOCATION=/usr/local/globus-4.2.1.1
source $GLOBUS_LOCATION/etc/globus-user-env.sh
Run the 'grid-cert-request' command using the '-host' flag to indicate the fully qualified name of your nodeB, and the '-dir' option to direct the files into the directory $GLOBUS_LOCATION/etc:
command
grid-cert-request -host nodeB.isim.tn -dir $GLOBUS_LOCATION/etc
verifying
ls -alh $GLOBUS_LOCATION/etc/hostcert_request.pem
-rw-r--r-- 1 root root 1.3K Feb 22 10:39 /usr/local/globus-4.2.1.1/etc/hostcert_request.pem
ls -alh $GLOBUS_LOCATION/etc/hostkey.pem
-r-------- 1 root root 887 Feb 22 10:39 /usr/local/globus-4.2.1.1/etc/hostkey.pem
Now that the certificate has been requested, the request must be signed by the CA. To sign the request become user globus again and set up the environment again:
As globus : command
su - globus
export GLOBUS_LOCATION=/usr/local/globus-4.2.1.1
source $GLOBUS_LOCATION/etc/globus-user-env.sh
The certificate request is signed using the command 'grid-ca-sign'. When prompted enter the password for the certificate authority:

command

grid-ca-sign -in $GLOBUS_LOCATION/etc/hostcert_request.pem -out $GLOBUS_LOCATION/etc/hostcert.pem

verifying

To sign the request please enter the password for the CA key:
Password for CA key

ls -alh $GLOBUS_LOCATION/etc/hostcert.pem
-rw-rw-r-- 1 globus globus 2.5K Feb 22 10:45 /usr/local/globus-4.2.1.1/etc/hostcert.pem
ls -alh /home/globus/.globus/simpleCA/newcerts/01.pem
-rw-rw-r-- 1 globus globus 2.5K Feb 22 10:45 /home/globus/.globus/simpleCA/newcerts/01.pem
Note: If the hostcert.pem file exists the grid-ca-sign command may be run with the -f (force) option.
The new signed certificate is at /home/globus/.globus/simpleCA//newcerts/01.pem. As the output above indicates, a signed version of the (public) certificate is kept with the CA files in the home directory of the user signing the certificate (globus in this case), but it is also output where we asked it to be output. Before the host services on nodeB that run as user root (globus-gridftp-server) can use this certificate, we need to make sure the permissions and ownership of the file are correct. The files need to be owned by root with the permissions shown below. You will again have to log in as root to do this.
As root : command
chown root.root /usr/local/globus-4.2.1.1/etc/hostcert.pem
chmod 644 /usr/local/globus-4.2.1.1/etc/hostcert.pem
verifying
ls -alh $GLOBUS_LOCATION/etc/host*.pem
-rw-r--r-- 1 root root 2.5K Feb 22 10:45 /usr/local/globus-4.2.1.1/etc/hostcert.pem
-r-------- 1 root root 887 Feb 22 10:39 /usr/local/globus-4.2.1.1/etc/hostkey.pem

i. Making a Copy for the Container
The host certificate just created is owned by root and will be used by services such as 'globus-gridftp-server'. Most often the other services, and the container they run in, are not run as root; they are run as user 'globus'. Still, these services usually run with a host-type certificate.
So we need to make a copy of the host certificate that the globus user has access to: As root : command
cp $GLOBUS_LOCATION/etc/hostcert.pem $GLOBUS_LOCATION/etc/containercert.pem
chown globus.globus $GLOBUS_LOCATION/etc/containercert.pem
ls -alh $GLOBUS_LOCATION/etc/containercert.pem
-rw-r--r-- 1 globus globus 2.5K Feb 22 10:57 /usr/local/globus-4.2.1.1/etc/containercert.pem
cp $GLOBUS_LOCATION/etc/hostkey.pem $GLOBUS_LOCATION/etc/containerkey.pem
chown globus.globus $GLOBUS_LOCATION/etc/containerkey.pem
ls -alh $GLOBUS_LOCATION/etc/containerkey.pem
-r-------- 1 globus globus 887 Feb 22 10:58 /usr/local/globus-4.2.1.1/etc/containerkey.pem
With the copy for the container in place we can, as user globus, edit the security configuration file so that the container can find the certificate and its key. Use any text editor to edit the file $GLOBUS_LOCATION/etc/globus_wsrf_core/global_security_descriptor.xml
and in that file look for the "key-file" and "cert-file" fields and set the paths to point to the container key and cert respectively. Your file should look like this: command
cat $GLOBUS_LOCATION/etc/globus_wsrf_core/global_security_descriptor.xml
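The file contents were not captured here. As a sketch, in GT 4.2 the relevant portion of the descriptor typically looks like the following; your file may contain additional elements, and only the key-file and cert-file values need to change:

<?xml version="1.0" encoding="UTF-8"?>
<securityConfig xmlns="http://www.globus.org">
    <credential>
        <!-- point these at the container copies created above -->
        <key-file value="/usr/local/globus-4.2.1.1/etc/containerkey.pem"/>
        <cert-file value="/usr/local/globus-4.2.1.1/etc/containercert.pem"/>
    </credential>
    <gridmap value="/usr/local/globus-4.2.1.1/etc/grid-mapfile"/>
</securityConfig>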
j. Creating the grid-mapfile
For now it is easiest to create an empty grid-mapfile. Do this as user 'globus': command
touch $GLOBUS_LOCATION/etc/grid-mapfile
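Entries will be added to this file later, as users are authorized. For orientation, each line of a grid-mapfile maps a certificate subject (DN) to a local account. A hypothetical entry for the generic user, assuming the CA subject fields chosen earlier and a user certificate whose CN is "Haitham User", would look like this:

"/O=Grid/OU=GlobusTest/OU=ConsortiumTutorial/CN=Haitham User" haitham

Entries can also be managed with the grid-mapfile-add-entry tool under $GLOBUS_LOCATION/sbin.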
k. Configuring the RFT Service
Before starting the container and the grid services that run within it such as GRAM WS, we need to configure the RFT service so that it uses the Postgres database we initialized before and we need to initialize the database tables that RFT needs.
First make sure again that Postgres is running: command verifying
ps auwwwwx | grep postmaster
postgres 26088 0.0 0.3 19476 3172 pts/0 S Feb20 0:00 /usr/bin/postmaster -D /opt/pgsql/data -o -i
As postgres : When you create a Postgres password for the globus user, you can use the same value as the UNIX password, "globususer".
As the 'postgres' user run the 'createuser' command to create a 'globus' user for the database. Use the '-A' and '-d' flags so that the globus user has the correct permissions. When prompted enter a password for the globus user (for the database, not the UNIX password). We recommend a password with no spaces or strange characters: command
su postgres
createuser -A -d -P globus
Enter password for new user:
Enter it again:
CREATE USER
Next we need to edit the Postgres permissions file and add a line that allows the 'globus' user to connect to the database from this host. Use any text editor to edit the file /opt/pgsql/data/pg_hba.conf and add the following line:

host rftDatabase "globus" "172.16.2.1" 255.255.255.255 md5

You should replace the IP address with the IP address of your nodeB.
After having edited that file we need to restart the Postgres database. Run the "ps aux" command and look for the /usr/bin/postmaster -i -D /opt/pgsql/data process. Note the process id number for this process and send it a TERM signal. command
kill -SIGTERM "process id"
/usr/bin/postmaster -i -D /opt/pgsql/data > /opt/pgsql/logfile 2>&1 &
[2] 5707
The next step is to create the database and tables that the globus user and the RFT service need. Start by becoming the globus user and making sure your environment is set up properly: As globus : command
su - globus
export GLOBUS_LOCATION=/usr/local/globus-4.2.1.1
source $GLOBUS_LOCATION/etc/globus-user-env.sh
Run the 'createdb' command from the Postgres distribution.
command
createdb rftDatabase
CREATE DATABASE
psql -d rftDatabase -f $GLOBUS_LOCATION/share/globus_wsrf_rft/rft_schema.sql
psql:/usr/local/globus-4.2.1.1/share/globus_wsrf_rft/rft_schema.sql:6: NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "requestid_pkey" for table "requestid"
CREATE TABLE
psql:/usr/local/globus-4.2.1.1/share/globus_wsrf_rft/rft_schema.sql:11: NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "transferid_pkey" for table "transferid"
CREATE TABLE
psql:/usr/local/globus-4.2.1.1/share/globus_wsrf_rft/rft_schema.sql:30: NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "request_pkey" for table "request"
CREATE TABLE
psql:/usr/local/globus-4.2.1.1/share/globus_wsrf_rft/rft_schema.sql:65: NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "transfer_pkey" for table "transfer"
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE INDEX
With the database and tables created, we next need to edit the RFT configuration file so that it knows the correct password to use when authenticating to the database. Use any text editor to edit the file $GLOBUS_LOCATION/etc/globus_wsrf_rft/jndi-config.xml
Search for the password "foo" (it will be the only occurrence of that word in the file) and change it to the password you set for the globus user to use the Postgres database.
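If you prefer to make the change non-interactively, a sed one-liner along these lines should work. This is a sketch: it assumes "foo" occurs nowhere else in the file (as noted above), and "dbpassword" stands in for the password you actually chose:

command

sed -i 's/foo/dbpassword/' $GLOBUS_LOCATION/etc/globus_wsrf_rft/jndi-config.xml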
l. Configuring sudo
A number of the Globus grid services, including WS GRAM, need to use the 'sudo' command in order to execute processes as the user on whose behalf the service is acting. Use the command "visudo" to edit the /etc/sudoers file; it is a special wrapper around "vi" that preserves the file's permissions:

/usr/sbin/visudo (no file or path necessary)

As root : Edit the file /etc/sudoers and add the following two lines. Note: each entry appears here on multiple lines, but should be entered as one line of text.

globus ALL=(haitham) NOPASSWD: /usr/local/globus-4.2.1.1/libexec/globus-gridmap-and-execute -g /usr/local/globus-4.2.1.1/etc/grid-mapfile /usr/local/globus-4.2.1.1/libexec/globus-job-manager-script.pl *

and

globus ALL=(haitham) NOPASSWD: /usr/local/globus-4.2.1.1/libexec/globus-gridmap-and-execute -g /usr/local/globus-4.2.1.1/etc/grid-mapfile /usr/local/globus-4.2.1.1/libexec/globus-gram-local-proxy-tool *
Those two lines will allow user 'globus' to use sudo to execute commands for user 'haitham'. We are using 'haitham' as the generic user account in this tutorial. You can use whatever generic user account on your system that you want, simply change the account name in the lines above to that account. m.
m. Starting the Container
With the container certificate and key available and with RFT configured, we are now ready to start the Globus container and the grid services that will run in it. As user globus make sure that your environment is set up as usual. Also make sure that the java tools are in your PATH:
As globus : command
export JAVA_HOME=/opt/jdk1.5.0_06
export PATH=$JAVA_HOME/bin:$PATH
which java
/usr/lib/jvm/java-6-sun-1.6.0.16/bin/java
It is best to run the container with more than the default amount of memory used by the java virtual machine. This is accomplished by setting an environment variable: command
export GLOBUS_OPTIONS=-Xmx512M
Now use 'globus-start-container' to start the container and the default grid services: command
$GLOBUS_LOCATION/bin/globus-start-container
Starting SOAP server at: https://172.16.2.1:8443/wsrf/services/
With the following services:
[1]: https://172.16.2.1:8443/wsrf/services/TriggerFactoryService
[2]: https://172.16.2.1:8443/wsrf/services/DelegationTestService
[3]: https://172.16.2.1:8443/wsrf/services/SecureCounterService
[4]: https://172.16.2.1:8443/wsrf/services/IndexServiceEntry
[5]: https://172.16.2.1:8443/wsrf/services/DelegationService
[6]: https://172.16.2.1:8443/wsrf/services/InMemoryServiceGroupFactory
[7]: https://172.16.2.1:8443/wsrf/services/mds/test/execsource/IndexService
[8]: https://172.16.2.1:8443/wsrf/services/mds/test/subsource/IndexService
[9]: https://172.16.2.1:8443/wsrf/services/SubscriptionManagerService
[10]: https://172.16.2.1:8443/wsrf/services/TestServiceWrongWSDL
[11]: https://172.16.2.1:8443/wsrf/services/SampleAuthzService
[12]: https://172.16.2.1:8443/wsrf/services/WidgetNotificationService
[13]: https://172.16.2.1:8443/wsrf/services/AdminService
[14]: https://172.16.2.1:8443/wsrf/services/DefaultIndexServiceEntry
[15]: https://172.16.2.1:8443/wsrf/services/CounterService
[16]: https://172.16.2.1:8443/wsrf/services/TestService
[17]: https://172.16.2.1:8443/wsrf/services/InMemoryServiceGroup
[18]: https://172.16.2.1:8443/wsrf/services/SecurityTestService
[19]: https://172.16.2.1:8443/wsrf/services/ContainerRegistryEntryService
[20]: https://172.16.2.1:8443/wsrf/services/NotificationConsumerFactoryService
[21]: https://172.16.2.1:8443/wsrf/services/TestServiceRequest
[22]: https://172.16.2.1:8443/wsrf/services/IndexFactoryService
[23]: https://172.16.2.1:8443/wsrf/services/ReliableFileTransferService
[24]: https://172.16.2.1:8443/wsrf/services/mds/test/subsource/IndexServiceEntry
[25]: https://172.16.2.1:8443/wsrf/services/Version
[26]: https://172.16.2.1:8443/wsrf/services/NotificationConsumerService
[27]: https://172.16.2.1:8443/wsrf/services/IndexService
[28]: https://172.16.2.1:8443/wsrf/services/NotificationTestService
[29]: https://172.16.2.1:8443/wsrf/services/ReliableFileTransferFactoryService
[30]: https://172.16.2.1:8443/wsrf/services/DefaultTriggerServiceEntry
[31]: https://172.16.2.1:8443/wsrf/services/TriggerServiceEntry
[32]: https://172.16.2.1:8443/wsrf/services/PersistenceTestSubscriptionManager
[33]: https://172.16.2.1:8443/wsrf/services/mds/test/execsource/IndexServiceEntry
[34]: https://172.16.2.1:8443/wsrf/services/DefaultTriggerService
[35]: https://172.16.2.1:8443/wsrf/services/TriggerService
[36]: https://172.16.2.1:8443/wsrf/services/gsi/AuthenticationService
[37]: https://172.16.2.1:8443/wsrf/services/TestRPCService
[38]: https://172.16.2.1:8443/wsrf/services/ManagedMultiJobService
[39]: https://172.16.2.1:8443/wsrf/services/RendezvousFactoryService
[40]: https://172.16.2.1:8443/wsrf/services/WidgetService
[41]: https://172.16.2.1:8443/wsrf/services/ManagementService
[42]: https://172.16.2.1:8443/wsrf/services/ManagedExecutableJobService
[43]: https://172.16.2.1:8443/wsrf/services/InMemoryServiceGroupEntry
[44]: https://172.16.2.1:8443/wsrf/services/AuthzCalloutTestService
[45]: https://172.16.2.1:8443/wsrf/services/DelegationFactoryService
[46]: https://172.16.2.1:8443/wsrf/services/DefaultIndexService
[47]: https://172.16.2.1:8443/wsrf/services/ShutdownService
[48]: https://172.16.2.1:8443/wsrf/services/ContainerRegistryService
[49]: https://172.16.2.1:8443/wsrf/services/TestAuthzService
[50]: https://172.16.2.1:8443/wsrf/services/CASService
[51]: https://172.16.2.1:8443/wsrf/services/ManagedJobFactoryService
Not all those services will be exercised in this tutorial, but they are deployed by default and there is no harm in letting them all run. Now that we are sure the container is running correctly, use Ctrl+C to stop it. After a few seconds the container will stop:
Stopped SOAP Axis server at: https://172.16.2.1:8443/wsrf/services/
Start the container again, but this time run it in the background and send the output and errors to a file for logging: command
/usr/local/globus-4.2.1.1/bin/globus-start-container > $HOME/container.out 2>&1 &
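Since the container output now goes to a log file rather than the terminal, you may want to confirm that it started cleanly; a minimal check against the log file created above: command
grep 'Starting SOAP server' $HOME/container.out
tail $HOME/container.out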
n. Starting globus-gridftp-server
The globus-gridftp-server is run as root, so log in as such. Before starting the server set the GRIDMAP environment variable to point to the (now empty) grid-mapfile that you previously created:
As root : command
export GRIDMAP=/usr/local/globus-4.2.1.1/etc/grid-mapfile
Start the server using the -p flag to run it on the default port of 2811 and the -S flag to have it run in the background, detached from the terminal: command
/usr/local/globus-4.2.1.1/sbin/globus-gridftp-server -p 2811 -S
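To confirm that the server is listening on port 2811, you can check from the same host; a sketch using netstat as root (the -p flag shows the owning process): command
netstat -ltnp | grep 2811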
o. Repeat for nodeA
Before testing the grid services repeat the Globus toolkit deployment on nodeA, but stop before creating a certificate authority. You do not need a new certificate authority for nodeA. Only one certificate authority is needed per grid or organization.
As globus : Instead, you want to install the certificate authority files you created on nodeB onto nodeA. Copy the distribution file from nodeB to nodeA (remember that your CA will use a different hash name). Run the scp command as user globus on nodeA: command
scp root@nodeb:/home/globus/.globus/simpleCA/globus_simple_ca_f1f2d5e6_setup-0.19.tar.gz .
root@nodeb's password:
globus_simple_ca_f1f2d5e6_setup-0.19.tar.gz 100% 211KB 210.8KB/s 00:00
As user 'globus' on nodeA install the CA files: command
gpt-build ./globus_simple_ca_f1f2d5e6_setup-0.19.tar.gz
gpt-build ====> CHECKING BUILD DEPENDENCIES FOR globus_simple_ca_f1f2d5e6_setup
gpt-build ====> Changing to /home/globus/BUILD/globus_simple_ca_f1f2d5e6_setup-0.19/
gpt-build ====> BUILDING globus_simple_ca_f1f2d5e6_setup
gpt-build ====> Changing to /home/globus/BUILD
gpt-build ====> REMOVING empty package globus_simple_ca_f1f2d5e6_setup-noflavor-data
gpt-build ====> REMOVING empty package globus_simple_ca_f1f2d5e6_setup-noflavor-dev
gpt-build ====> REMOVING empty package globus_simple_ca_f1f2d5e6_setup-noflavor-doc
gpt-build ====> REMOVING empty package globus_simple_ca_f1f2d5e6_setup-noflavor-pgm_static
gpt-build ====> REMOVING empty package globus_simple_ca_f1f2d5e6_setup-noflavor-rtl
command
gpt-postinstall
running /usr/local/globus-4.2.1.1/setup/globus/./setup-ssl-utils.f1f2d5e6..[ Changing to /usr/local/globus-4.2.1.1/setup/globus/. ]
setup-ssl-utils: Configuring ssl-utils package
Running setup-ssl-utils-sh-scripts...
Note: To complete setup of the GSI software you need to run the following script as root to configure your security configuration directory:
/usr/local/globus-4.2.1.1/setup/globus_simple_ca_f1f2d5e6_setup/setup-gsi
For further information on using the setup-gsi script, use the -help option. The -default option sets this security configuration to be the default, and -nonroot can be used on systems where root access is not available.
setup-ssl-utils: Complete
..Done
Run the 'setup-gsi' command just as you have done previously on nodeB, again using the '-default' and '-nonroot' flags: command
/usr/local/globus-4.2.1.1/setup/globus_simple_ca_f1f2d5e6_setup/setup-gsi -default -nonroot
setup-gsi: Configuring GSI security
Making trusted certs directory: /usr/local/globus-4.2.1.1/share/certificates/
mkdir /usr/local/globus-4.2.1.1/share/certificates/
Installing /usr/local/globus-4.2.1.1/share/certificates//grid-security.conf.f1f2d5e6...
Running grid-security-config...
Installing Globus CA certificate into trusted CA certificate directory...
Installing Globus CA signing policy into trusted CA certificate directory...
setup-gsi: Complete
p. Obtaining Credentials for Generic User
With the Globus toolkit installed on nodeA and your CA files installed, you can now request a certificate for a generic user on nodeA. We will use 'haitham' as the generic user account.
As haitham : Now use the 'grid-cert-request' script with no arguments to create a request for user haitham. When prompted, enter a pass phrase that will protect user haitham's credentials: command
grid-cert-request
A certificate request and private key is being created. You will be asked to enter a PEM pass phrase. This pass phrase is akin to your account password, and is used to protect your key file. If you forget your pass phrase, you will need to obtain a new certificate.
Generating a 1024 bit RSA private key
........++++++
...........++++++
writing new private key to '/home/haitham/.globus/userkey.pem'
Enter PEM pass phrase:
Verifying - Enter PEM pass phrase:
You are about to be asked to enter information that will be incorporated into your certificate request. What you are about to enter is what is called a Distinguished Name or a DN. There are quite a few fields but you can leave some blank. For some fields there will be a default value; if you enter '.', the field will be left blank. command
Level 0 Organization [Grid]:
Level 0 Organizational Unit [GlobusTest]:
Level 1 Organizational Unit [ps.univa.com]:
Name (e.g., John M. Smith) []:
A private key and a certificate request has been generated with the subject:
/O=Grid/OU=GlobusTest/OU=ps.univa.com/CN=Haitham User
If the CN=Haitham User is not appropriate, rerun this script with the -force -cn "Common Name" options.
Your private key is stored in /home/haitham/.globus/userkey.pem
Your request is stored in /home/haitham/.globus/usercert_request.pem
Please e-mail the request to the Test01 [email protected]
You may use a command similar to the following:
cat /home/haitham/.globus/usercert_request.pem | mail [email protected]
Only use the above if this machine can send AND receive e-mail. If not, please mail using some other method.
Your certificate will be mailed to you within two working days. If you receive no response, contact Test01 at [email protected]
Ignore the instructions about emailing the certificate request; instead you will get the request signed yourself. First make sure the request was generated: command
ls -l .globus/usercert_request.pem
-rw-r--r-- 1 haitham users 1346 Feb 23 10:41 .globus/usercert_request.pem
To sign the certificate we will copy the request from nodeA to nodeB, sign the certificate as user 'globus' (since globus owns the CA), and then return the signed certificate to user haitham's account. Begin by going to nodeB and copying over the certificate request for user haitham: command
cd .globus/simpleCA/
scp root@nodea:/home/haitham/.globus/usercert_request.pem .
root@nodea's password:
usercert_request.pem 100% 1346 1.3KB/s 00:00
Now use the 'grid-ca-sign' command to sign the certificate. You will need to enter the password that protects the CA (not the password for user haitham's private key): command
grid-ca-sign -in ./usercert_request.pem -out ./usercert.pem
To sign the request please enter the password for the CA key:
The new signed certificate is at: /home/globus/.globus/simpleCA//newcerts/02.pem
With the certificate signed you can go back to nodeA and grab it from nodeB: command
scp root@nodeb:/home/globus/.globus/simpleCA/usercert.pem $HOME/.globus/usercert.pem
The authenticity of host 'nodeb (172.16.2.1)' can't be established.
RSA key fingerprint is 1d:ff:38:57:0d:70:f8:5d:5e:f2:26:4b:d3:49:d9:79.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'nodeb,172.16.2.1' (RSA) to the list of known hosts.
root@nodeb's password:
usercert.pem 100% 2534 2.5KB/s 00:00
Check to make sure that user haitham's certificate is owned by user haitham and has the correct permissions: command
ls -alh .globus/usercert.pem
-rw-r--r-- 1 haitham users 2.5K Feb 23 11:39 .globus/usercert.pem
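If the ownership or permissions do not match the listing above (for example because the certificate was copied into place as root), a sketch of the usual fix, using the owner and group shown in this tutorial: command
chown haitham:users $HOME/.globus/usercert.pem
chmod 644 $HOME/.globus/usercert.pem
chmod 400 $HOME/.globus/userkey.pem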
q. Testing the Grid Services on nodeB
Now that the services are running on nodeB and there is a generic user on nodeA that has a valid certificate, you can proceed with basic testing of the services on nodeB. The first thing to do is add user haitham's credentials to the grid-mapfile on nodeB so that the services on nodeB accept the credentials and user haitham can authenticate to those services.
Use the 'grid-cert-info' command as user haitham with the '-subject' flag on nodeA to determine the full certificate name or subject for user haitham's credentials: command
grid-cert-info -subject
/O=Grid/OU=GlobusTest/OU=ps.univa.com/CN=Haitham User
Go back to nodeB and add this user to the grid-mapfile so that the cat command produces a result like the one below. Be sure to put the certificate name in double quotes as shown, and be sure to add the UNIX account name after the certificate name as shown: command
cat /usr/local/globus-4.2.1.1/etc/grid-mapfile
"/O=Grid/OU=GlobusTest/OU=ps.univa.com/CN=Haitham User" haitham
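Appending the entry from the shell works just as well as a text editor; a minimal sketch (the single quotes preserve the double quotes that must surround the certificate subject): command
echo '"/O=Grid/OU=GlobusTest/OU=ps.univa.com/CN=Haitham User" haitham' >> /usr/local/globus-4.2.1.1/etc/grid-mapfile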
Now back as user haitham on nodeA create a proxy certificate for user haitham by running 'grid-proxy-init'. When prompted enter the pass phrase that protects user haitham's private key: command
grid-proxy-init
Your identity: /O=Grid/OU=GlobusTest/OU=ps.univa.com/CN=Haitham User
Enter GRID pass phrase for this identity:
Creating proxy .................................... Done
Your proxy is valid until: Tue Feb 23 23:41:02 2010
You can use the 'grid-proxy-info' command with the '-all' flag to see the details of the proxy: command
grid-proxy-info -all
subject : /O=Grid/OU=GlobusTest/OU=ps.univa.com/CN=Haitham User/CN=1947642770
issuer : /O=Grid/OU=GlobusTest/OU=ps.univa.com/CN=Haitham User
identity : /O=Grid/OU=GlobusTest/OU=ps.univa.com/CN=Haitham User
type : Proxy draft (pre-RFC) compliant impersonation proxy
strength : 512 bits
path : /tmp/x509up_u101
timeleft : 11:59:53
The first test is to see whether user haitham can authenticate to the globus-gridftp-server running on nodeB and move a file from nodeB to nodeA: command
globus-url-copy -vb gsiftp://nodeB.isim.tn/etc/issue file:/tmp/foo
Source: gsiftp://nodeB.isim.tn/etc/
Dest: file:/tmp/
issue -> foo
cat /tmp/foo
Fedora Core release 4 (Stentz)
Kernel \r on an \m
The next thing to test is submitting a simple job to the GRAM WS running on nodeB. Note that we will only use the 'fork' jobmanager for now. We will not be using PBS: command
globusrun-ws -submit -streaming -F https://nodeB.isim.tn:8443/wsrf/services/ManagedJobFactoryService -c /usr/bin/whoami
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:9ea53924-a49c-11da-8a6e-0011d8b1eb22
Termination time: 02/24/2010 18:45 GMT
Current job state: Active
Current job state: CleanUp-Hold
haitham
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.
Note the single line of output 'haitham'. That is the output from the actual '/usr/bin/whoami' command that was run remotely on nodeB.
Try running the '/bin/date' command remotely: command
globusrun-ws -submit -streaming -F https://nodeB.isim.tn:8443/wsrf/services/ManagedJobFactoryService -c /bin/date
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:d069feea-a49c-11da-84e9-0011d8b1eb22
Termination time: 02/24/2010 18:47 GMT
Current job state: Active
Current job state: CleanUp-Hold
Thu Feb 23 12:47:19 CST 2010
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.
r. Obtaining host credentials for nodeA
We want user haitham to be able to run jobs from nodeA on nodeB that involve transferring files between nodeA and nodeB. For example, we may want to stage input files for a job from nodeA to nodeB, and when the job is completed we may want to move output files from nodeB back to nodeA.
We need to run a globus-gridftp-server on nodeA so that the RFT service can move the files automatically as part of the GRAM WS request. The easiest way to enable the server on nodeA is to run it as root (though it could be run as user haitham). In order to run the server on nodeA as root we need to have host credentials for nodeA. As root : We begin by requesting a host certificate: command
grid-cert-request -host nodeA.isim.tn -dir $GLOBUS_LOCATION/etc
Generating a 1024 bit RSA private key
...................++++++
...........................++++++
writing new private key to '/root/hostkey.pem'
You are about to be asked to enter information that will be incorporated into your certificate request. What you are about to enter is what is called a Distinguished Name or a DN. There are quite a few fields but you can leave some blank. For some fields there will be a default value; if you enter '.', the field will be left blank. command
Level 0 Organization [Grid]:
Level 0 Organizational Unit [GlobusTest]:
Name (e.g., John M. Smith) []:
A private host key and a certificate request has been generated with the subject:
/O=Grid/OU=GlobusTest/CN=host/nodeA.isim.tn
The private key is stored in /usr/local/globus-4.2.1.1/hostkey.pem
The request is stored in /usr/local/globus-4.2.1.1/hostcert_request.pem
Please e-mail the request to the Test01 [email protected]
You may use a command similar to the following:
cat /usr/local/globus-4.2.1.1/hostcert_request.pem | mail [email protected]
Only use the above if this machine can send AND receive e-mail. If not, please mail using some other method.
Your certificate will be mailed to you within two working days. If you receive no response, contact Test01 at [email protected]
Now go to nodeB and as the globus user copy the certificate request from nodeA to nodeB so that it can be signed: As globus : command
scp root@nodea:/usr/local/globus-4.2.1.1/etc/hostcert_request.pem .
root@nodea's password:
hostcert_request.pem 100% 1319 1.3KB/s 00:00
Use the 'grid-ca-sign' command to sign the host request for nodeA. When prompted enter the password for the CA: command
grid-ca-sign -in ./hostcert_request.pem -out ./hostcert.pem
To sign the request please enter the password for the CA key:
The new signed certificate is at: /home/globus/.globus/simpleCA//newcerts/03.pem
Copy the signed certificate into place back on nodeA: command
scp hostcert.pem root@nodea:/usr/local/globus-4.2.1.1/etc/hostcert.pem
root@nodea's password:
hostcert.pem 100% 2518 2.5KB/s 00:00
Back on nodeA make sure that the permissions and ownership of the files are correct: command
ls -alh /usr/local/globus-4.2.1.1/etc/host*.pem
-rw-r--r-- 1 root root 2.5K Feb 23 13:38 /usr/local/globus-4.2.1.1/etc/hostcert.pem
-r-------- 1 root root 891 Feb 23 13:33 /usr/local/globus-4.2.1.1/etc/hostkey.pem
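If the key permissions do not match the listing above, GSI tools will typically refuse to use a group- or world-readable private key; a sketch of the usual fix, run as root: command
chmod 644 /usr/local/globus-4.2.1.1/etc/hostcert.pem
chmod 400 /usr/local/globus-4.2.1.1/etc/hostkey.pem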
s. Testing globus-gridftp-server on nodeA
Since the remote grid services on nodeB will need to authenticate to the globus-gridftp-server on nodeA using the credentials for user haitham, we need to create a grid-mapfile on nodeA and put user haitham's credentials in it. You can do this in the same way that you added the credentials to the grid-mapfile on nodeB.
On nodeA, run the following command: As haitham : command
grid-cert-info -subject
/O=Grid/OU=GlobusTest/OU=ps.univa.com/CN=Haitham User
As user globus add this user to the grid-mapfile so that the cat command will produce a result like the one below. Be sure to put the certificate name in double quotes as shown, and be sure to add the UNIX account name after the certificate name as shown: command
cat /usr/local/globus-4.2.1.1/etc/grid-mapfile
"/O=Grid/OU=GlobusTest/OU=ps.univa.com/CN=Haitham User" haitham
You need to start the globus-gridftp-server on nodeA. Start by defining the GRIDMAP environment variable: As root : command
export GRIDMAP=/usr/local/globus-4.2.1.1/etc/grid-mapfile
Start the server on nodeA: command
/usr/local/globus-4.2.1.1/sbin/globus-gridftp-server -p 2811 -S
Test the server by seeing if you can move a file from nodeA to nodeB using a third-party transfer (a transfer from one server to another, i.e., using two gsiftp:// URLs). As haitham : command
globus-url-copy -vb gsiftp://nodeA.isim.tn/etc/issue gsiftp://nodeB.isim.tn/tmp/foo
Source: gsiftp://nodeA.isim.tn/etc/
Dest: gsiftp://nodeB.isim.tn/tmp/
issue -> foo
Did the transfer work? We can find out without leaving nodeA by running a job on nodeB using GRAM WS: command
globusrun-ws -submit -streaming -F https://nodeB.isim.tn:8443/wsrf/services/ManagedJobFactoryService -c /bin/cat /tmp/foo
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:9dc7a3a8-a4a5-11da-9643-0011d8b1eb22
Termination time: 02/24/2010 19:50 GMT
Current job state: Active
Current job state: CleanUp-Hold
Fedora Core release 4 (Stentz)
Kernel \r on an \m
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.
So yes, the file was transferred. We are now ready to try running a job on nodeB from nodeA involving the transfer of files from one machine to the other.
t. Testing file staging
To test a simple file staging job we will submit a job from nodeA that uses an input file from nodeA. The file will be transferred to nodeB, the job will run, and the transferred file will be used as input. When the job is completed the input file that was transferred will be cleaned up (deleted).
As user haitham on nodeA create the file 'simple-stage-job.rsl' so that it has the contents shown below: command
cat simple-stage-job.rsl
<job>
    <executable>/bin/cat</executable>
    <directory>${GLOBUS_USER_HOME}</directory>
    <argument>helloworld.txt</argument>
    <fileStageIn>
        <transfer>
            <sourceUrl>gsiftp://nodeA.isim.tn/home/haitham/helloworld.txt</sourceUrl>
            <destinationUrl>file:///${GLOBUS_USER_HOME}/helloworld.txt</destinationUrl>
        </transfer>
    </fileStageIn>
    <fileCleanUp>
        <deletion>
            <file>file:///${GLOBUS_USER_HOME}/helloworld.txt</file>
        </deletion>
    </fileCleanUp>
</job>
The job command file or 'RSL' file declares that the executable on the remote nodeB will be '/bin/cat', that it will run in the home directory of the remote user, and that the argument will be the name of the input file 'helloworld.txt'. We also see in the RSL file that helloworld.txt will be staged in by transferring it from the home directory of the user on nodeA and placing it in the home directory on nodeB. The file on nodeB will be cleaned up when the job is completed. Before running the job, create the file 'helloworld.txt' on nodeA as shown: command
cat helloworld.txt
Hello World from haitham on nodeA!
Now run the job from nodeA on nodeB and use the '-f' flag to run the job described in the file 'simple-stage-job.rsl': command
globusrun-ws -submit -streaming -F https://nodeB.isim.tn:8443/wsrf/services/ManagedJobFactoryService -f simple-stage-job.rsl
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:91b95050-a4a7-11da-bc4c-0011d8b1eb22
Termination time: 02/24/2010 20:04 GMT
Current job state: StageIn
Current job state: Active
Current job state: CleanUp-Hold
Hello World from haitham on nodeA!
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.
Notice above the output from running the 'cat' job on nodeB. You should verify on nodeB that the input file that was transferred into haitham's home directory was cleaned up after the job completed.
u. Completing Deployment on nodeC
At this point all the services are running and tested on nodeB, and we have a generic user configured on nodeA to be able to run jobs, including jobs that involve the staging of files. You should go back through this chapter and repeat the necessary steps to deploy Globus on nodeC. Again you do not need to create a new certificate authority. You only need the one available and owned by the 'globus' user on nodeB. But you will need to get host certificates for nodeC and make a copy that the container on nodeC can use. When you use the 'grid-ca-sign' command to sign the certificate that you copied over from nodeC you will need to run it with the -force option.
You should be able to run against nodeC all the tests of services in this chapter that you ran from nodeA against nodeB. Simply replace "nodeB" with "nodeC" in the URLs used on the command line for globus-url-copy and globusrun-ws.
7) Connecting Globus GRAM WS and Torque (Open PBS)
Until now we have deployed Globus on nodeB and tested the GRAM WS service, but we were only using and testing the 'fork' jobmanager; you were not able to submit jobs into the PBS batch queue. We will now connect Globus GRAM WS so that jobs can be submitted into the PBS batch queue.
a. Building the WS GRAM PBS jobmanager
Before building the WS GRAM PBS jobmanager you should stop the container on nodeB. While you could just SIGTERM it, you can also use the 'globus-stop-container' command after setting up the credentials properly for user 'globus':
command
export X509_USER_CERT=/usr/local/globus-4.2.1.1/etc/containercert.pem
export X509_USER_KEY=/usr/local/globus-4.2.1.1/etc/containerkey.pem
Create a proxy certificate: As globus : command
grid-proxy-init
Your identity: /O=Grid/OU=GlobusTest/CN=host/nodeB.isim.tn
Creating proxy ........................................... Done
Your proxy is valid until: Fri Feb 24 02:16:13 2010
Now stop the container on nodeB: command
globus-stop-container
Before building the PBS jobmanager you need to make sure that the PBS commands are in the path for the globus user: command
export PATH=/opt/pbs/bin:$PATH
which qsub
/opt/pbs/bin/qsub
which qstat
/opt/pbs/bin/qstat
which pbsnodes
/opt/pbs/bin/pbsnodes
You also need to set the PBS_HOME environment variable to point to the directory where PBS writes the server log files. If you followed the directions in this tutorial the directory is /usr/spool/PBS. When the ls /usr/spool/PBS/server_logs/ command is run there may be several log files present, with names corresponding to the dates when PBS was running; the listing shown is only an example.
command
ls /usr/spool/PBS/server_logs/
20100228
export PBS_HOME=/usr/spool/PBS
The GRAM WS PBS jobmanager is included in the GT 4.2.1.1 source so you only need to go back to the source distribution directory and build it:
command
cd gt4.0.1-all-source-installer
make gt4-gram-pbs
make install
running /usr/local/globus-4.2.1.1/setup/globus/setup-seg-pbs.pl..[ Changing to /usr/local/globus-4.2.1.1/setup/globus ]
..Done
running /opt/globus-3.0.1/setup/globus/setup-globus-scheduler-provider-pbs..[ Changing to /usr/local/globus-4.2.1.1/setup/globus ]
checking for pbsnodes... /opt/pbs/bin/pbsnodes
checking for qstat... /opt/pbs/bin/qstat
find-pbs-provider-tools: creating ./config.status
config.status: creating /usr/local/globus-4.2.1.1/libexec/globus-scheduler-provider-pbs
..Done
running /usr/local/globus-4.2.1.1/setup/globus/setup-gram-service-pbs..[ Changing to /usr/local/globus-4.2.1.1/setup/globus ]
Running /usr/local/globus-4.2.1.1/setup/globus/setup-gram-service-pbs
..Done
The last step is to configure the jobmanager so that it knows that rsh is being used (on nodeB): command
cd $GLOBUS_LOCATION/setup/globus
./setup-globus-job-manager-pbs --remote-shell=rsh
find-pbs-tools: WARNING: "Cannot locate mpiexec"
find-pbs-tools: WARNING: "Cannot locate mpirun"
checking for mpiexec... no
checking for mpirun... no
checking for qdel... /opt/pbs/bin/qdel
checking for qstat... /opt/pbs/bin/qstat
checking for qsub... /opt/pbs/bin/qsub
checking for rsh... /usr/kerberos/bin/rsh
find-pbs-tools: creating ./config.status
config.status: creating /usr/local/globus-4.2.1.1/lib/perl/Globus/GRAM/JobManager/pbs.pm
b. Testing the GRAM WS PBS jobmanager
As user globus on nodeB you should start the container again: command
/usr/local/globus-4.2.1.1/bin/globus-start-container > /home/globus/container.out 2>&1 &
As haitham : On nodeA you can test the PBS jobmanager by submitting a simple job to the PBS factory on nodeB rather than the default fork factory. If the generic user credentials for user haitham have expired, the process of requesting and signing a new set will need to be repeated, as will the process of creating a proxy certificate with the grid-proxy-init command. Note also that when the globusrun-ws command is run we do not use the streaming flag here: command
globusrun-ws -submit -F https://nodeB.isim.tn:8443/wsrf/services/ManagedJobFactoryService -Ft PBS -c /bin/sleep 120
Submitting job...Done.
Job ID: uuid:547f0fd6-a4b6-11da-ac33-0011d8b1eb22
Termination time: 02/24/2010 21:49 GMT
Current job state: Pending
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
After the job has been submitted you should be able to "watch" it run on nodeB to verify that it is running. Note that the integer in the Job id may differ from the one shown below: command
/opt/pbs/bin/qstat
Job id      Name    User      Time Use  S  Queue
14.nodeb    STDIN   haitham   0         R  batch
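Since the job sleeps for 120 seconds, you have time to observe it; a small convenience sketch that reruns qstat every few seconds using the standard watch utility: command
watch -n 5 /opt/pbs/bin/qstat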
8) Installing PBS execution hosts (Linux)
a. Introducing Torque to the Worker Nodes
As root : Now we need to tell the pbs_server on nodeB which worker nodes are available and will be running pbs_mom. We do this by creating the file /var/spool/pbs/server_priv/nodes. With your favorite text editor, add each worker node hostname on a line by itself. If a node has more than one processor, add np=X after the hostname on the same line. Mine looks like this: command
poste12 np=2
poste13 np=2
poste14 np=2
poste15 np=2
poste16 np=2
poste17 np=2
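The same nodes file can be generated with a short shell loop instead of a text editor; a minimal sketch as root on nodeB, assuming the six hostnames shown above: command
for n in poste12 poste13 poste14 poste15 poste16 poste17; do echo "$n np=2"; done > /var/spool/pbs/server_priv/nodes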
b. Installing Torque on the Worker Nodes
Now we need to install a smaller version of Torque, called pbs_mom, on all of the worker nodes of nodeB. Move back into the directory we untarred earlier, /home/systemuser/torque-2.0.0p7. There's a handy way to create the packages for the Torque clients: command
make packages
Done.
The package files are self-extracting packages that can be copied and executed on your production machines. Use --help for options.
You'll see some new files in the directory now if you run ls. The build has created the file torque-package-mom-linux-i686.sh (note that "i686" may differ according to your architecture). Copy that file to all the worker nodes; for example, you can copy it over to a shared NFS mount (for more information on NFS see http://debianclusters.cs.uni.edu/index.php/Mounted_File_System:_NFS): command
cp torque-package-mom-linux-i686.sh /shared/usr/local/src/
Once it's on each worker node, they each need to run the script with: command
torque-package-mom-linux-i686.sh --install
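If the package sits on a shared NFS mount as suggested above, the installation can be driven from the head node with a loop; a sketch assuming root ssh access to each worker (you will be prompted for each password unless ssh keys are set up): command
for n in poste12 poste13 poste14 poste15 poste16 poste17; do ssh root@$n sh /shared/usr/local/src/torque-package-mom-linux-i686.sh --install; done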
Before we can start up pbs_mom on each of the nodes, they need to know who the server is. You can do this by creating a file /var/spool/pbs/server_name on each worker node that contains the hostname of the head node. Next, if you're using an NFS-mounted file system, you need to create a file on each of the worker nodes at /var/spool/pbs/mom_priv/config with the contents: $usecp <head-node-hostname>:<shared-path> <shared-path>
command
$usecp nodeb.isim.tn:/shared/home /shared/home
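Both per-node files can be written with simple echo commands; a minimal sketch, run as root on each worker node (the single quotes keep the shell from treating $usecp as a variable): command
echo nodeb.isim.tn > /var/spool/pbs/server_name
echo '$usecp nodeb.isim.tn:/shared/home /shared/home' > /var/spool/pbs/mom_priv/config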
c. Testing the cluster
Make sure the server monitors the pbs_moms that are running. Shut down the running server with qterm, then start the pbs_server process again: command
qterm
pbs_server
Then, to see all the available worker nodes for nodeB in the queue, run: command
pbsnodes -a
Repeat for nodeC to get a second working cluster.