A first attempt: initial steps toward determining scientific users’ requirements and appropriate security paradigms for computational grids

Bruce Beckles
Cambridge eScience Centre, Centre for Mathematical Sciences, Wilberforce Road, Cambridge CB3 0WA
+44 (0)1223 765348
[email protected]

Sacha Brostoff
Department of Computer Science, UCL, Gower Street, London WC1E 6BT
+44 (0)20 7679 7214
[email protected]

Dr Stuart Ballard
National Institute for Environmental eScience, Centre for Mathematical Sciences, Wilberforce Road, Cambridge CB3 0WA
+44 (0)1223 765669
[email protected]
ABSTRACT In this paper we analyse the results of a recent questionnaire on computational grids that was distributed to a section of the UK academic/scientific community. We evaluate the Globus Toolkit with respect to our analysis, examine its security paradigm, and then outline proposals for further exploration and resolution of the issues raised by our findings.
Categories and Subject Descriptors D.2.1 [Software Engineering]: Requirements/Specifications, H.1.2 [Models and Principles]: User/Machine Systems – human factors, software psychology, and K.6.1 [Management of Computing Information Systems]: Project and People Management – systems analysis and design
General Terms Design, Security, Human Factors.
Keywords Requirements Capture, Requirements Analysis, Grid Computing, Globus Toolkit, Distributed Computing, Security.
1. INTRODUCTION This paper presents a simple analysis of the responses received to date to a questionnaire [2] designed to obtain some idea of what users and potential users in the UK academic/scientific community require from computational grids. The analysis derives some technical requirements for computational grid infrastructure, and these requirements are compared [1] with the features of the current versions (3.0.2 and 2.4.3) of the Globus Toolkit [6], a widespread “grid middleware” toolkit which is heavily
utilised in building computational grids in the UK and European academic communities. In addition, results of the analysis relevant to security paradigms for computational grids are highlighted and used to examine the security paradigm most usually espoused for grids built using the Globus Toolkit. Methods for undertaking more advanced requirements capture exercises, and more formal requirements analyses of computational grid systems and their associated middleware are described, as are methods for designing and testing relevant security paradigms. For the purposes of this paper, except where noted otherwise, a data grid is considered to be a particular type of computational grid.
2. RATIONALE Two of the authors of this paper (Bruce Beckles, Stuart Ballard) are heavily involved in different aspects of the promotion of computational grids to the UK academic/scientific community, and in providing support to users from this community in their use of computational grids. We both found that users in this community were reluctant to engage with computational grids – Stuart Ballard reports finding it extremely difficult to generate any interest in e-Science and grid computing at many of the environmental sciences conferences which he routinely attends for this purpose. When end-users did engage, or try to engage, with computational grids, they found them very difficult to use – almost invariably we would hear comments to this effect from users at workshops run to introduce them to computational grids. In addition, it seemed that the existing computational grids (those of which we are aware) did not provide many of the features that our users seemed to require, or that they considered highly desirable. For instance, most new users, upon being confronted with the UK e-Science Level 2 Grid, wanted both an extremely easy mechanism for the submission of jobs (preferably a GUI) and detailed job status reporting while their job was running. Another common comment was “How do I know that when I run my code in a grid environment the results I get are reliable?”, an entirely reasonable requirement, but one which it is by no means obvious how to satisfy in the current computational grid environments.
We therefore decided, as a preliminary measure in our attempts to better understand and formulate the problems we were observing, to design a questionnaire, in language that was deliberately kept as simple and non-technical as was feasible, that would give us some idea of how computational grids were currently perceived, and what was expected of them. Distribution of the questionnaire (in both paper and electronic formats) was within those parts of the UK academic/scientific community to which we have active links, and was deliberately as wide ranging as we could reasonably make it in the time available. We also studied the current version of the Globus Toolkit, in particular its documentation and the reports of various individual users and grid-related projects that make use of it. It became apparent that, if there was a detailed software specification to which this Toolkit adhered, then it was not one which was widely known, nor was it designed with reference to a clear specification of end-user requirements. We therefore decided to concentrate some of our efforts on developing what might be considered a first attempt at a simplistic requirements analysis.
3. METHODOLOGY 3.1 Questionnaire Design A copy of the questionnaire is included as Annex 1 to this paper, and it is also, at the time of writing, still available on-line [2]. Much of the design of the questionnaire was by one of the authors, Sacha Brostoff, who is an ergonomist with experience of assessing human-computer interfaces. Combined with the other authors’ familiarity with computational grids and the UK academic and e-Science communities, we were able to produce a questionnaire which we believe should give reasonably useful results. It must be pointed out, however, that some of the “requirements” which we have asked respondents to rate are not operational requirements, and so should simply be regarded as indicative of areas of concern to the respondent. As is customary, we conducted a trial of the questionnaire on a small, but reasonably diverse, sample of individuals from our target community, and then modified it in light of their comments and the trial results. This trial was entirely conducted with paper questionnaires. Ideally we would have liked to trial the redesigned questionnaire as well, but regrettably time constraints meant this was not possible. In particular, we would like to have trialled the design of the on-line questionnaire (which was taken from the re-designed paper questionnaire). The questionnaire was designed to be applicable to experienced users of computational grids as well as to potential users, or users with little experience, of them. In part this was achieved by using language which was as simple and non-technical as possible, and by deliberately avoiding detailed definitions or in-depth preambles. Additionally we split the questionnaire into sections, with only one (clearly defined) section for those who had actually used a computational grid. 
We also needed to be able to assess the respondent’s knowledge of computational grids, their conceptual understanding of computational grids, and their general level of technical knowledge/skill. We therefore not only asked respondents simple yes/no questions on these matters, but also asked them to define two key concepts: a computational grid, and the Globus Toolkit. The questionnaire was anonymous (and respondents had the option of returning it via methods which would not reveal their identity), to encourage honest answers from respondents who might otherwise modify their answers to make themselves appear more socially desirable, or who might not otherwise respond at all [9]. Respondents could give us their names and contact details if they wished to help further, for example by being interviewed. Any contact details given by respondents will not be revealed to anyone other than the authors.
3.2 Questionnaire Distribution Our questionnaire was produced in the following formats:
• On-line questionnaire hosted on the Cambridge eScience Centre’s WWW site,
• Electronic documents: Plain text and Microsoft Word 2000 formats, and
• Printed paper questionnaires.
We adopted the strategy, where possible, of: making the questionnaire visually appealing, addressing questionnaires to individual respondents by name, and having persons of high status among the respondents’ professional communities endorse the questionnaire, as such techniques are known to improve response rates. We sent e-mails asking people to fill in the questionnaire to everyone on the contact lists of the Cambridge eScience Centre (CeSC) and the National Institute for Environmental eScience Centre (NIEeS). Similar e-mails were sent to the mailing list of the myGrid project, to the team leaders at the European Bioinformatics Institute, posted on the sci.bio.systematics, comp.ai.neural-nets, and comp.ai.genetic Usenet groups, and sent to individual members of the UK academic/scientific community known to the authors. The Principal Investigators of the e-Science projects connected to CeSC were also asked to pass on the e-mail request to the staff on their projects. Paper questionnaires were distributed at several workshops run by NIEeS in Cambridge. The questionnaire was also advertised on the front page of the NIEeS and CeSC WWW sites. Details of the estimated numbers of people who were contacted by these various methods are given in Appendix 1, Section 0. It must be borne in mind that there was a certain amount of overlap between those who received the questionnaire by each of these methods of distribution, so the numbers given in this table must not simply be added together. The first e-mail sent out about this questionnaire, to a subset of the NIEeS contact list, contained the plain text version of the questionnaire, as this format was predicted to be more in keeping with the e-Science community’s values, in contrast to more visually appealing but proprietary document formats (though a URL for the Microsoft Word 2000 version was also included). 
This plain text version was dropped from later e-mails, which instead referred the recipient to the on-line questionnaire (which contained hyperlinks to the aforementioned documents, also stored on the CeSC WWW site).
3.3 Statistical Analysis
Table 3.1. Rating Scale for Section 1, Questions 2 and 3
Each questionnaire that was returned was input into a Microsoft Excel spreadsheet, and then the simple statistical functions of Excel were used to collate and process this data.
Partly because we have not, as yet, got as many responses as we would like (more questionnaires are still being returned even as this paper is being written), and partly because we feel that more work is needed on the questionnaire design, we suggest that our results should largely be analysed qualitatively rather than quantitatively. Although some quantitative analysis is necessary to make sense of the data we have collected, we consider this analysis to be very preliminary. Moreover, due to the lack of systematic sampling during questionnaire distribution, and the very low numbers of responses, we cannot be confident that our respondents are statistically representative of the population whose requirements we are attempting to collect [12] (though we consider our results better than an alternative in which no surveys of this kind have been attempted). We have therefore chosen not to perform any sophisticated statistical tests or analyses on the data, for the large part concerning ourselves with totals, simple proportions, mean scores and rankings. Although these may not always be the most appropriate statistics for these types of data, and for the analysis one would ideally wish to perform upon them, we believe that they provide enough of an insight into the collected responses for our purposes. Despite the simplicity of our analysis we feel that our results are both interesting and, in our opinion, relevant to those concerned with the design and implementation of computational grids in an academic environment.
3.3.1 Coding of Questionnaire Data For the purposes of analysis in Microsoft Excel, the raw data from the questionnaires was coded as follows:
• Questions with YES / NO / DON’T KNOW responses were coded as “1”, “0” or “\” respectively.
• For questions where respondents were asked to select from a list of options, each option was entered into Excel with a “1” if it was selected, or “0” if it wasn’t. (This includes questions where respondents were to select exactly one, or at most one, option.)
• Questions where the respondent was free to give any text as a response (this also includes questions where the ‘text’ should actually have been a numeric value) were coded verbatim as given. Some of these responses were later categorised, as described below.
• Questions where the respondent was asked to rank some statement on a 7-point scale were coded as an integer from 0 to 6.
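The coding rules above can be illustrated with a minimal Python sketch. The original coding was carried out in Microsoft Excel; the function names below are hypothetical, and only the code values (“1”, “0”, “\”) and the 0 to 6 range are taken from the text.

```python
# Illustrative sketch only: the paper's coding was done in Excel.
# Function names are hypothetical; the code values come from the text.
YES_NO_CODES = {"YES": "1", "NO": "0", "DON'T KNOW": "\\"}


def code_yes_no(answer):
    """Code a YES / NO / DON'T KNOW answer as "1", "0" or a backslash."""
    # Normalise case, surrounding spaces and a typographic apostrophe.
    key = answer.strip().upper().replace("\u2019", "'")
    return YES_NO_CODES[key]


def code_options(all_options, selected):
    """Code every option as 1 if the respondent selected it, else 0."""
    chosen = set(selected)
    return {option: (1 if option in chosen else 0) for option in all_options}


def code_free_text(answer):
    """Free-text (including numeric) answers are kept verbatim."""
    return answer


def code_scale(score):
    """Accept a 7-point scale rating only if it is an integer from 0 to 6."""
    if not isinstance(score, int) or not 0 <= score <= 6:
        raise ValueError(f"scale score must be an integer from 0 to 6: {score!r}")
    return score
```

Keeping one small function per question type mirrors the four bullet points, so each rule can be checked in isolation.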
The ‘free text’ responses for Section 1, Questions 2 and 3 were categorised on a 7-point scale (Table 3.1) as follows:
Score   Category
0       completely right
1       mostly right
2       more right than wrong
3       about equally right and wrong
4       more wrong than right
5       mostly wrong
6       completely wrong
Each free text response was categorised by only one judge. However, the same judge was used in all cases to maintain reliability across judgements. In Section 3, Question 1, for items ce and cf respondents were asked to give a list of operating systems of interest to them. For the purposes of our analysis, three categories of operating system were recognised: Windows OS, UNIX/Linux (including the BSDs and MacOS X) and Other OS (this last category included MacOS Classic). The list of operating systems given by a respondent was then coded as one or more of these categories (as many categories as applicable).
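The operating system categorisation described above can be sketched in Python as follows. The three category names and the treatment of the BSDs, MacOS X and MacOS Classic come from the text; the keyword matching is an assumption made purely for illustration, since the original categorisation was done by hand.

```python
# Illustrative sketch only: keyword matching is an assumption, the
# categories themselves are those named in the text.
UNIX_KEYWORDS = ("unix", "linux", "bsd", "macosx")


def categorise_one(os_name):
    """Map a single operating system name to one of the three categories."""
    key = os_name.lower().replace(" ", "")
    if "windows" in key:
        return "Windows OS"
    if any(keyword in key for keyword in UNIX_KEYWORDS):
        return "UNIX/Linux"
    # Everything else, including MacOS Classic, falls into "Other OS".
    return "Other OS"


def categorise_os(os_names):
    """Code a respondent's list as the sorted set of applicable categories."""
    return sorted({categorise_one(name) for name in os_names})
```

For example, `categorise_os(["Windows XP", "FreeBSD", "Mac OS X"])` yields the two categories “UNIX/Linux” and “Windows OS”, since a respondent's list is coded as every category that applies rather than one category per system.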
3.3.2 Analysis of Questionnaire Data As mentioned earlier, we have opted for as simple a statistical analysis as possible. For most questions which do not have a 7-point ranked scale attached to them we have simply counted the number of respondents who gave a particular response and worked out the proportion of the total number of respondents this represents. We have even done this for some of the questions with a 7-point ranked scale. For the items (the “requirements”) in Section 3, Question 1, we have worked out the mean score on the 7-point ranked scale and used this to rank these items. As a sanity check we have also provided the maximum, minimum, mode and median scores (except for items ce and cf). It is possibly worth noting that the mode and median scores – which should be more representative for this sort of data – were, in general, reasonably close to the mean score. It is important to note that where a respondent has not scored an item we have not taken this to be a score of zero, so that the average scores of different items may be based on different numbers of respondents. We found that there were a number of common problems with the completed questionnaires, which will be detailed later, but it is worth mentioning here that a number of respondents did not properly follow the instructions given and so we had to sanitise some of the data. For instance, 17 respondents filled in some of Section 2, Questions 2 to 16 (about their experiences of using computational grids), despite having answered “NO” to Section 2, Question 1 (which asked if they had used one or not). It was therefore important not to assume, in our counts of answers to these questions, that they were all from respondents who had used a computational grid. We are aware that there are not insignificant issues with a statistical analysis as simplistic as that described above, which is why we feel it is important not to make any substantive quantitative claims in this paper. 
As previously noted we shall proceed, as far as possible, along qualitative lines.
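The treatment of unscored items in the analysis above (an omitted score is excluded, not counted as zero) can be made concrete with a short Python sketch. The original calculations were performed in Excel; the function names and the use of `None` for a missing score are assumptions made for illustration.

```python
from statistics import mean, median, multimode


def summarise_item(scores):
    """Summarise one item's 0-6 scores, ignoring respondents who gave none.

    A missing score is represented by None (an assumption of this sketch)
    and is left out of the calculation rather than treated as zero.
    """
    given = [score for score in scores if score is not None]
    if not given:
        return None
    return {
        "n": len(given),  # number of respondents who scored the item
        "mean": mean(given),
        "median": median(given),
        "modes": multimode(given),
        "min": min(given),
        "max": max(given),
    }


def rank_items(responses):
    """Return item labels ordered by descending mean score."""
    summaries = {item: summarise_item(scores) for item, scores in responses.items()}
    scored = {item: s for item, s in summaries.items() if s is not None}
    return sorted(scored, key=lambda item: scored[item]["mean"], reverse=True)
```

Because each summary carries its own `n`, it remains visible that the mean scores of different items may rest on different numbers of respondents.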
4. ANALYSIS OF RESULTS 4.1 Quantitative Results Despite our stated approach to our analysis of our collected data, clearly some quantitative results are necessary, and these are given
in Appendix 1. In the subsequent sections of this paper, we give excerpts of these results where relevant.
4.1.1 Response Rate We estimate that of the order of 1,000 individuals in the UK academic/scientific community received either our questionnaire directly or a request to complete it (including instructions on how to obtain it). At the time of writing this gives us an estimated response rate of about 8% to 8.5%. We are still receiving questionnaires, so we expect this response rate to rise somewhat in the near future.
A small number of the responses alluded to above were not completed questionnaires, but rather recipients explaining why they felt it was inappropriate for them to fill in our questionnaire. Typical reasons were:
• They knew nothing about grid computing and therefore felt they would be unable to complete the questionnaire.
• They felt that someone else in their organisation/project, who knew more about grid computing, and/or had a greater degree of technical knowledge, would be a more appropriate respondent.
• They did not regard themselves as a potential user of computational grids or grid technology.
• They did not regard themselves as a member of the academic community.
There are also indications (respondents’ or prospective respondents’ comments) that some recipients felt that:
• The questionnaire was too difficult to fill in.
• Two of the first questions, which asked the respondent to define relevant terms, were perceived as ‘challenging’ or ‘testing’ them in a way that was unwelcome.
• The questionnaire was directed at end-users or potential end-users with little or no experience of computational grids and so it was inappropriate for someone with the recipient’s experience / technical knowledge / role to complete it.
4.2 Respondents’ Perceptions of e-Science and Computational Grids The data collected from Section 1 of the questionnaire (which asked about respondents’ perceptions of computational grids) is summarised here using bar charts derived from the tables in Appendix 1, Section 1 (which contains tabulations of all the responses to this section).
[Chart 4.1. Awareness of e-Science and related concepts. Bar chart (0% to 100%) of responses to Section 1, Question 1, “Have you heard of any of the following?”: the UK e-Science programme; Grid Computing or grid technology; Data Grids; Computational Grids; the Globus Toolkit (or the associated Globus Project or Globus Alliance); digital certificates or X.509 certificates.]
[Chart 4.2. Relevance of grids (responses to Section 1, Questions 4 and 5). Bar chart of “yes”, “no” and “don't know” responses to “Are computational grids relevant to your work?” and “Are data grids relevant to your work?”.]
[Chart 4.3. Understanding of key concepts (responses to Section 1, Questions 2 and 3). Bar chart of the number of respondents whose answers to “What is a computational grid?” and “What is the Globus Toolkit used for?” were scored in each category from “completely right” to “completely wrong”.]
It will be seen that a very large proportion of our respondents are aware of the UK e-Science programme and have heard of computational grids, whilst many have heard of data grids, the Globus Toolkit or digital certificates. More than half of our respondents felt that computational grids and data grids were relevant to their work. This suggests that our respondents were largely being drawn from those recipients who had previously heard of computational grids and (though this is less the case) who felt that grids were relevant to their work. This, in turn, offers the possibility that our respondents are not wholly unrepresentative of the probable userbase of the current and emerging computational grids in the UK. Another result worth highlighting here is the relatively low incidence of familiarity with digital certificates amongst our respondents. The significance of this is explored in Section 6 where we discuss the security paradigm currently in use on most of the computational grids that are based on the Globus Toolkit. Given the reasonably large number of respondents who expressed familiarity with the terms “grid computing” or “grid technology”, more respondents than one might have hoped appear to have been unclear about the intended use of the Globus Toolkit and the definition of a
computational grid – in the case of such respondents this possibly indicates some confusion regarding grid-related concepts in general.
4.3 Background of Respondents Tabulation of the data collected from Section 4 (which asked about the background of respondents) is given in Appendix 1, Section 4, and the bar chart below is based on that data. If we examine the responses to Section 4, Question 4 (Job Title) we see that we have drawn respondents from a wide range of positions, such as “student”, “Senior Lecturer”, “Database Manager” and “Research Scientist”. Similarly if we examine the responses to Section 4, Question 5 (Research Area / Interest) then we see that the research areas from which our respondents are drawn are also quite wide ranging, including “Grid middleware” and many areas in the hard sciences (e.g. “Particle Physics”, “Computational Chemistry”). This further supports the idea that our sample, though small, is not as unrepresentative or biased as might have been the case.
[Chart 4.4. Job requirements (Section 4, Question 3). Bar chart (0% to 70%) of responses to “Does your work require any of the following?”: High bandwidth or high throughput data transfer; Remote job submission; Job scheduling.]
If we examine the responses to Section 4, Question 3 we see that, as part of their job, a high proportion of our respondents require high bandwidth (or high throughput) data transfer facilities, a somewhat smaller number require remote job submission capabilities, and a large minority require job scheduling facilities. As these are all areas which computational grids are generally believed to address, this further suggests that the responses of our sample will indeed be useful in illuminating the requirements which scientific users will have for computational grids.
4.4 Analysis of “Requirements” As discussed in Section 3.3.2, we have calculated the mean score of each item from Section 3, Question 1 of the questionnaire. The results of these calculations are given in Appendix 1, Section 3. Respondents were asked to rate each of the items in Section 3, Question 1 on a 7-point scale (from “Of Vital Importance” to “Of No Importance”), to which we assigned a score, specified in the table below:
Table 4.1. Score for Ratings for Section 3, Question 1
Rating                      Score
Of Vital Importance         6
Of Very Great Importance    5
Of Great Importance         4
Of Some Importance          3
Of Little Importance        2
Of Very Little Importance   1
Of No Importance            0
4.4.1 Categorisation of “Requirements” We classified the “requirements” from the questionnaire as:
• “Essential”: These are attributes which, should a computational grid not possess them, will render it either completely unusable to, or so unsuitable that there is an extremely strong disincentive for, the prospective user.
• “Desirable”: These are attributes which, should a computational grid not possess them, will render it unattractive, but not unusable, to the prospective user. (Alternatively, they are attributes which, should a computational grid possess them, will encourage a prospective user to use that grid.)
• “Unwanted”: These are attributes which have little or no effect on how useful a user finds a computational grid, or on how likely a prospective user is to use that grid.
As a rough guide, it seemed sensible to classify items with a mean score of greater than 4 as “Essential”, those with a mean score between 3 and 4 as “Desirable”, and those whose mean score was lower than 2 as “Unwanted”. When we examined the mean scores we discovered that the top 10 ranked items had a mean score of 4.3529 or higher, so we decided to make a mean score of 4.35 the boundary between “Essential” and “Desirable”. When we looked at those items whose mean score was less than 2.0000, we discovered that the item in this range with the greatest mean score had a score of 1.3134, so we decided to make a mean score of 1.50 the upper boundary of “Unwanted”. Given the limitations of our sample, it seemed sensible not to work to more than 2 decimal places. This gives the following categorisation (which excludes 13 of our requirements from classification, those whose mean score lies between 1.50 and 3.00):
Table 4.2. Categorisation of mean score
Mean Score                  Categorisation
Mean Score ≥ 4.35           Essential
3.00 ≤ Mean Score < 4.35    Desirable
Mean Score ≤ 1.50           Unwanted
It is important to realise that we have not experimentally validated either this classification of user requirements, or the categorisation of the “requirements” from our questionnaire. Thus these categories, and the categorisation of the “requirements” from our questionnaire, should be regarded as necessarily imprecise, and as indicative rather than prescriptive.
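The categorisation by mean score can be expressed as a small Python function. This is a sketch under stated assumptions: the thresholds (4.35, 3.00 and 1.50) are those given in the text, the “Desirable” band is read as running from 3.00 up to (but not including) 4.35, and rounding to 2 decimal places before comparison is our reading of the remark about decimal places. The function name is hypothetical.

```python
# Thresholds taken from the text; everything else is illustrative.
ESSENTIAL_MIN = 4.35
DESIRABLE_MIN = 3.00
UNWANTED_MAX = 1.50


def categorise(mean_score):
    """Return the category for a mean score, or None if it is unclassified."""
    score = round(mean_score, 2)  # assumption: compare at 2 decimal places
    if score >= ESSENTIAL_MIN:
        return "Essential"
    if score >= DESIRABLE_MIN:
        return "Desirable"
    if score <= UNWANTED_MAX:
        return "Unwanted"
    # Scores strictly between 1.50 and 3.00 are deliberately left unclassified.
    return None
```

Returning `None` for the middle band keeps the unclassified requirements distinct from the “Unwanted” ones instead of silently merging the two.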
4.4.2 “Essential” Requirements: Under the above scheme, we determined the following “requirements” as “Essential” (here given in descending order of mean score), listed below. For each requirement we have given the related “technical requirements”, i.e. the general technical aspects / features of a system that would be concerned with fulfilling the stated “requirement”. Note that this determination is one which has been made, for the most part, by one of the authors (Bruce Beckles), based largely on his knowledge of computing system design and administration – it has not been validated for these “requirements”, and should be considered merely indicative.
• Warn me before I accrue unwonted expense
  Technical Requirement(s): Auditing / Accounting
• Not make my workstation crash more than once per day extra
  Technical Requirement(s): Stability
• Get results that would not be feasible using the limited computational resources you currently have available
  Technical Requirement(s): Enabling Distributed Computing / Remote Job Management / Increase Productivity
• Access to technical support of the following kinds: Help pages on grid-related websites
  Technical Requirement(s): Technical Support / Documentation
• Help you to prove to yourself that calculations performed on the grid for you are free from error
  Technical Requirement(s): Verification / Auditing / Reliability
• Give me assurance that my program code is not available to other grid users, if I so choose
  Technical Requirement(s): Security
• Help you to prove to other people that calculations performed on the grid for you are free from error
  Technical Requirement(s): Verification / Auditing / Reliability
• Access to technical support of the following kinds: In-depth support
  Technical Requirement(s): Technical Support
• Enable me to track the status of progress of my computations on the grid
  Technical Requirement(s): Status Reporting
• Give me assurance that my data is not available to other grid users, if I so choose
  Technical Requirement(s): Security
• Give you assurances that calculations performed on the grid for you are free from error
  Technical Requirement(s): Reliability / Verification
We have carried out a reasonably detailed comparison of the features offered by the Globus Toolkit (versions 2.4.3 and 3.0.2) for the GlobusWORLD 2004 conference [1], so in this paper we shall simply summarise the results of this comparison (see Section 5).
4.4.3 “Desirable” Requirements: The following “requirements” were classified as “Desirable” (given in descending order of mean score):
• Access to technical support of the following kinds: On-demand, fast response support
• Ability to handle extremely large quantities of data
• Ability to cancel your computation at any point
• Ability to do computation on the grid without donating your spare computing resources to the grid for other people to use
• Access to technical support of the following kinds: Casual (e.g. ‘best effort’, ‘as time permits’) support
• Ability for the grid to take over work that has already begun on conventional computational resources
• Allow you to use computational grids from your computers running the following operating systems: Windows OS
• Ability to efficiently transfer large quantities of data
• Ability to retrieve your data from the grid at any time
• Local support
• Ability to authenticate yourself to all grid resources in a single “sign-on” (as opposed to individually authenticating to each resource as that resource is used)
• Ability to track the progress of your computation
• Ability to control when computational resources you have donated to the grid are available and unavailable to it
• Allow you to use computational grids from your computers running the following operating systems: UNIX/Linux
• Get the same results more quickly
• Access to technical support of the following kinds: User community (e.g. mailing lists, etc.) support
• Work with your unmodified, existing preferred scientific software, even if it means you do not receive the best possible performance from the grid
• Ability to control access to computational resources which I donate to the grid
• Clear indication of the consequences of submitting your computation to the grid
• Get more results in the same time
• Ability to transfer your data to the grid at any time
• Prevent spare computational resources that you’ve donated to the grid from being used for computation that you have a moral objection to
• Ability to view the intermediate results of your computation
• Ability of the grid job control/submission subsystem to automatically maximise my benefits whilst minimising my costs
• Get a higher quality of result in the same time
• Less effort to gain access to supercomputer level resources
• Allow you to donate the spare capacity of your computers running the following operating systems: UNIX/Linux
• Less time to get access to supercomputer level resources
• Ability to change the priority of your computation
• Allow you to prevent commercial computation from being carried out on spare capacity that you have donated to the grid
• Allow use of computational grid client software without changing its default settings (apart from authentication requirements e.g. username/password)
• Ability to donate your own spare computational resources to the grid
• Allow you to donate the spare capacity of your old equipment
• Get the same results more cheaply
(The related “technical requirements” for each of these items are given in Appendix 1, Section 3.) We do not intend to discuss these “desirable” requirements in detail here. It should be pointed out, however, that the scores of those requirements concerned with particular operating systems (items ce and cf) are particularly suspect, since we required our respondents to specify their operating system(s) of interest and then rate them. This makes it more likely that they would not respond to these items. Additionally, some respondents might not think about operating system availability unless particular operating systems were suggested to them. We also discovered that 12 respondents listed one or more operating systems but did not rate them, and 7 respondents rated one or both of items ce and cf but without listing any operating systems for them.
4.4.4 “Unwanted” Requirements: The following “requirements” were classified as “Unwanted” (given in descending order of mean score):
•
Give me someone to sue if things went wrong
•
Access to technical support of the following kinds: No support (None)
•
Allow you to donate the spare capacity of your computers running the following operating systems: Other OS
•
Allow you to use computational grids from your computers running the following operating systems: Other OS
(The related “technical requirements” for each of these items are given in Appendix 1, Section 3.) The item “Access to technical support of the following kinds: No support (None)” (item bm) could probably have been better phrased (indeed, some respondents did express confusion at being asked to rate it) or possibly omitted altogether; the fact that it got such a low mean score (0.5500) can plausibly be interpreted as an (at least) “desirable” requirement that there is always some nontrivial level of technical support available within a reasonable time frame.
4.5 Derived Requirements Based principally on the “essential requirements” listed in section 4.4.2, we have derived some more detailed technical requirements which we believe are not unreasonable abstractions of the “requirements” given in Section 3, Question 1 of the questionnaire. These requirements are not intended to be operational requirements, but will be expressed in more general terms. Once again, it must be noted that no validation has been carried out on our derivation process (briefly mentioned in section 4.4.2). Below we list these derived requirements with some brief comments/explanatory notes.
4.5.1 Resource Accounting: Given the distributed nature of computational grids, combined with the prevalence of a paradigm for “the Grid” which stresses that the workings of grids should be hidden from the end-user as much as possible (ideally entirely), it is unsurprising that end-users are concerned with “real” (i.e. probably financial) costs which may also, in consequence, be disguised from the user. This places a very strong requirement on the grid infrastructure and middleware: that it be possible to reliably account for all chargeable items that may possibly be used in a grid, be it network bandwidth, storage services, data retrieval services, CPU resources, technical support services, etc. Whilst many multi-user operating systems implement some sort of resource accounting, this is typically confined to the local system, or at best, a well-defined local cluster of systems. Accounting of network bandwidth is, in general, poorly implemented (if at all). In any case, a functioning grid must be able to account for all the resources used by a particular user, which, as that user’s local identity may vary between each machine in that grid, and may even not be unique on some machines, requires that there is some general accounting mechanism at the grid middleware level.
Whilst at present most of the grids used in academia are free of financial charge, we can be certain that this will not continue to be the case; the increasing incorporation of HPC facilities into computational grids alone ensures this.
4.5.2 Auditing: Whilst resource accounting is concerned with recording how much of a particular resource was used, when, and by whom, auditing is concerned with recording the purpose for which the resource was used, the method of access, and so on. To be able to assure end-users that a computational grid is secure and reliable, and that any expense they have accrued on the grid was genuinely generated by them, it is essential that a reliable auditing process is in place. Some system analysts consider this to be part of the accounting process, but, in a grid environment, the “accounting process” becomes so complex that it is probably sensible to separate these two closely related processes.
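To make the distinction between accounting and auditing concrete, the following is a minimal sketch in Python of how usage might be accounted at the grid level when a user's local identity differs from machine to machine. It is an illustration under stated assumptions only: the record types, field names and the identity map are hypothetical and do not correspond to any existing middleware or to anything in the questionnaire.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    """Accounting entry (hypothetical): how much of a resource a local account used."""
    node: str         # machine in the grid that reported the usage
    local_user: str   # account name on that machine (may differ on every node)
    resource: str     # e.g. "cpu_hours", "storage_gb", "network_gb"
    amount: float


@dataclass(frozen=True)
class AuditRecord:
    """Audit entry (hypothetical): how and why a resource was accessed, not how much."""
    node: str
    local_user: str
    access_method: str
    purpose: str


def account_by_grid_identity(records, identity_map):
    """Sum usage per (grid identity, resource).

    identity_map maps (node, local_user) to a single grid-wide identity.
    Records that cannot be resolved are returned separately, not dropped.
    """
    totals = defaultdict(float)
    unresolved = []
    for rec in records:
        grid_id = identity_map.get((rec.node, rec.local_user))
        if grid_id is None:
            unresolved.append(rec)
            continue
        totals[(grid_id, rec.resource)] += rec.amount
    return dict(totals), unresolved


def audit_trail(audits, identity_map, grid_id):
    """Return the audit entries that belong to one grid-wide identity."""
    return [a for a in audits if identity_map.get((a.node, a.local_user)) == grid_id]
```

Keeping the two record types separate mirrors the suggestion above that accounting and auditing be treated as distinct processes; the explicit list of unresolved records reflects the concern that a local identity may not be unique, or may not be known, on some machines.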
4.5.3 Stability: In this context we are referring to the stability of individual machines, either the individual nodes in a computational grid, or the individual machines used to submit jobs to that grid. The nature of grids is such that, for many end-users, it is entirely reasonable to expect that, as almost all the processing of their job will be done remotely, the load that job submission, monitoring and result retrieval place on their individual machine should be very low – it certainly shouldn’t make the end-user’s machine significantly less stable than it previously was. So we are principally concerned with the stability of the ‘client software’ needed to interface with a computational grid.
4.5.4 Security: We will discuss security concerns in a grid environment in more depth in Section 6, but it is worth noting here that there are many aspects to security in computing environments, and that, in a grid environment, if one is concerned about the security of one’s data and/or code, then much more is involved than simply authentication. (Authentication is the principal security concern of middleware such as the Globus Toolkit.)
4.5.5 Reliability: By ‘reliability’ we mean something analogous to ‘reproducibility’ in the experimental sciences. In order to use any computing system for scientific simulations, etc., the user must be convinced that the system will behave consistently. This is particularly difficult to guarantee in a grid environment, where the user not only typically has little or no control over the environment in which their code is executing, but may not even be able to ascertain much detail about that environment. There is not even any guarantee that the user will use the same node in a grid for successive jobs.
4.5.6 Verification: As has already been mentioned several times earlier in this section, the grid environment is one in which the end-user has very little control over that environment, and possibly very little knowledge of the details of that environment. In such a scenario, for them to have any confidence in results obtained in a grid environment, it is clearly necessary that there are robust verification procedures in place, which will verify data transmission, job execution, etc., since the user will often have no means of doing this themselves.
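One small part of such verification, checking that data arrived unchanged, can be sketched as follows. This is a hedged illustration rather than a description of any existing grid middleware: the function names are hypothetical, SHA-256 is simply one reasonable choice of digest, and the sketch assumes the expected digest reaches the recipient by a trustworthy route. It says nothing about whether a job was executed correctly.

```python
import hashlib
from typing import Iterable


def digest_of_chunks(chunks: Iterable[bytes]) -> str:
    """Compute a SHA-256 digest over data that arrives in pieces."""
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(chunk)
    return h.hexdigest()


def transfer_is_intact(received_chunks: Iterable[bytes], expected_digest: str) -> bool:
    """Return True only if the received bytes match the digest published by the sender."""
    return digest_of_chunks(received_chunks) == expected_digest
```

The digest depends only on the bytes, not on how they were split into chunks, so the sender and the recipient do not need to agree on a chunk size.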
4.5.7 Distributed Computing: Computational grids are often described in terms of “distributed computing”, but current computational grids are often much more like distributed batch processing systems. It is clear, however, that there is a demand amongst some users for performance which is of a higher order of magnitude than what they can achieve on their local resources. Unless they are fortunate enough to be using a computational grid which contains HPC resources (to which they have access) they will need a grid infrastructure which enables them to perform “true” distributed computation.
4.5.8 Remote Job Submission: Whilst this is a core component of most computational grids, it is one about which we have received numerous user complaints, as alluded to in Section 2. This suggests that its design and the underlying methodology should probably be (re-)examined with end-users’ requirements explicitly in mind.
4.5.9 Productivity: It is well known that, generally speaking, users need to feel that there will, in a reasonable space of time, be a tangible benefit to any new system they are being asked to use. This is particularly the case with computational grids as the technology is so new that it is still largely unproven. A clear benefit to almost every class of user is a system that increases overall productivity. If, in addition, that system can deliver an increase of productivity by enabling them to perform tasks that were previously unfeasible, then there is a strong incentive for users to make the effort required to master it.
4.5.10 Technical Support: Since computational grids represent extremely complex interactions between discrete systems which are themselves complex, it is clear that they are at least an order of complexity greater than the discrete systems that most users are familiar with. This suggests that there is a requirement for high-level technical support, often involving close liaison with multiple levels of technical personnel in a variety of organisations. This should be borne in mind when considering the resources needed to implement and run a computational grid.
4.5.11 Documentation: As computational grids have, at least up to this point, evolved in an ‘ad hoc’ manner, making use of whatever existing software tools seemed appropriate at the time, with an emphasis on achieving tangible results relatively quickly, it is unsurprising that the level of documentation for most grid middleware is inadequate. Unfortunately, as has already been mentioned, grids are extremely complex systems, and this means that there is an overwhelming need for documentation which is, at a minimum, both accurate and complete.
4.5.12 Status Reporting: This is another key component of most computational grids which tends to be overlooked and/or poorly implemented. As mentioned in Section 2, many new users of computational grids find the lack of clear, informative status reports extremely frustrating, so much so that, on more than one occasion we have seen users abandon their attempts to use a computational grid.
5. THE GLOBUS TOOLKIT The Globus Toolkit [6] is an open source software toolkit, developed by the Globus Alliance, and incorporating the work of other open source projects, which is designed to provide the core middleware services needed to implement a computational grid. In the UK academic community it appears to have been accepted as the standard Toolkit for building grids; in particular it forms the core middleware of the UK e-Science Level 2 Grid. As noted in section 4.5.11, it is a project which has evolved without (at least until recently) a clear statement of what it intended to achieve and how it intended to do this. There do not seem to be any formal specification or verification documents and so it is not possible to carry out a formal requirements analysis. We thus have a situation where it is not possible to definitively say whether or not the Toolkit is fit for its intended purpose. As this situation is clearly unsatisfactory, we have made a first attempt to address this by comparing the capabilities of the current toolkit against our derived “requirements” (listed in Section 4.5). Our results [1] are being presented at the GlobusWORLD 2004 conference, and are summarised below:
•
Accounting / Auditing: Auditing is rudimentary; it has not been designed for accounting purposes.
•
Stability: The data management components of the toolkit are relatively ‘stable’, but there are some reliability concerns. The stability of the GRAM components of the toolkit is variable; it depends, in large part, on the underlying job manager. In certain configurations it is extremely unstable. The MDS components are completely different between versions 2.4 and 3.0 of the toolkit. In version 2.x of the Toolkit they are extremely unstable, particularly if large numbers of hosts or large amounts of data are involved. In version 3.0 they are reportedly more stable, but they are still too new to be certain of this.
•
Security: There is no mechanism for protecting code and data on remote resources; this is considered vital by our sample.
•
Reliability / Verification: Data transmission in the toolkit is not robust, and some users have reported it is too unreliable for their purposes. In addition, error reporting and handling is very rudimentary, which exacerbates these problems, and there is no mechanism for program code or result verification.
•
Technical Support: The experience of all the users to whom we spoke who have had experience of the technical support provided by the Globus Alliance was negative.
•
Documentation: Again, all the users to whom we spoke who have tried to use the documentation have reported it to be inadequate and/or inaccurate.
•
Remote Job Submission / Status Reporting: Our experience is that many users find it overly complex and frequently unreliable – in part due to its opaque interface with the underlying job manager(s) upon which it is dependent. Most also complain that the status reporting is inadequate.
•
Distributed computing: Programmers to whom we spoke reported that the Toolkit APIs were not well documented and that they lacked adequate mechanisms for true “distributed computing”-type applications.
Even this simplistic “analysis” – which is, admittedly, based on results derived from a small sample – indicates that there are good reasons to suppose that the Globus Toolkit, in its current form, lacks many of the capabilities required to build computational grids that the UK academic community will find adequate.
6. SECURITY IN THE GLOBUS TOOLKIT A full security analysis of the Globus Toolkit is beyond the scope of this paper, so we shall confine ourselves to a brief overview and some comments on specific points.
The Toolkit is built on the OpenSSL toolkit [8], which it uses principally for securing its communication channels and managing digital certificates. In so far as it considers the security of individual jobs / user processes, it relies on the security of the underlying operating system of the machine on which the job is running. To most users, developers, and even system administrators, its security concerns will manifest themselves in the form of authentication issues (system administrators will also be concerned with authorisation, but the Toolkit’s approach to this issue is extremely straightforward and somewhat simplistic). The Toolkit uses X.509 digital certificates for authentication. These certificates are protected either by the file protection mechanisms of the underlying operating system, or by a pass phrase, or both. An important point is that it is not possible to use the Toolkit to do anything meaningful unless you are in possession of a so-called “personal” X.509 certificate issued by a Certification Authority known to the Toolkit. In our sample, 22 respondents said they had previously used a computational grid (Section 2, Question 1). Of these, 6 said they did not have a digital certificate (Section 1, Questions 6 and 7), but that they had used the Globus Toolkit (Section 2, Question 2). Three of these cases are even more interesting:
•
One of these six also said they had not heard of digital certificates or X.509 certificates (Section 1, Question 1), and that they used the Globus Toolkit at least once a day, and had installed the Toolkit themselves (Section 2, Question 3).
•
Another respondent (of these six) said that they used the Globus Toolkit from one to four times a week.
•
And another respondent (of these six) had also installed the Globus Toolkit themselves.
The most likely explanation for those who didn’t install the Toolkit themselves is that someone else obtained a certificate on their behalf, set it up for them, but never explained what the certificate was or why it was necessary – this seems less likely for the respondent who used the toolkit on a weekly basis. It is more difficult to come up with a plausible explanation for those two who did install the Toolkit themselves. When we consider the relatively small number of total respondents who had heard of digital certificates or X.509 certificates (see Section 4.2), it appears likely that X.509 certificates are relatively poorly understood by our sample, and, indeed, actively misunderstood by some regular users of computational grids. This is in keeping with the experiences of authors Bruce Beckles and Stuart Ballard concerning users and their experiences of digital certificates. We have both found that there is a great deal of confusion on the subject of X.509 certificates amongst our respective user communities, which is exacerbated by the complex procedures for obtaining X.509 certificates from the UK eScience CA for use with the Globus Toolkit, managing these certificates, and exporting them from one format / system to another. This means, for instance, that users often end up with multiple copies of the private key in a number of electronic locations, sometimes each copy protected by a different pass phrase, and occasionally a copy of the key which is either protected by a trivial pass phrase or by none whatsoever.
It is well known that users are, in general, unlikely to use strong passwords or pass phrases (e.g. “Password Clues” (Petrie, 2002) [10]) when they are unaware of the significance of what is being protected or risks facing it, when they are already overburdened with passwords, and for a host of other reasons [15, 13]. If, in addition, it is difficult for them to change their password or pass phrase then the protection is further weakened. It is worth mentioning that various public key infrastructure technologies for end-users have been found to be difficult to use. For example, one study showed PGP cannot be used by normal end-users to encrypt e-mail, despite having an attractive graphical user interface [16]. Also, the more attack vectors available the weaker the protection. Since the use of X.509 certificates by the Globus Toolkit actively requires users to move their certificate from location to location, or to maintain copies of it in several locations, this means the opportunities for attack and for acquisition of the user’s certificate are increased. For a detailed analysis of the security implications of the use of X.509 certificates by the Globus Toolkit, see “Grid Security and its use of X.509 Certificates” (Lock and Sommerville, 2002) [7].
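The two protection mechanisms mentioned above (operating system file protection and a pass phrase) can each be checked mechanically for a given copy of a private key. The following Python sketch illustrates the idea; it is not part of the Globus Toolkit, the function name is hypothetical, and it assumes a POSIX system and PEM-encoded key files. It can detect a key stored with no pass phrase at all, but it cannot tell whether a pass phrase is trivial.

```python
import os
import stat

# Text markers that indicate a pass-phrase-protected PEM private key.
ENCRYPTED_MARKERS = (
    "-----BEGIN ENCRYPTED PRIVATE KEY-----",  # PKCS#8 encrypted key
    "Proc-Type: 4,ENCRYPTED",                 # traditional OpenSSL PEM encryption header
)


def key_file_findings(path: str) -> list:
    """Return a list of human-readable problems found with one private key copy."""
    findings = []

    # File protection: any permission bit for group or others is reported.
    mode = os.stat(path).st_mode
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        findings.append("key file is accessible by group or others")

    # Pass phrase: an unencrypted PEM private key carries no encryption marker.
    with open(path, "r", encoding="ascii", errors="replace") as fh:
        text = fh.read()
    if "PRIVATE KEY-----" in text and not any(m in text for m in ENCRYPTED_MARKERS):
        findings.append("private key is not protected by a pass phrase")

    return findings
```

Running such a check over every location in which a user keeps a copy of their key would at least reveal the unprotected copies described above, although it does nothing to reduce the number of copies.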
These observations suggest that there are good “social” reasons for regarding the current Globus Toolkit security paradigm as in need of review, even without considering the technical issues involving the Toolkit’s use of X.509 certificates (such as those discussed in [7]).
7. “A SECOND ATTEMPT…”
7.1 Proposals for Requirements Analysis and Capture We recommend that a formal requirements engineering process, centrally funded, be undertaken by a newly created official UK e-Science computational grid requirements team which has the support of the UK e-Science Directorate. As a starting point for discussion, we propose the following six-stage process:
•
Identify user groups: This would include both existing user groups as well as potential user groups (who do not yet use computational grids, but can reasonably be expected to benefit from them). This should also consider the requirements of people along the different parts of the UK academic and commercial research and grid computing chains who are likely at some point to be affected by computational grids, or deal with their output – e.g. journal editors, as well as the researchers themselves, technical support workers, intellectual property rights lawyers, HPC and other computer administrators, university and corporate finance departments. Although some of these people may not interact with computational grids directly, they may interact with the researchers or the researchers’ output, and if their requirements are not fulfilled, the researcher may be put at a disadvantage and so be less likely to use the grids.
The biggest difficulty we predict will be identifying which UK researchers who do not currently use computational grids could most benefit from them. We are as yet unsure how to approach this task. Thus far we have used our personal and professional acquaintances among the UK research community, asking them if they use significant amounts of computation in their work, and extrapolating from their answers. This could perhaps be continued on a larger scale.
•
Conduct fieldwork: Fieldwork should be conducted using established techniques, for example “contextual interviews” [3], where the analyst goes to the user’s premises for a few hours and takes the role of an apprentice, asking why things have been done, observing what is happening, and recording the user’s answers. In this way, the analyst can build up a model of the user’s work: how it flows, how long it takes, its inputs and outputs and dependencies, where it currently works well and where it does not, the departures from orthodox procedures which ‘grease the wheels’, etc. In this way a better picture can be built up of where computational grids fit into UK researchers’ work, and how they can be made most useful. As there are many established systems for designing software and associated systems it may be difficult to decide which one to use.
•
Analyse data: This would produce a set of archetypal users, and scenarios for their use of computational grids. From this the archetypal users’ requirements would be derived, which would form the basis of future design work.
The archetypal users could be illustrated by constructing “personas” for them: a fictitious individual (fleshed out with background details) who represents the archetype, and simplifies further design work by grounding it in a concrete example [9]. In effect, you design for one particular person and his or her particular needs and tasks, rather than for the more nebulous – and so difficult to achieve consensus about – concepts of users and what they need.
•
Produce design: Ideally, a design would be produced which detailed all the interface elements that a user (or ideally each persona) would interact with: screen displays and the transitions between them (in detail), the contents of error messages and other feedback (making sure to use terminology appropriate for the end-users rather than the programmers), etc. All these specifications should be documented in detail, along with the reasoning behind them, in an attempt to persuade the open source community that it is worth adhering to the design rather than using designs which have not been as well researched. Scenarios (perhaps humorous ones) would accompany the design, to illustrate how it fulfils the persona’s needs.
•
Test design: Paper prototyping [14] or other low fidelity prototyping techniques would be employed to conduct user trials/usability tests with potential users. Participants would be asked to conduct representative tasks using the prototype, and various indices of their success, failure, effort and “user experience” measured. This would avoid the expense of having to produce working software which may have to be thrown away or otherwise significantly altered. Moreover, with low fidelity prototypes the design can easily be altered ‘on-the-fly’ during testing. A relatively small number of test participants would be required, on the order of a dozen, and testing need only take 3 to 4 days in total.
•
Release design: Once the design had reached its usability targets, it could be released to the open source community for development. It should be presented as providing the community with an interesting challenge, namely, how can they overcome the technical difficulties inherent in implementing the design? It is important that it should not be perceived as an exercise in disrespecting and disempowering programmers and increasing their “drudge work”.
7.1.1 Questionnaire Design The requirements used in the current questionnaire were generated from the authors’ own experiences of academia and supporting academics. It is suggested that an improved questionnaire would ideally be based upon fieldwork with existing and potential users of the type described in the previous section. To ensure that a revised questionnaire gets representative responses, it is necessary to engage in more conventional surveying techniques, where a target population of academics is defined and systematically sampled. This depends upon our being able to define a valid population of users (which, in itself, is not a straightforward process, as described in the previous section), and identify its members so that we may choose which ones are sampled, rather than risking biases due to self-selection [12]. Well-designed surveys are difficult to prepare and expensive to conduct. If it were possible to base the questionnaire on fieldwork, then the questionnaire would take on the role of validating the fieldwork’s findings, rather than generating the requirements itself. A questionnaire survey could then be used as a complement to the user trials that we have proposed as part of the requirements process, checking that our computational grid interface design is appropriate for its intended end-users. Respondents’ comments to the current questionnaire indicate that its design could have been improved, and this is particularly true of the un-trialled on-line version of the questionnaire. It is clear that any revised questionnaire should be trialled amongst a selection of the intended sample audience, in all the different formats in which it will be distributed. If an on-line questionnaire is to be used, the advice of those with web design expertise in this area should be sought.
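The difference between choosing who is sampled and relying on self-selection can be illustrated with a short Python sketch. It assumes that the target population has already been defined and enumerated, which, as noted above, is the difficult part. Simple random sampling is used here purely as one possible scheme; the function name and the seed parameter are hypothetical.

```python
import random


def draw_simple_random_sample(population, sample_size, seed=None):
    """Choose sample_size distinct members of a defined population at random.

    Inclusion is decided by the analyst's random draw, not by who volunteers.
    """
    # De-duplicate and sort so that the same seed always gives the same sample.
    members = sorted(set(population))
    if sample_size > len(members):
        raise ValueError("sample size exceeds the defined population")
    return random.Random(seed).sample(members, sample_size)
```

Recording the seed alongside the population list would allow the draw to be repeated and checked later.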
7.2 Proposals for developing and testing Security Paradigms Developing completely new security paradigms is difficult work, as there are so many potential pitfalls from technical, organisational, social and human performance perspectives [15, 13]. It is proposed that a forum be created for researchers and practitioners of identity management and authentication, particularly those who work from a human-computer interaction or human factors engineering perspective, to promote discussion and to focus research on these issues as they apply to grid computing. In the meantime, it is proposed that a literature review be commissioned to review current perceptions of best practice in digital certificate implementation. Because security, and in particular authentication, are such important requirements, we suggest bringing together leading practitioners and researchers in a workshop to discuss a more appropriate application of existing digital certificate technologies to computational grids. These practitioners and researchers could be identified from the literature review.
7.3 Additional Suggestions We feel that it would be extremely helpful if some group, concerned with the infrastructure of computational grids in the UK, who would be prepared to undertake the task of overseeing the proposals in Sections 7.1 and 7.2, could be found (or created, if necessary). Such a group would need to be officially sanctioned in order to be successful, and this might be a role that a body such as the UK e-Science Engineering Task Force would be prepared to adopt. Since our proposals are concerned with the middleware used to build computational grids, it is clear that the group must have a strong emphasis on computational grid infrastructure. We also suggest that it may be profitable to implement some sort of top-down version control in computational grid middleware development projects, so that their adherence to appropriate requirement specifications can be assured.
8. CONCLUSIONS Our survey, even with all its weaknesses, has been a useful exercise and enabled us to better formulate the problems facing the development and implementation of usable computational grids in the UK, as well as providing valuable experience which has informed our proposals for future attempts to investigate and resolve these problems. We have shown that there are indeed grounds for believing that the current versions of the Globus Toolkit do not meet the requirements of the UK academic/scientific community. (For other perspectives on this issue, which nevertheless support this very general statement, see [4], [11].) Our analysis suggests that the current versions of the toolkit have significant shortcomings with respect to the needs of members of the UK academic/scientific community. It is our belief that these shortcomings are systemic, and thus cannot be dealt with by simply implementing more, or different, features. We suggest that a formal requirements engineering process should be undertaken to address these problems. We have also highlighted that the current security paradigm of the Globus Toolkit, which has been adopted by most implementers of computational grids, does not adequately take into account the behaviour of end-users with regard to security, and so is flawed. We suggest that a new security paradigm be developed as a matter of urgency, and that this development involve those with a human-computer interaction or human factors engineering perspective on the problems of security.
9. ACKNOWLEDGMENTS We would like to thank Kate Caldwell for her careful and close reading of this paper. Thanks are also due to Andrew Usher and Dr Proshun Sinha-Ray for their comments and advice. We would also like to thank Dr Rob Procter for allowing us the additional time necessary to complete this paper.
10. REFERENCES
[1] Beckles, B., Ballard, S., and Brostoff, S. What Do I Want?: An Analysis of Potential Grid Users’ Requirements with reference to the Globus Toolkit. Presentation to be given at the GlobusWORLD 2004 conference, 2004; copies of the presentation slides are available from the authors on request.
[2] Beckles, B., Brostoff, S., and Ballard, S. Questionnaire on Computational Grids. Cambridge eScience Centre, 2003, http://www.escience.cam.ac.uk/questionnaire/CGquestionnaire.html
[3] Beyer, H., and Holtzblatt, K. Contextual Design: Defining Customer-centered Systems. Morgan Kaufmann Publishers, London, 1998.
[4] Chin, J., and Coveney, P. V. Towards tractable toolkits for the Grid: a plea for lightweight, usable middleware. Centre for Computational Science, Department of Chemistry, University College London, 2003, http://www.realitygrid.org/lgpaper.html
[5] Cooper, A. The Inmates Are Running the Asylum: Why High-tech Products Drive Us Crazy and How to Restore the Sanity. Sams, 1999.
[6] The Globus Alliance. The Globus Toolkit. http://www-unix.globus.org/toolkit/
[7] Lock, R., and Sommerville, I. Grid Security and its use of X.509 Certificates. DIRC internal Conference submission 2002. Lancaster DIRC, Lancaster University, 2002, http://www.comp.lancs.ac.uk/computing/research/cseg/projects/dirc/papers/gridpaper.pdf
[8] The OpenSSL Project. The OpenSSL toolkit. http://www.openssl.org/
[9] Oppenheim, A. N. Questionnaire Design, Interviewing and Attitude Measurement. Continuum International Publishing Group - Academic and Professional, 2000.
[10] Petrie, H. Password Clues. CentralNic, 2002, http://www.centralnic.com/page.php?cid=77
[11] Rixon, G. Problems with Globus Toolkit 3 and some possible solutions. Institute of Astronomy, University of Cambridge, 2003, http://wiki.astrogrid.org/bin/view/Astrogrid/GlobusToolkit3Problems
[12] Rosenthal, R., and Rosnow, R. The Essentials of Behavioural Research (second edition). McGraw Hill Book Company, Singapore, 1991.
[13] Sasse, A., Brostoff, S., and Weirich, D. Transforming the “weakest link” – a human-computer interaction approach to usable and effective security. BT Technology Journal, 19(3), 122-131, 2001.
[14] Snyder, C. Paper prototyping: The fast and easy way to design and refine user interfaces. Morgan Kaufmann Publishers, London, 2003.
[15] Weirich, D., and Sasse, M. A. Pretty good persuasion: a first step towards effective password security in the real world. Paper presented at the New Security Paradigms Workshop, Cloudcroft, NM, 2001.
[16] Whitten, A., and Tygar, J. D. Why Johnny can't encrypt: a usability evaluation of PGP 5.0. Paper presented at the 9th USENIX Security Symposium, Washington, 1999.