IEEE TRANSACTIONS ON NUCLEAR SCIENCE, VOL. 54, NO. 6, DECEMBER 2007
2687
Development of Human Performance Measures for Human Factors Validation in the Advanced MCR of APR-1400

Jun Su Ha, Poong Hyun Seong, Member, IEEE, Myeong Soo Lee, and Jin Hyuk Hong
Abstract—Main control room (MCR) man-machine interface (MMI) designs of advanced nuclear power plants (NPPs) such as the APR (Advanced Power Reactor)-1400 can be validated through performance-based tests to determine whether they acceptably support safe operation of the plant. In this paper, plant performance, personnel task performance, situation awareness, workload, teamwork, and anthropometric and physiological factors are considered for the human performance evaluation. For the development of human performance measures, attention is paid to considerations and constraints such as the changed environment of an advanced MCR, the need for a practical and economic evaluation, and the suitability of evaluation criteria. Measures generally used in various industries and empirically proven to be useful are adopted as the main measures with some modifications. In addition, complementary measures are developed to overcome some of the limitations associated with the main measures. The development of the measures is addressed based on theoretical and empirical background. Finally, we discuss the way in which the measures can be effectively integrated. The HUPESS (HUman Performance Evaluation Support System), which is under development, is also briefly introduced.

Index Terms—Advanced MCR, anthropometry/physiological factor, human performance, HUPESS, personnel task performance, plant performance, situation awareness, teamwork, workload.
Manuscript received March 8, 2007; revised August 6, 2007. This work was supported in part by the project "The Development of the HFE V&V System for the Advanced Digitalized MCR MMIS." J. S. Ha and P. H. Seong are with the Department of Nuclear and Quantum Engineering, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Korea (e-mail: [email protected]; [email protected]). M. S. Lee and J. H. Hong are with the MMIS Group, Korea Electric Power Research Institute, Daejeon 305-380, Korea (e-mail: [email protected]; [email protected]). Digital Object Identifier 10.1109/TNS.2007.907549

I. INTRODUCTION

RESEARCH and development for enhancing reliability and safety in nuclear power plants (NPPs) have mainly focused on areas such as the automation of facilities, securing the safety margins of safety systems, and the improvement of main process systems. However, the studies of Three Mile Island (TMI), Chernobyl, and other NPP events have revealed that deficiencies in human factors, such as poor control room design, procedures, and training, are significant contributing factors to NPP incidents and accidents [1]–[5]. Accordingly, more attention has been focused on human factors studies. As the processing and information presentation capabilities of modern computers have increased, modern computer techniques have been gradually introduced into the design of advanced
main control rooms (MCRs) of NPPs [6], [7]. The design of instrumentation and control (I&C) systems for various plant systems is also rapidly moving toward fully digital I&C [8], [9]. For example, CRT (or LCD)-based displays, large display panels (LDP), soft-controls, a computerized procedure system, and an advanced alarm system were applied to the APR-1400 (Advanced Power Reactor-1400) [10]. Hence, the role of the operators in advanced NPPs shifts from a manual controller to a supervisor or a decision-maker [11], and the operators' tasks have become more cognitive. As a result, human factors engineering has become more important in designing the MCR of an advanced NPP. In order to support advanced reactor design certification reviews, the human factors engineering program review model (HFE PRM) was developed with the support of the U.S. NRC [4]. Integrated system validation (ISV) is part of this review activity. An integrated system design is evaluated through performance-based tests to determine whether it acceptably supports safe operation of the plant [12]. NUREG-0711 and NUREG/CR-6393 provide the general guidelines for the ISV. However, in order to validate a real system, appropriate measures should be developed in consideration of the actual application environment. Many techniques for the evaluation of human performance have been developed in a variety of industrial areas. In particular, the OECD Halden Reactor Project has conducted many studies regarding human factors in the nuclear industry [13]–[18]. There have also been R&D projects concerning human performance evaluation in South Korea [10], [19]. These studies provide not only valuable background but also measures for human performance evaluation. In this paper, human performance measures are developed in order to validate the MMI design of the advanced MCR of APR-1400.
Plant performance, personnel task performance, situation awareness, workload, teamwork, and anthropometric and physiological factors are considered as factors for the human performance evaluation. Measures generally used in various industries and empirically proven to be useful are adopted as the main measures with some modifications. In addition, helpful measures are developed as complementary measures in order to overcome some of the limitations associated with the main measures. The development of the measures is addressed based on the theoretical and empirical background and also on the regulatory guidelines for the ISV such as NUREG-0711 and NUREG/CR-6393. In addition, for the development of the measures for each of the factors, attention is paid to considerations and constraints, which will be addressed in the following section.
0018-9499/$25.00 © 2007 IEEE

Authorized licensed use limited to: Korea Advanced Institute of Science and Technology. Downloaded on July 5, 2009 at 08:18 from IEEE Xplore. Restrictions apply.
Fig. 1. Factors for human performance evaluation.
II. HUMAN PERFORMANCE EVALUATION: NEEDS, CONSIDERATIONS, CONSTRAINTS, AND PERFORMANCE CRITERIA

The objective of the ISV is to provide evidence that the integrated system adequately supports plant personnel in the safe operation of the relevant NPP [12]. The safety of an NPP is a concept which is not directly observed but must be inferred from available evidence. The evidence can be obtained through a series of performance-based tests. Consequently, if the integrated system is assured to operate within acceptable performance ranges, the integrated system is considered to support plant personnel in safe operation. The operators' tasks are generally performed through a series of cognitive activities such as monitoring the environment, detecting data or information, understanding and assessing the situation, diagnosing the symptoms, decision-making, planning responses, and implementing the responses [5]. Hence, the MMI design of an MCR should have the capability to support the operators in performing these cognitive activities by providing sufficient and timely data and information in an appropriate format. Effective means for system control should be provided in an integrated manner as well. If the MMI design has this capability, the operators can effectively monitor and detect the data and information representing the plant status and understand the state of the plant system correctly, which in turn supports appropriate diagnosis of the plant system, decision-making, and thus response planning and implementation. Consequently, the suitability of the MMI design of an MCR is validated by evaluating human (operator) performance resulting from this series of cognitive activities. A dynamic mock-up including the simulator for the APR-1400 is utilized as the validation facility.
In this paper, plant performance, personnel task performance, situation awareness, workload, teamwork, and anthropometric and physiological factors are considered for the human performance evaluation (see Fig. 1), which is also recommended in the regulatory guidelines [4], [12].

A. Considerations and Constraints

In this paper, human performance measures are developed based on some considerations and constraints. Firstly, the operating environment in an advanced MCR has changed from the conventional analog-based MMI to a digitalized one. As O'Hara and Robert [97] pointed out, there are three important trends in the evolution of advanced MCRs: increased automation, development of compact and computer-based workstations, and development of intelligent operator aids. Increases in automation result in a shift of the operator's role from a manual controller to a supervisor or a decision-maker. The role change is typically viewed as positive from a reliability standpoint since unpredictable human actions can be removed or reduced. Thus, by automating routine, tedious, physically demanding, or difficult tasks, the operator can better concentrate on supervising the overall performance and safety of the system. However, inappropriate allocation of functions between automated systems and the operator may result in adverse consequences such as poor task performance and out-of-the-loop control coupled with poor situation awareness [12]. In addition, the shift in the operator's role may lead to a shift from high physical to high cognitive workload, even though the overall workload can be reduced. The computer-based workstations of advanced MCRs, which have much flexibility offered by software-driven interfaces such as various display formats (e.g., lists, tables, flow charts, graphs, etc.) and diverse soft-controls (e.g., touch screens, mice, joysticks, etc.), are thought to affect operator performance as well. Information is typically presented in pre-processed or integrated forms rather than as raw parameter data, and much information is condensed into a small screen. In addition, the operator has to manage the display in order to obtain the data and information which he or she wants to check. Hence, poorly designed displays may mislead and/or confuse the operator and thus excessively increase cognitive workload, which can lead to human errors.

Due to these changes in the operating environment, the operator's tasks in an advanced MCR are conducted in a different way from those in a conventional one. Hence, enhanced attention should be paid to operator task performance and to cognitive measures such as situation awareness and workload. Secondly, the evaluation of human performance should be practical and economic. Since the aim of the performance evaluation considered in this paper is eventually to provide an effective tool for the validation of the MMI design of an advanced MCR, the evaluation techniques should be able to provide a practical technical basis for obtaining the operating license. In addition, the ISV is performed through a series of tests which require considerable resources (e.g., time, labor, or money) from preparation to execution. Hence, economic methods which are able to save resources are required. In order to address these constraints, techniques proven to be empirically practical in various industries are adopted as main measures with some modifications, and complementary measures are developed to supplement the limitations associated with the main measures. Both main and complementary measures are used for the evaluation of plant performance, personnel task performance, situation awareness, and workload. Teamwork and anthropometric and physiological factors are evaluated with only main measures. In addition, all the measures are developed to be evaluated simultaneously without interfering with each other. For example, if simulator-freezing techniques such as SAGAT
(situation awareness global assessment technique) or SACRI (situation awareness control room inventory) are adopted for the evaluation of situation awareness, the simultaneous evaluation of workload might be interfered with by that of situation awareness. Thirdly, the evaluation criteria for the performance measures should be clear. If it is not feasible to provide clear criteria, the criteria should, at least, be reasonable given the state of the art. As mentioned before, empirically proven techniques from various industries are adopted as main measures with some modifications in order to provide clear or reasonable criteria. More specifically, we focused on techniques which have been used in the nuclear industry so that we may utilize the results of those studies as reference criteria. Main measures are used to determine whether the performance is acceptable or not, whereas complementary measures are used to compare and then scrutinize the performance among operators or shifts, or to supplement the limitations of the main measures.

B. Performance Criteria

The performance measures represent only the extent of the performance in the relevant measures. For example, if NASA-TLX (National Aeronautics and Space Administration task load index), which uses a 7-point scale, is used for the evaluation of workload during operators' tasks in NPPs, scores such as 4 or 6 represent the extent of the workload induced by the relevant tasks. The scores can be interpreted as evaluation results only with reasonable evaluation criteria. Hence, the acceptability of the performance in each of the measures should be evaluated on the basis of performance criteria. The literature [12] summarizes approaches to establishing criteria, which vary based on the type of comparison: requirement referenced, benchmark referenced, normative referenced, and expert-judgment referenced.
Firstly, the requirement referenced approach is a comparison of the performance of the integrated system under consideration with an accepted and quantified performance requirement based on engineering analyses, technical specifications, operating procedures, safety analysis reports, and/or design documents. Specific values of plant parameters required by technical specifications and time requirements for critical operator actions can be used as criteria for the requirement referenced comparison. When the requirement referenced comparison is not applicable, the other approaches are typically employed. Secondly, the benchmark referenced approach is a comparison of the performance of the integrated system under consideration with that of a benchmark system which is predefined as acceptable under the same or equivalent conditions. There was a project for the ISV of a modernized NPP MCR which was based on the benchmark referenced comparison [98]. The MCR of an NPP operated for 30 years was renewed with modernization of the major part of the MCR MMI. In the project, it was judged that the human performance level in the existing MCR could be used as an acceptance criterion for the human performance in the modernized MCR. Hence, if the human performance in the modernized MCR is evaluated as better than or at least equal to that in the existing MCR, the modernized MCR can be considered acceptable. This approach is also applicable if a totally new MCR (i.e., an advanced MCR) is considered for the ISV. For example, if the operator workload in an advanced MCR does not exceed that in a reference MCR (a conventional one) which is identified as acceptable, this
can be used as a criterion for the benchmark referenced comparison. Thirdly, the normative referenced comparison is based on norms established for performance measures through their use in many system evaluations. The performance of the integrated system under consideration is compared to the norms established under the same or equivalent conditions. In the aerospace industry, the use of the Cooper-Harper scale and the NASA-TLX for workload assessment are examples of this approach [12]. Finally, the expert-judgment referenced comparison is based on criteria established through the judgment of subject matter experts (SMEs). In the following section, the human performance measures are described one by one with the performance criteria considered in this paper.

III. HUMAN PERFORMANCE MEASURES

A. Plant Performance

The principal objective of the operators in an NPP MCR is to operate the NPP safely. The operators' performance can be evaluated by observing whether the plant system is operated within an acceptable safety level, which can be specified by the process parameters of the NPP. The operators' performance which is measured by observing, analyzing, and then evaluating process parameters of an NPP is hence referred to as plant performance. Since an NPP is usually operated by a crew as a team, the plant performance is considered a crew performance rather than an individual performance. The plant performance is a result of the operators' activities including individual tasks, cognitive activities, teamwork, and so on. Hence, measures of the plant performance can be considered product measures, whereas the other measures for personnel task performance, situation awareness, workload, teamwork, and anthropometric and physiological factors can be considered process measures. Product measures provide an assessment of results, while process measures provide an assessment of how that result was achieved [17].
Since the achievement of safety and/or operational goals in NPPs is generally determined by the values of process parameters, the plant performance can be directly interpreted in terms of whether the goals are achieved or not, which is a favorable merit. There are usually values (e.g., set-points) required to assure the safety of NPPs (or the sub-systems of an NPP) for each process parameter. Another favorable merit of the plant performance is that an objective evaluation can be conducted because explicit data are obtainable. In a loss of coolant accident (LOCA), for example, an important goal is to maintain the pressurizer level, which can be evaluated by examining the plant performance measure regarding the pressurizer level. However, information on how the pressurizer level is maintained at the required level is not provided by the plant performance measures, which is a demerit of these measures. Braarud and Skraaning [98] argued that plant performance measures in isolation do not inform about human performance. As Moracho [17] pointed out, the plant performance should be considered a global performance of a crew's control, that is, a product. The human performance accounting for the process should be evaluated by the other measures for personnel task performance, situation awareness, workload, teamwork, and anthropometric and physiological factors. Also
there can be another challenging case in which the plant is operated within acceptable ranges even though design faults in human factors exist. For example, a highly experienced crew can operate a plant system within the acceptable range even though the MMI is poorly designed. This is another reason that the other performance measures should be considered to complement the plant performance [12]. In order for the plant performance to be more informative, attention should be deliberately paid to three issues: preparation of test scenarios, selection of important process parameters, and integrated analysis with the other measures. Firstly, test scenarios must be designed so that the effects of the MMI design (e.g., a new design or a design upgrade) of interest can be manifested in the operators' performance, which is also expected to improve the quality of evaluations with the other performance measures. Secondly, process parameters sensitive to and representative of the operators' performance must be selected as the important process parameters. Thirdly, the plant performance should be analyzed with the other measures in an integrated manner, which will be addressed in more detail in Section IV. In this paper, operational achievement in important process parameters is considered for the evaluation of the plant performance. Several important process parameters are selected by SMEs (process experts). Whether the values of the selected process parameters are maintained within the upper and lower operational limits (i.e., within the acceptable range) is used as a main measure for the evaluation of the plant performance. Besides, the discrepancy between operationally suitable values and observed values of the selected process parameters is utilized to score the plant performance as a complementary measure. Also, at the end of the test scenarios, the process parameters should be within a range of values, called the target range, required to achieve plant safety.
The elapsed time from an event (e.g., a transient or accident) to entry into the target range for each of the selected process parameters is calculated from simulator logging data.

1) Main Measure: Checking Operational Limits: First of all, when test scenarios are developed, SMEs (process experts) select important process parameters (empirically 5 to 7) for each of the scenarios. After reviewing operating procedures, technical specifications, safety analysis reports, design documents, and so on, the SMEs (process experts) determine upper and lower operational limits for the safe operation of NPPs. During a test (a validation test), it is confirmed whether the values of the selected parameters exceed the upper and lower limits or not. If the values do not exceed the limits, the plant performance is evaluated as acceptable. The evaluation criterion of this measure is based on the requirement referenced comparison. The values of the parameters can be obtained from the logging data of a simulator.

2) Complementary Measure: Discrepancy Score & Elapsed Time From Event to Target Range: During the test, the discrepancies between operationally suitable values and the observed values of the selected process parameters are calculated. This kind of evaluation technique was applied in the PPAS (plant performance assessment system) and effectively utilized for the evaluation of plant performance [13], [17]. The operationally suitable value is assessed by the SMEs (process experts) as a range rather than a point value, because it is not easy to assess the operationally suitable value as a specific point value. Hence, it has upper and lower bounds. The range should represent the good performance expected for a specific scenario (e.g., a LOCA or transient scenario). Also, the assessment of the operationally suitable value should be based on references such as operating procedures, technical specifications, safety analysis reports, design documents, and so on. If the value of a process parameter goes above the upper bound or below the lower bound of the range, the discrepancy is used for the calculation of the complementary measure. In mathematical form, first, the discrepancy in each of the parameters is obtained, as follows:
$$
d_i(t) =
\begin{cases}
\dfrac{x_i(t) - U_i}{m_i}, & x_i(t) > U_i \\
0, & L_i \le x_i(t) \le U_i \\
\dfrac{L_i - x_i(t)}{m_i}, & x_i(t) < L_i
\end{cases}
\tag{1}
$$

where $d_i(t)$ is the discrepancy of parameter $i$ at time $t$ during the test, $x_i(t)$ is the value of parameter $i$ at time $t$ during the test, $U_i$ is the upper bound of the operationally suitable value, $L_i$ is the lower bound of the operationally suitable value, $m_i$ is the mean value of parameter $i$ during the initial steady state, and $t$ is the simulation time after an event occurs. Here, the discrepancy between the observed value and the operationally suitable value of each parameter is normalized by dividing it by the mean value of the parameter obtained during the initial steady state, because all the discrepancies in the parameters are eventually integrated into one measure, that is, a kind of total discrepancy. The normalized discrepancy of parameter $i$ is averaged over the test time $T$:

$$
D_i = \frac{1}{T} \sum_{t=1}^{T} d_i(t)
\tag{2}
$$

where $D_i$ is the averaged sum of the normalized discrepancy of parameter $i$ over the test time $T$.

The next step is to obtain the weights of the selected process parameters. The analytic hierarchy process (AHP) is used as a tool for evaluating the weights. The AHP has the merits of being useful to structure a decision problem hierarchically and to obtain the weighting values quantitatively. The AHP serves as a framework to structure complex decision problems and provides judgments based on the expert's knowledge and experience to derive a set of weighting values by using pair-wise comparison [20]. The averaged sums of the parameters are multiplied by the weights of the relevant parameters and the products are summed up, as follows:

$$
TD = \sum_{i=1}^{N} w_i D_i
\tag{3}
$$

where $TD$ is the total discrepancy during the test, $N$ is the total number of selected parameters, and $w_i$ is the weighting value of parameter $i$.
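As a concrete illustration, the operational-limit check (the main measure) and the discrepancy scoring of (1)–(3) can be sketched as follows. This is a minimal sketch: the parameter name, logged values, bounds, and weight below are invented for illustration, and in practice they would come from simulator logging data, SME judgment, and the AHP.

```python
def discrepancy(x, lower, upper, mean):
    # Eq. (1): normalized discrepancy of one parameter value x,
    # zero while x stays inside the operationally suitable range
    if x > upper:
        return (x - upper) / mean
    if x < lower:
        return (lower - x) / mean
    return 0.0

def averaged_discrepancy(series, lower, upper, mean):
    # Eq. (2): discrepancy averaged over the logged test time
    return sum(discrepancy(x, lower, upper, mean) for x in series) / len(series)

def total_discrepancy(logs, suitable, means, weights):
    # Eq. (3): AHP-weighted sum of the averaged discrepancies
    return sum(
        weights[p] * averaged_discrepancy(logs[p], *suitable[p], means[p])
        for p in logs
    )

def within_limits(logs, limits):
    # Main measure: every logged value stays inside its operational limits
    return all(lo <= x <= hi for p, (lo, hi) in limits.items() for x in logs[p])

# Toy example: one pressurizer-level trace (names, units, values invented)
logs = {"PZR level": [55.0, 57.0, 62.0, 58.0]}
suitable = {"PZR level": (50.0, 60.0)}   # (lower, upper) operationally suitable range
means = {"PZR level": 55.0}              # mean during initial steady state
weights = {"PZR level": 1.0}             # AHP weight (single parameter here)

print(within_limits(logs, {"PZR level": (40.0, 70.0)}))  # True
print(total_discrepancy(logs, suitable, means, weights))
```

Only the third logged value (62.0) leaves the suitable range, so the total discrepancy reduces to the single normalized excursion averaged over the four samples.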
At the end of the test, another measure of the discrepancy is calculated, which can represent the ability of a crew to complete an operational goal:

$$
d_i(T) =
\begin{cases}
\dfrac{x_i(T) - U_i}{m_i}, & x_i(T) > U_i \\
0, & L_i \le x_i(T) \le U_i \\
\dfrac{L_i - x_i(T)}{m_i}, & x_i(T) < L_i
\end{cases}
\tag{4}
$$

where $d_i(T)$ is the discrepancy of parameter $i$ at the end of the test, $x_i(T)$ is the value of parameter $i$ at the end of the test, $U_i$ is the upper bound of the operationally suitable value, $L_i$ is the lower bound of the operationally suitable value, and $m_i$ is the mean value of parameter $i$ during the initial steady state. The normalized discrepancy of each parameter is multiplied by the weight of the parameter and the products are summed up, as follows:

$$
TD_{\mathrm{end}} = \sum_{i=1}^{N} w_i \, d_i(T)
\tag{5}
$$

where $TD_{\mathrm{end}}$ is the total discrepancy at the end of the test. A low total discrepancy means better plant performance. This measure is used for comparing the performance among crews or test scenarios rather than for determining whether it is acceptable or not. Finally, the elapsed time from an event (e.g., a transient or accident) to entry into the target range for each of the selected process parameters is calculated from simulator logging data. This measure is based on the fact that a shorter time spent accomplishing a task goal represents good performance. It is calculated at the end of the test. Considering possible fluctuation in a parameter, the time when the parameter is stabilized within the target range should be taken as the measure. The evaluation criteria of these measures are hence based on both the requirement referenced and the expert-judgment referenced comparisons.

B. Personnel Task Performance

Even though plant performance is maintained within acceptable ranges, design faults or shortcomings may result in unnecessary work being placed on the operators. Personnel task measures provide complementary data to the plant performance measures. Personnel task measures can reveal potential human performance problems which are not found in the evaluation of the plant performance [12]. As mentioned before, personnel tasks in the MCR can be summarized as a series of cognitive activities.
Consequently, operators' tasks can be evaluated by observing whether they monitor or detect the data or information relevant to the situation, whether they perform correct responses, and whether the sequence of the series of activities is appropriate [18].

1) Main Measure: Confirming Indispensable Tasks & Completion Time: Whether the cognitive activities are performed correctly can be evaluated by observing a series of tasks. Some elements of the cognitive activities are observable, even though the others are not observable but inferable.

Fig. 2. Hierarchical task analysis.

The activities related to detection or monitoring and execution can be considered observable activities, whereas the other cognitive activities can be inferred from the observable activities [21]. Consequently, personnel task performance can be evaluated by observing whether the operators monitor and detect the appropriate data and information, whether they perform appropriate responses, and finally whether the sequence of the processes is appropriate. Both the primary task and the secondary task should be evaluated for the personnel task evaluation. For an analytic and logical measurement, a validation test scenario is hierarchically analyzed and an optimal solution for the scenario is then developed, as shown in Fig. 2. Since operators' tasks in NPPs are generally based on goal-oriented procedures, the operating procedure provides the guide for the development of the optimal solution. The main goal refers to the goal to be accomplished in a scenario. The main goal is located at the highest rank and is broken down into the sub-goals needed to achieve it; the sub-goals can be broken down further, if needed. In the next rank, there are detections, operations, and sequences to achieve the relevant sub-goal. Detections and operations break down into the detailed tasks needed to achieve them. The tasks located in the bottom rank in Fig. 2 comprise the crew's tasks required for completion of the main goal. Top-down and bottom-up approaches are utilized for the development of the optimal solution. Next, the indispensable tasks required for safe NPP operation are determined by SMEs (process experts). During the test, SMEs (the same or other process experts) observe the operators' activities; collect data such as the operators' speech, behavior, cognitive process, and logging data; and then evaluate whether the tasks located in the bottom rank are appropriately performed or not.
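The hierarchical breakdown of Fig. 2 can be represented as a simple tree whose bottom-rank leaves are the crew's required tasks, with the indispensable ones marked by the SMEs. The following sketch is illustrative only; the goal, task names, and indispensable markings are hypothetical stand-ins for what SMEs would derive from the operating procedures.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One element of the hierarchical task analysis (cf. Fig. 2)."""
    name: str
    kind: str                      # "goal", "sub-goal", "detection", "operation", or "task"
    indispensable: bool = False    # marked by SMEs on bottom-rank tasks
    children: list = field(default_factory=list)

def bottom_tasks(node):
    # Collect the bottom-rank leaves: the crew's tasks required for the main goal
    if not node.children:
        return [node]
    return [t for c in node.children for t in bottom_tasks(c)]

def indispensable_satisfied(root, observed):
    # Main measure: every indispensable bottom-rank task must be observed as performed
    return all(t.name in observed for t in bottom_tasks(root) if t.indispensable)

# Hypothetical LOCA fragment: goal -> sub-goal -> detection/operation -> tasks
root = Node("Mitigate LOCA", "goal", children=[
    Node("Maintain PZR level", "sub-goal", children=[
        Node("Detect level drop", "detection", children=[
            Node("Check PZR level indicator", "task", indispensable=True)]),
        Node("Start safety injection", "operation", children=[
            Node("Actuate SI pumps", "task", indispensable=True),
            Node("Verify SI flow", "task")]),
    ]),
])

print(indispensable_satisfied(root, {"Check PZR level indicator", "Actuate SI pumps"}))  # True
```

The `observed` set stands in for the SMEs' record of which bottom-rank tasks the crew actually performed; revising the optimal solution after the test amounts to editing the tree.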
If all the indispensable tasks are satisfied, the personnel task performance is considered acceptable. The evaluation criterion of this measure is hence based on both the requirement referenced and the expert-judgment referenced comparisons. It should be noted that there is a possibility that the operators may implement the tasks in a different way from the optimal solution, according to a strategy not considered in advance by the SMEs (process experts). In this case, the SMEs should check and record the
operators' activities during the test, and some parts of the optimal solution are then revised based on the observed activities after the test. The task performance is reevaluated with the revised solution and the collected data. In addition, the task completion time is also evaluated. The time to complete each of the tasks located in the bottom rank is evaluated based on the experience and expertise of the SMEs. The summation of the evaluated times can be interpreted as the required time to complete a goal. If the real time spent on the completion of a goal in a test is less than or equal to the required time, the time performance of the personnel task is considered acceptable.

2) Complementary Measure: Scoring Task Performance: The main measure, a kind of descriptive measure, is complemented by scoring the task performance, which can be used for analyzing and comparing performance among crews or test scenarios. First, the weights of the elements in the optimal solution shown in Fig. 2 are calculated using the AHP. Second, as mentioned in the above section, the operators' activities are observed and evaluated during a test. Third, SMEs (process experts) evaluate whether the respective tasks are satisfied in an appropriate sequence. Finally, the task performance is scored with the observed and evaluated data and the weights of the tasks. A higher score means higher task performance. Hence, the evaluation criterion of this measure is based on the expert-judgment referenced comparison. This kind of measure was used in OPAS (operator performance assessment system) and reported to be a reliable, valid, and sensitive indicator of human performance in dynamic operating environments [18]. In mathematical form, there are two kinds of calculation, a task score and a sequence score, where each task score is calculated as the weighted sum of the tasks evaluated as satisfied.
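A minimal sketch of this scoring, under two stated assumptions: the AHP weights are approximated by the row geometric-mean method (the text does not specify the exact AHP solution method), and the task score is taken as the weighted sum of the tasks judged satisfied by the SMEs. The pairwise judgments below are invented.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP weights by the row geometric-mean method.

    pairwise[i][j] holds the expert's judgment of how much more
    important element i is than element j (a reciprocal matrix)."""
    gm = [math.prod(row) ** (1.0 / len(row)) for row in pairwise]
    total = sum(gm)
    return [g / total for g in gm]  # normalized to sum to 1

def task_score(weights, satisfied):
    # Complementary measure: weighted sum of the tasks judged satisfied by SMEs
    return sum(w for w, ok in zip(weights, satisfied) if ok)

# Hypothetical pairwise comparison of three bottom-rank tasks by an SME
pairwise = [
    [1.0,   3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
w = ahp_weights(pairwise)
print([round(x, 3) for x in w])
print(task_score(w, [True, True, False]))  # first two tasks observed as satisfied
```

Crews or scenarios can then be compared by these scores, consistent with the expert-judgment referenced criterion described above.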
C. Situation Awareness (SA)

In NPPs, the operator’s actions must always be based on an identification of the operational state of the system. As shown in the TMI accident [22], incorrect SA may contribute to the occurrence or propagation of accidents. Consequently, SA is frequently considered a crucial key to improving performance and reducing error [23]–[25]. Various definitions of SA have been discussed in the literature [26]–[29]. One of the most influential perspectives on SA has been put forth by Endsley, who informally notes that SA concerns “knowing what is going on” [29]. More precisely, Endsley defined SA as “the perception of the elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future” [29]. Considering that the operator’s tasks in NPPs can be summarized as a series of cognitive activities such as monitoring, detecting, understanding, diagnosing, decision-making, planning, and implementing, the operator’s tasks can be significantly influenced by the operator’s SA. Consequently, correct SA is recognized as one of the most critical contributions to safe operation in NPPs. Moreover, the advanced MCR of APR-1400 adopts new technologies such as CRT (or LCD)-based displays, large display panels (LDP), soft controls, a computerized procedure system, and an advanced alarm system. Even though operators are expected to be better aware of the situation with the new technologies, there is also a possibility that the changed operational environment can deteriorate the operators’ SA. As O’Hara and Robert [97] pointed out, there can be difficulty in navigating to and finding important information through computerized systems, loss of operator vigilance due to automated systems, and loss of the ability to utilize well-learned and rapid eye scanning patterns and pattern recognition from spatially fixed parameter displays.
Hence, the new design of an advanced MCR should be validated through ISV tests, that is, performance-based tests. Measurement techniques developed for SA can be categorized into four groups: performance-based, direct query & questionnaire, subjective rating, and physiological measurement techniques [12], [25]. O’Hara et al. [12] pointed out that performance-based techniques have both logical ambiguities in their interpretation and practical problems in their administration. Thus they may not be well suited for ISV tests. Direct query & questionnaire techniques can be categorized into post-test, on-line-test, and freeze techniques according to the evaluation point in time [30]. These techniques are based on questions and answers regarding SA. Among them, the detailed questions and answers generally used in the post-test technique take considerable time to complete, which can lead to problems with operators’ imperfect memory. In addition, operators have a tendency to overgeneralize or rationalize their answers [31]. The on-line-test techniques pose questions during the test to overcome the memory problem. However, the questions and answers can be considered another task, which may distort the operator performance [12]. The freeze techniques pose questions while randomly freezing the simulation to overcome the demerits of the post-test and on-line-test techniques. One of the most representative techniques is SAGAT (Situation Awareness Global Assessment Technique), which has been employed
$$TS_i = \begin{cases} 1, & \text{if task } i \text{ is satisfied} \\ 0, & \text{otherwise} \end{cases} \qquad (6)$$

where $TS_i$ is the task-$i$ score. Each sequence score is calculated as follows:

$$SS_j = \begin{cases} 1, & \text{if sequence } j \text{ is satisfied} \\ 0, & \text{otherwise} \end{cases} \qquad (7)$$

where $SS_j$ is the sequence-$j$ score. Finally, the personnel task score can be calculated by summing up the weighted task scores and sequence scores:

$$PTS = \sum_{i=1}^{N_T} w_i \, TS_i + \sum_{j=1}^{N_S} v_j \, SS_j \qquad (8)$$

where $PTS$ is the personnel task score, $N_T$ is the total number of tasks in the bottom rank, $N_S$ is the total number of sequences considered, $w_i$ is the weighting value of task $i$, and $v_j$ is the weighting value of sequence $j$.
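As a concrete illustration, the weighted summation of Eq. (8) can be sketched in Python. The binary task/sequence satisfaction scores and the AHP weights below are assumed values for illustration, not data from the study.

```python
# Sketch of the personnel task score of Eq. (8): a weighted sum of binary
# task scores and sequence scores. In practice the weights come from an AHP
# analysis of the optimal-solution hierarchy; here they are assumed values.

def personnel_task_score(task_scores, task_weights, seq_scores, seq_weights):
    """PTS = sum_i w_i * TS_i + sum_j v_j * SS_j (Eq. (8))."""
    assert len(task_scores) == len(task_weights)
    assert len(seq_scores) == len(seq_weights)
    return (sum(w_i * t for w_i, t in zip(task_weights, task_scores))
            + sum(v_j * s for v_j, s in zip(seq_weights, seq_scores)))

ts = [1, 1, 0, 1]          # SME-judged satisfaction of 4 bottom-rank tasks
ss = [1, 0]                # SME-judged satisfaction of 2 sequences
w = [0.4, 0.3, 0.2, 0.1]   # assumed AHP task weights
v = [0.6, 0.4]             # assumed AHP sequence weights
print(personnel_task_score(ts, w, ss, v))  # approximately 1.4 (0.4+0.3+0.1+0.6)
```

A higher score indicates better task performance, so crews or scenarios can be compared directly on this single number.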
across a wide range of dynamic tasks including air traffic control, driving, and NPP control [32]. The SAGAT has the advantages of being easy to use (in a simulator environment), possessing good external indices of information accuracy, and possessing well-accepted face validity [33]. However, a criticism of the SAGAT has been that the periodic interruptions are too intrusive, contaminating any performance measures; this is related to the concern that the questions may cue participants (e.g., operators) to some details of the scenario, setting up an expectancy for certain types of questions [12], [34], [35]. Meanwhile, Endsley has shown that performance measures (e.g., kills and losses in an air-to-air fighter sweep mission) are not significantly affected by the conditions of simulation freeze or non-freeze [32], or by question point in time, question duration, and question frequency for the SAGAT measurement [36], [37]. However, she recognized that it is never possible to “prove” that SAGAT does not influence performance, pointing out that all of the studies collected so far indicate that it does not appear to significantly influence performance as long as the stops (or freezes) are unpredictable to the subject [32]. There are studies of the SACRI (situation awareness control room inventory), which is adapted from SAGAT for use in an NPP [38], [39]. The SACRI was developed for use with the NORS simulator in the Halden Reactor Project (HRP). Subjective rating techniques typically involve assigning a numerical value to the quality of SA during a particular period or event [40]. Subjective rating techniques are popular because they are fairly inexpensive, easy to administer, and non-intrusive [33], [40]. However, there have been criticisms. First, participants’ (or operators’) knowledge may not be correct, and the reality of the situation may be quite different from what they believe [41].
Second, SA ratings may be highly influenced by self-assessments of performance [33]. Third, operators will probably be inclined to rationalize or overgeneralize about their SA [41]. In addition, some measures such as SART and SA-SWORD include workload factors rather than limiting the techniques to SA measurement itself [12]. Physiological measurement techniques have mainly been used to study complex cognitive domains such as mental workload and fatigue, and very few experiments have been conducted to study SA [42]. Even though physiological measures are likely to incur high costs of collection, analysis, and interpretation compared with the subjective rating and performance-based measurement techniques, they have unique properties considered attractive to researchers in the SA field. First, they do not require intrusive interference such as freezing the simulation. Second, they can provide a continuous indication of SA, in contrast to the above-mentioned techniques. Third, it is possible to go back and assess the situation, because the data are continuously recorded. In the nuclear industry, eye fixation measurement has been used as an indicator of SA, called VISA (visual indicator of situation awareness) [16]. In an experimental study of the VISA, time spent on eye fixations was proposed as a visual indicator of SA. The results showed that SACRI scores were correlated with the VISA, although the correlation was somewhat inconsistent between the two experiments in the study. Even though these techniques cannot clearly show how much information is retained in memory, whether the information is registered correctly, or what comprehension the subject has of those elements
[31], [42], it is believed that physiological techniques can be potentially helpful and useful indicators of SA. In this paper, a subjective rating measure is used as the main measure for the SA evaluation, even though it has the drawbacks mentioned above. Eye fixation measurement is used as the complementary measure.

1) Main Measure: KSAX: KSAX [10] is a subjective rating technique adapted from the SART [43]. After completion of a test, the operators subjectively assess their own SA on a rating scale and describe the reason why they give the rating. One of the crucial problems in the use of SART was that workload factors could not be separated from the SA evaluation. In the KSAX, Endsley’s SA model has been applied to the evaluation regime of the SART. The KSAX was successfully utilized in the evaluation of the suitability of the soft control and safety console designs for the APR-1400 [10]. Since SA is evaluated with a questionnaire after a test, the evaluation activities do not interfere with the operators. Consequently, it does not affect the evaluations of the other performance measures, especially cognitive workload, which leads to an economical evaluation of human performance for the ISV: all the measures considered in this paper can be evaluated in one test. In addition, the KSAX results from an antecedent study for the APR-1400 [10] can be utilized as a criterion based on the benchmark referenced comparison for the ISV, which is considered an important merit [12]. The questionnaire of the KSAX consists of several questions regarding the level 1, 2, and 3 SA defined by Endsley. Usually a 7-point scale is used for the measurement. The rating scale is not fixed, but the use of a 7-point scale is recommended because the antecedent study used one. The questions used in the KSAX are constructed such that SA in an advanced NPP is compared with that in the already licensed NPPs.
Generally, operators who have been working in the licensed NPPs are selected as participants for the validation tests. Therefore, if SA in an advanced NPP is evaluated as better than or equal to that in the licensed NPP, the result of the SA evaluation is considered acceptable. The evaluation criterion of this measure is hence based on the benchmark referenced comparison.

2) Complementary Measure: Continuous Measure Based on Eye Fixation Measurement: The subjective measure of SA can be complemented by a continuous measure based on eye fixation data, which is a kind of physiological measure. Since the KSAX is evaluated subjectively after a test, it cannot continuously measure the operator’s SA or secure objectivity. A physiological method generally involves the measurement and data processing of one or more variables related to human physiological processes. Physiological measures are known to be objective and can provide continuous information on the activities of subjects. Eye tracking systems have now been developed that can measure a subject’s eye movements without direct contact; hence the measurement of eye movement is not intrusive to the operators’ activities. In the majority of cases, the primary means of information input to the operator is the visual channel. An analysis of the manner in which the operator’s eyes move and fixate gives an indication of the information input. Hence, even though eye fixation measurement cannot capture the operator’s SA exactly, we believe that it can extract an indication of the operator’s SA, which can then be used as a complementary indicator for the SA evaluation. In NPPs, there are many information sources that should be monitored, but the operators have only a limited capacity of attention and memory. Because it is impossible to monitor all information sources, the operators continuously decide where to allocate their attentional resources. This kind of cognitive skill is called selective attention; the operators use it to overcome the limitations of human attention. The stages of information processing depend on mental or cognitive resources, a sort of pool of attention or mental effort that is of limited availability and can be allocated to processes as required [44]. When an abnormal situation occurs in an NPP, the operators try to understand what is going on in the plant. The operators receive information from the environment (e.g., indicators or other operators) and process the information to establish a situation model based on their mental model. As O’Hara et al. [45] summarized, a situation model is an operator’s understanding of the specific situation, and the model is constantly updated as new information is received. A mental model refers to the general knowledge governing the performance of highly experienced operators, including expectancies of how the NPP will behave in various abnormal situations. For example, when a LOCA occurs, the pressurizer pressure, temperature, and level will decrease, and the containment radiation will increase. These expectancies form rules on the dynamics of the NPPs, and the mental model is established based on these rules [46]. When an abnormal or accident situation occurs, operators usually first recognize it by the onset of salience such as an alarm or a deviation of process parameters from the normal condition.
Then, they develop their situation awareness, or establish their situation model, by selectively attending to the important information sources. The maintenance of their situation awareness, or confirmation of their situation model, is accomplished by iterating this selective attention. The selection of information sources to attend to is typically driven by four factors: salience, expectancy, value, and effort [44]. The operators are expected to attend to salient information sources. With expectancy, attention is shifted to the specific sources which are most likely to provide information. For example, the pressurizer pressure, temperature, and level in NPPs decrease in both a LOCA and an SGTR (steam generator tube rupture). The two accidents can be distinguished by observing the containment radiation or the feed/steam flow deviation (there is also other information that distinguishes the two accidents). The containment radiation changes in a LOCA but not in an SGTR; the feed/steam flow deviation in one of the steam generators changes in an SGTR but not in a LOCA. If the pressurizer pressure, temperature, and level decrease, the operators may frequently look at the salient indicators of the pressure, temperature, and level and may consider that the accident may be a LOCA or an SGTR. Then they are expected to attend to the indicator of the containment radiation or the indicators representing the feed/steam flow deviation. If the containment radiation increases, the operators probably consider the accident to be a LOCA. However, there is a possibility of failure of the containment radiation indicator, even though the likelihood is very small. Hence, they may
look at the indicators representing the feed/steam flow deviation in order to get more information (or evidence). In this case, if the accident is a LOCA, no change would be observed in the feed/steam flow deviation. This information is also important to understand the situation correctly, even though no salience is provided by it. The frequency of looking at or attending to an information source is also modified by how valuable it is to look at. As mentioned before, there can be other information that distinguishes the two accidents. However, there are usually representative dynamics in NPPs which should be established in the operators’ mental model through training and experience. The mental model determines the values of the information sources. Therefore, well-trained and experienced operators are expected to attend to the valuable information sources more frequently. Finally, selective attention may be inhibited if it is effortful compared to its value: given two information sources with the same value (importance), the operators may attend to the one that is easier to access. To summarize, the operators are expected to attend to the information sources which are salient, important (valuable), and easy to access. Consequently, in order to effectively monitor, detect, and thus understand the state of a system, the operators should allocate their attentional resources not only to the salient information sources but also to the valuable ones. Eye fixations on areas of interest (AOIs) that are important for solving the problems can be considered an index of monitoring and detection, which can then be interpreted as the perception of the elements (level 1 SA). As we think about or manipulate perceived information in working memory, an action is delayed or not executed at all [44].
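The fixation-derived indices discussed here can be tabulated mechanically before SMEs grade them: fixation counts per AOI (an indication of perception), total dwell time per AOI (comprehension), and the temporal fixation sequence (projection). The sketch below is a minimal illustration; the AOI names echo the LOCA example above, and the dwell values and data format are assumptions, not tied to any particular eye tracker.

```python
# Tabulate per-period AOI fixation metrics from an assumed, time-ordered list
# of (aoi_name, dwell_seconds) fixation records.
from collections import Counter, defaultdict

def aoi_metrics(fixations, important_aois):
    """Return fixation counts, total dwell per AOI, fixation sequence,
    and the fraction of important AOIs that received at least one fixation."""
    counts = Counter(aoi for aoi, _ in fixations)   # level 1: perception
    dwell = defaultdict(float)
    for aoi, t in fixations:                        # level 2: comprehension
        dwell[aoi] += t
    sequence = [aoi for aoi, _ in fixations]        # level 3: projection
    coverage = sum(counts[a] > 0 for a in important_aois) / len(important_aois)
    return dict(counts), dict(dwell), sequence, coverage

# Hypothetical diagnosis period: pressurizer indicators, then containment
# radiation (the discriminating cue in the LOCA/SGTR example above).
fx = [("PZR_pressure", 1.2), ("PZR_level", 0.8),
      ("PZR_pressure", 0.9), ("CNMT_radiation", 1.5)]
counts, dwell, seq, cov = aoi_metrics(
    fx, ["PZR_pressure", "CNMT_radiation", "SG_flow_deviation"])
print(cov)  # 2 of the 3 important AOIs were fixated
```

SMEs would then grade each analysis period (excellent / appropriate / not appropriate) from tables like this rather than from raw gaze streams.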
Consequently, the time spent on the AOIs by the operators can be understood as an index of the comprehension of their meaning (level 2 SA). As mentioned before, selective attention is associated with expectancy for the near future. The projection of their status in the near future (level 3 SA) can therefore be inferred from the sequence of the eye fixations. In this paper, the eye fixations on the AOIs, the time spent on the AOIs, and the sequence of the fixations are used for the SA evaluation. SMEs (process and/or human factors experts) analyze the eye fixation data after the completion of a test. It is recommended that the analysis be performed for specific periods representing the task steps in the optimal solution of the personnel task performance. For example, the times spent achieving the sub-goals in the optimal solution can be used as the specific periods for the analysis. Attention should be paid to finding deficiencies of the MMI design, or operator incompetence, that lead to inappropriate patterns of eye fixations. For each of the periods, the SMEs analyze the eye fixation data and grade the SA as excellent, appropriate, or not appropriate. The evaluation criterion of this measure is hence based on the expert-judgment referenced comparisons. Even though this technique has the drawback that the eye fixation data must be analyzed by the SMEs, which requires much effort and time, it is thought that only the SMEs can provide a meaningful evaluation from the eye fixation data, because the SMEs usually have the most knowledge and experience about the system and its operation. The authors performed an experimental study with a simplified NPP simulator [47]. In the experiments, the eye fixation data
Fig. 3. Example of the eye fixation measurement.
during complex diagnostic tasks were analyzed. The results showed that the eye fixation patterns of subjects with high, medium, and low expertise differed under the same operating conditions. A subject with more knowledge about the system fixated on various information sources with short fixation times, iteratively fixated on the important information sources, and then reported the situation with high confidence. However, a subject with poor knowledge spent much time on salient information sources, did not fixate on the various information sources important to solving the problem, and then reported the situation with low confidence (it seemed just a guess). As shown in Fig. 3, a computerized system for eye fixation analysis facilitates the SA evaluation. The number centered in each circle represents the order of the fixation. The area of the circle is proportional to the fixation time, which is displayed at the bottom of the circle.

D. Workload

Workload has an important relationship to human performance and error [12]. Despite its importance, a generally accepted definition of cognitive workload is still not available [48]–[50]. O’Donnell and Eggemeier defined workload as the portion of the operator’s limited capacity actually required to perform a particular task [51]. Consequently, more mental resources are required as the cognitive workload increases. If the cognitive workload exceeds the limit of the operator’s capacity, human errors may occur and human performance would deteriorate [52]. In advanced MCRs, advanced information technologies are applied, and thus the environment requires the operators to play the role of supervisor or decision-maker rather than manual controller. The operator’s tasks are expected to require increased mental rather than physical activities. Consequently, the evaluation of cognitive workload is considered one of the most important factors to be evaluated for the ISV.
Generally, techniques for measuring cognitive workload can be divided into two broad types: predictive and empirical [12]. Predictive techniques are usually based on mathematical modeling, task analysis, simulation modeling, and experts’ opinions. These techniques do not require operators to participate in simulation exercises. Thus, they are typically used in the
early stages of the design process and are therefore thought not to be suitable for the ISV stage [12]. Empirical techniques can be divided into three types: performance-based, subjective rating, and physiological measures [53]. Performance-based techniques are categorized into primary task measures and secondary task measures. Primary task measures are not suitable for measuring the cognitive workload associated with monitoring or decision-making tasks, as in NPPs, and secondary task measures have the drawback that they can contaminate human performance by interfering with the primary tasks [44]. Subjective rating techniques measure the cognitive workload experienced by a subject (or an operator) through a questionnaire and an interview. Since subjective measures have been found to be reliable, sensitive to changes in workload level, minimally intrusive, diagnostic, easy to administer, independent of tasks (or relevant to a wide variety of tasks), and possessive of a high degree of operator acceptance, they have been the most frequently used in a variety of domains [54]–[60]. Representative subjective measures include the overall workload (OW), the modified Cooper-Harper scale (MCH), the subjective workload assessment technique (SWAT), and the National Aeronautics and Space Administration task load index (NASA-TLX). Hill et al. examined the reliability of SWAT, NASA-TLX, OW, and MCH and evaluated NASA-TLX as superior in validity and NASA-TLX and OW as superior in usability [60]. Physiological techniques measure the physiological changes of the autonomic or central nervous system associated with cognitive workload [44]. Electroencephalogram (EEG), evoked potential, heart-rate-related measures, and eye-movement-related measures are representative tools for cognitive workload evaluation based on physiological measurements.
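Of the physiological indices just listed, the eye-movement-related ones are among the simplest to derive from tracker output. A minimal sketch of two such indices, blink rate and mean fixation dwell time, using invented sample values and an assumed data format:

```python
# Two eye-movement workload indices computed from assumed tracker output:
# blink rate (blinks per minute) and mean fixation dwell time (seconds).

def blink_rate_per_min(blink_onsets_s, segment_s):
    """Number of blinks per minute over one recording segment."""
    return len(blink_onsets_s) / (segment_s / 60.0)

def mean_dwell_s(dwells_s):
    """Mean fixation dwell time; longer dwells suggest higher workload."""
    return sum(dwells_s) / len(dwells_s)

blinks = [2.1, 9.8, 17.5, 30.0]    # blink onset times in a 60 s segment
dwells = [0.35, 0.60, 0.45, 0.40]  # fixation dwell times (s)
print(blink_rate_per_min(blinks, 60.0))  # 4.0
print(mean_dwell_s(dwells))              # approximately 0.45
```

Because these values can be computed for every segment of a test, they give the continuous, objective trace that subjective post-test ratings cannot provide.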
Even though various studies have reported that EEG measures are sensitive to variations of mental workload during tasks such as in-flight missions [63], [64], air traffic control [65], and automobile driving [66], the use of EEG is thought to be limited for the ISV: multiple electrodes usually must be attached to an operator’s head to measure the EEG signals, which may restrict the operator’s activities and thus may contaminate the operator’s performance in dynamic situations. With regard to evoked potential (EP) or event-related potential (ERP) analysis, wave patterns regarding the latencies and amplitudes of each peak are analyzed after providing specific stimulations. The EP is thought not to be applicable to the study of complex cognitive activities in the ISV, because the event evoking the EP should be simple and repeated many times [67]. Measures of heart rate (HR) and heart rate variability (HRV) have proven sensitive to variations in the difficulty of tasks such as flight maneuvers and phases of flight (e.g., straight and level, takeoffs, landings) [68], [69], automobile driving [66], air traffic control [65], and electro-energy process control [70]. However, since the heart-rate-related measures are likely to be influenced by the physical or psychological state of a subject, they do not always produce the same pattern of effects with regard to their sensitivity to mental workload and task difficulty [71], [73]. The eye-movement-related measures are generally based on blinking, fixation, and pupillary response. Many studies have suggested that eye-movement-related measures can be used as effective tools for the evaluation of cognitive workload [74]–[78]. Conventionally,
cumbersome equipment such as head-mounted eye tracking systems was used to obtain the eye movement data, which is thought to be intrusive to the operator’s tasks; hence it was considered inappropriate for the cognitive workload evaluation in the ISV. Recently, however, eye tracking systems have been developed which can measure the eye movement data without direct contact (non-intrusively) [79], [80]. In this paper, NASA-TLX, the most widely used subjective rating technique, is used as the main measure for the evaluation of cognitive workload, and continuous measures based on eye movement data are used as the complementary measures.

1) Main Measure: NASA-TLX: A subjective measure is considered an indicator of the participants’ internal experience. As mentioned before, subjective rating techniques have been the most widely used for the evaluation of workload in various fields. In particular, NASA-TLX has been extensively used in multitask contexts such as real and simulated flight tasks [81]–[84], air combat [85], [86], remote control of vehicles [87], and simulator-based NPP operation [10], [14]–[16], [78], [88], [89]. NASA-TLX is an instrument recommended by the U.S. Nuclear Regulatory Commission (NRC) for assessing cognitive workload [90]. In addition, the NASA-TLX results from antecedent studies for the APR-1400 [10], [89] can be utilized as reference criteria for the ISV, which is considered an important merit. NASA-TLX divides the workload experience into six components: mental demand, physical demand, temporal demand, performance, effort, and frustration [91]. After completion of a test, the operators subjectively assess their own workload on a rating scale and describe the reason why they give the rating. In this paper, the six questions used in NASA-TLX are constructed such that workload in an advanced NPP is compared with that in the already licensed NPPs.
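For reference, the standard NASA-TLX aggregation combines the six subscale ratings with weights obtained from fifteen pairwise comparisons (each scale's weight is the number of times it was judged the more important of a pair). The sketch below uses the conventional 0–100 ratings; the ratings and tallies are invented for illustration, and the comparative 7-point variant described in this paper would substitute its own scale.

```python
# Standard NASA-TLX weighted workload score: six subscale ratings (0-100)
# weighted by pairwise-comparison tallies that sum to 15.

SCALES = ["mental", "physical", "temporal", "performance", "effort", "frustration"]

def nasa_tlx(ratings, tally):
    """ratings: scale -> 0..100 rating; tally: scale -> wins in 15 pairings."""
    assert sum(tally.values()) == 15
    return sum(ratings[s] * tally[s] for s in SCALES) / 15.0

ratings = {"mental": 70, "physical": 20, "temporal": 55,
           "performance": 40, "effort": 65, "frustration": 30}
tally = {"mental": 5, "physical": 0, "temporal": 3,
         "performance": 2, "effort": 4, "frustration": 1}
print(nasa_tlx(ratings, tally))  # 59.0
```

The single weighted score is what gets compared against the benchmark values from the antecedent APR-1400 studies.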
Hence, if the NASA-TLX result in an advanced NPP is evaluated as lower than or equal to that in the licensed NPP, the result of the workload evaluation is considered acceptable. Usually a 7-point scale is used for the measurement. The rating scale is not fixed, but the use of a 7-point scale is recommended because the antecedent studies used one. The evaluation criterion of this measure is hence based on the benchmark referenced comparison.

2) Complementary Measure: Continuous Measures Based on Eye Movement Measurement: In a similar way to the evaluation of SA, the subjective measure of cognitive workload can be complemented by continuous measures based on eye movement data. Since the NASA-TLX is evaluated subjectively after the completion of a test, it cannot continuously measure the operator’s workload or secure objectivity. Hence, continuous measures based on eye movement data are utilized as complementary measures for the evaluation of cognitive workload. Blink rate, blink duration, number of fixations, and fixation dwell time are used as indices representing the cognitive workload. Blinking refers to a complete or partial closure of the eye. Since visual input is disabled during eye closure, a reduced blink rate helps to maintain continuous visual input. The duration and the number of eye blinks should decrease when the cognitive demands of the task increase [75], [76]. A recent study showed that blink rates and durations during the diagnostic tasks in simulated NPP operation correlated with NASA-TLX
and MCH scores, which means that they can be used as a cognitive workload index [78]. Even though some studies have shown that a higher level of arousal or attention increases the blink rate [92], [93], considering that the operator’s tasks in NPPs are a series of cognitive activities, an increased blink rate can be used as a clue indicating a point that requires a high level of concentration or attention. The eye fixation parameters include the number of fixations on an area of interest and the duration of the fixations, also called dwell time. The more eye fixations are made for a problem-solving task, the more information processing is required. A longer fixation duration means that much time is required to correctly understand the relevant situation or object. In other words, if an operator experiences higher cognitive workload, the number of fixations and the fixation dwell time increase. The number of fixations and the fixation dwell time have been found to be sensitive measures of mental workload [74], [94]. More specifically, the dwell time can serve as an index of the resources required for information extraction from a single source [44]. Bellenkes et al. [95] found that dwells were longest on the most information-rich flight instrument and that dwells were much longer for novice than for expert pilots, reflecting the novices’ greater workload. The authors also found that a subject with low expertise spent more time fixating on a single component than a subject with high expertise during complex diagnostic tasks in simulated NPP operations [47]. In addition, the eye fixation pattern (or visual scanning) can be used as a diagnostic index of the source of workload within a multi-element display environment [44]. The authors observed that more frequent and extended dwells were made on more important instruments during the diagnostic tasks [47]. Bellenkes et al.
[95] also found that long novice dwells were coupled with more frequent visits and hence served as a major “sink” for visual attention. Little time was left for novices to monitor other instruments, and as a result, their performance declined on tasks using those other instruments. Consequently, the eye fixation parameters can be effectively used for evaluating the strategic aspects of resource allocation. The evaluation of these measures should be performed by SMEs to identify the relevant aspects. Hence these measures are based on the expert-judgment referenced comparison. The authors performed an experimental study to investigate cognitive workload during complex diagnostic tasks in simulated NPP operations [78]. The study showed that eye-movement-related measures such as blink rate, blink duration, number of fixations, and fixation dwell time correlate with NASA-TLX and MCH scores. Hence, we conclude that continuous measures based on eye movement data are very useful tools for complementing the subjective rating measure.

E. Teamwork

An NPP is operated by a crew, not an individual operator. There are individual tasks which should be performed by the relevant operators, and there are some tasks which require the cooperation of the crew. The cooperative tasks should be appropriately divided and then allocated to the relevant operators to achieve the operational goal. The advanced MCR of APR-1400 is equipped with a large display panel designed to support team performance by providing a common reference display for discussions. The advanced MCR design also allows operators to be located nearer to
Authorized licensed use limited to: Korea Advanced Institute of Science and Technology. Downloaded on July 5, 2009 at 08:18 from IEEE Xplore. Restrictions apply.
one another than in conventional MCRs and to access the plant information from workstations allocated to the relevant operators for exclusive use. These interface changes are expected to improve operator performance by facilitating verbal and visual communication among the operators [88], [96] and thus to improve the team work. In order to evaluate the team work, a BARS (behaviorally anchored rating scale) is used in this paper [88]. The BARS components include task focus/decision-making, coordination as a crew, communication, openness, and team spirit. For each of these components, several example behaviors (positive or negative) and anchors (or critical behaviors) indicating good or bad team interactions are identified by SMEs (usually a process expert and/or a human factors expert) during a test. The example behaviors and the anchors identified are used as criteria for the final (or overall) rating of team work by the SMEs after the test. Usually a 7-point scale (1–7) is used for the BARS ratings, with 7 being the best team interaction. The rating scale is not fixed, but the use of a 7-point scale is recommended because the BARS results obtained with a 7-point scale in an antecedent study for the APR-1400 [10] can then be utilized as reference criteria. In this measure, attention should be focused on the findings which are considered to influence the team work. Finally, the experts determine whether the team work is acceptable based on their experience and knowledge. Hence, the evaluation criterion of this measure is based on expert-judgment referenced comparisons.

F. Anthropometric and Physiological Factors

Anthropometric and physiological factors include such concerns as the visibility and audibility of indications, the accessibility of control devices for operator reach and manipulation, and the design and arrangement of equipment [12]. Generally, many of these concerns are evaluated earlier in the design process with an HFE V&V checklist.
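The BARS procedure described above (component-wise observation of example/critical behaviors during the test, then an overall 7-point rating per component) can be sketched as a simple record-keeping structure. This is an illustrative sketch only, not part of the paper's tooling; the component names follow the text, while `BarsSheet`, `note`, and `rate` are hypothetical names.

```python
# Minimal sketch (assumed design, not the paper's system) of recording
# BARS observations during a test and the final 7-point ratings after it.
from dataclasses import dataclass, field

COMPONENTS = [
    "task focus/decision-making",
    "coordination as a crew",
    "communication",
    "openness",
    "team spirit",
]

@dataclass
class BarsSheet:
    """Collects behaviors observed during the test and final ratings after it."""
    observations: dict = field(default_factory=lambda: {c: [] for c in COMPONENTS})
    ratings: dict = field(default_factory=dict)

    def note(self, component, behavior, positive):
        # SMEs log example/critical behaviors while observing the crew.
        self.observations[component].append((behavior, positive))

    def rate(self, component, score):
        # Final overall rating on the recommended 7-point scale (7 = best).
        if not 1 <= score <= 7:
            raise ValueError("BARS rating must be on the 1-7 scale")
        self.ratings[component] = score

    def summary(self):
        # Mean rating over the rated components, e.g. for comparison with
        # reference criteria from the antecedent APR-1400 study.
        return sum(self.ratings.values()) / len(self.ratings)

sheet = BarsSheet()
sheet.note("communication", "operator repeats back parameter values", positive=True)
sheet.rate("communication", 6)
sheet.rate("team spirit", 5)
print(sheet.summary())  # → 5.5
```

The noted behaviors serve as the anchors the SMEs consult when assigning the final component ratings.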
Since the ISV is a kind of feedback step for design validation and improvement, attention should be focused on those anthropometric and physiological factors that can only be addressed under real or nearly real (high-fidelity simulation) operating conditions, e.g., the ability of the operators to effectively use or manipulate various controls, displays, workstations, or consoles in an integrated manner [12]. Consequently, items related to these factors in the HFE V&V checklist are selected before the validation test and then reconfirmed by SMEs during the test. It should also be checked whether there are anthropometric and physiological problems caused by unexpected design faults; this check can be performed during the test or afterwards with audio/video (AV) recording data. The evaluation criterion of this measure is hence based on both requirement referenced (HFE V&V checklist) and expert-judgment referenced comparisons.

IV. DISCUSSION: STRATEGIES FOR EFFECTIVE HUMAN PERFORMANCE EVALUATION

In this paper, human performance measures are developed for human factors validation, the so-called ISV, in the advanced MCR of the APR-1400. The measures for plant performance, personnel task performance, situation awareness, workload, team work, and anthropometric/physiological factors are thought to provide multilateral information for the ISV. Preferably, the evaluation of human performance should be performed in an integrated manner so as to produce results with the most information. In order for the human performance to be effectively evaluated, the times of the operators' activities should be recorded during the validation test. The operators' activities include the bottom-rank tasks considered in the evaluation of personnel task performance, the example behaviors and critical behaviors in the team work evaluation, and activities belonging to the anthropometric and physiological factors. This time-tagging can easily be conducted with a computerized system: all that the SMEs (as evaluators) have to do is check items listed in the system based on their observations, and the system automatically records the checked items and the corresponding times. The time-tagged information can facilitate an integrated evaluation of the human performance. Firstly, the plant performance can be connected to the personnel task performance with the time-tagged information. A computerized system for human performance evaluation can be connected to the relevant simulator to acquire logging data representing the plant state (e.g., process parameters and alarms) and the control activities performed by the operators. Whether the plant system is operated well can be evaluated by observing and evaluating the process parameters. However, even when the plant performance is maintained within acceptable ranges, design faults or shortcomings may cause unnecessary work or an inappropriate manner of operation. This kind of problem can be uncovered by analyzing the system state together with the operators' activities. Since the operators' activities are time-tagged and the logging data provided by the simulator are time-tagged as well, inappropriate or unnecessary activities performed by the operators can be compared with the logging data representing the plant state.
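The time-tagging scheme described above — SMEs check observed items, the system stamps them, and the stamps are aligned with the time-tagged simulator log — could look roughly like the following sketch. All names (`TimeTagger`, `plant_state_at`) and the sample parameter are hypothetical illustrations, not taken from HUPESS.

```python
# Illustrative sketch (assumed names, not the HUPESS implementation) of
# time-tagging SME observations and aligning them with simulator log data.
from bisect import bisect_right

class TimeTagger:
    def __init__(self):
        self.events = []  # (time_s, category, item)

    def check(self, time_s, category, item):
        # An SME checks an observed item; the system records it with its time.
        self.events.append((time_s, category, item))

def plant_state_at(log, t):
    """Return the latest simulator log entry at or before time t.

    `log` is a list of (time_s, state) pairs sorted by time."""
    i = bisect_right([ts for ts, _ in log], t)
    return log[i - 1][1] if i else None

tagger = TimeTagger()
tagger.check(125.0, "task", "isolate letdown line")  # hypothetical activity
sim_log = [(0.0, {"PZR_level_pct": 55}), (120.0, {"PZR_level_pct": 48})]

# For each time-tagged operator activity, look up the concurrent plant state.
for t, cat, item in tagger.events:
    print(item, plant_state_at(sim_log, t))
```

Because both streams carry timestamps, each checked activity can be paired with the plant state in effect at that moment, which is what allows unnecessary or inappropriate activities to be spotted even when the process parameters stay within acceptable ranges.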
This kind of analysis can provide diagnostic information on the operators' activities. For example, if the operators have to navigate the workstations or move around in a scrambled way in order to keep the NPP within acceptable ranges, the MMI design of the MCR should be considered inappropriate and revised, even though the plant performance is maintained within acceptable ranges. Secondly, the eye tracking measures for the SA and workload evaluations can be connected to the personnel task performance with the time-tagged information. The eye tracking measures can be analyzed for each of the tasks defined in the optimal solution. This means that the SA and workload can be evaluated at each task step by considering the cognitive aspects specified by the task attributes, which is expected to increase the level of detail of the measurement. Also, the eye fixation data can be used for determining whether the operators are monitoring and detecting the environment correctly, and this information can be used in the evaluation of personnel task performance. Thirdly, the personnel task performance, the team work, and the anthropometric/physiological factors can be analyzed in an integrated manner with the time-tagged information, which is expected to provide diagnostic information for the human performance evaluation. Team work is required in the context of the operators' tasks, many of which are series of cognitive activities. The example behaviors and the critical behaviors attributable to the team work can be investigated along the series of the operators' tasks with a time line analysis. Hence, it can be analyzed whether
behaviors attributable to the team work contribute to good or poor performance of the operators' tasks, or whether overloaded operators' tasks inhibit the team work. Also, anthropometric/physiological problems that were not expected in advance but are observed during a test can be analyzed in the context of the operators' tasks, which may be useful for analyzing their causes. Finally, the AV recording data can be effectively utilized together with the real-time evaluation data. The AV recording data can provide information which may be missed or not processed by the SMEs during a test. In addition, considering that the operators' tasks in NPPs are generally based on goal-oriented procedures, the operators' tasks are analyzed and then constructed into an optimal solution in a hierarchical form. The optimal solution consists of the main goal, the sub-goals, the observable cognitive tasks, and the sub-tasks. The relative importance (or weight value) of the elements in the optimal solution is obtained by using the AHP. This means that the operators' tasks can be ranked by their weight values; hence, we can allocate the analysis resources according to the relative importance of the tasks. For example, if a specific task in the context of the operators' tasks is more important than the other tasks, we can analyze that task with more resources (e.g., more time or additional consideration can be allocated to the analysis). The analysis of the human performance in a test is expected to take a long time and, moreover, many tests covering a sufficient spectrum of the operational situations in the NPP should be performed to validate the MMI design. Consequently, the importance-based approach is thought to be an efficient strategy. The authors have been developing a computerized system for the human performance evaluation, called the "HUman Performance Evaluation Support System (HUPESS)".

Fig. 4. Human performance evaluation with HUPESS.
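The AHP-based ranking of tasks by weight value can be illustrated with a small sketch. Saaty's AHP derives weights from the principal eigenvector of a reciprocal pairwise-comparison matrix; the geometric-mean method used here is a common approximation of that eigenvector, and the judgment values below are invented purely for illustration.

```python
# Sketch of AHP weighting for task elements. The geometric-mean method
# approximates Saaty's principal-eigenvector weights; values are illustrative.
import math

def ahp_weights(matrix):
    """matrix[i][j] = judged importance of element i relative to element j."""
    n = len(matrix)
    gmeans = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(gmeans)
    return [g / total for g in gmeans]

# Hypothetical pairwise judgments for three sub-tasks (reciprocal matrix).
judgments = [
    [1.0, 3.0, 5.0],   # sub-task A vs. A, B, C
    [1/3, 1.0, 2.0],   # sub-task B
    [1/5, 1/2, 1.0],   # sub-task C
]
weights = ahp_weights(judgments)
ranked = sorted(zip("ABC", weights), key=lambda p: -p[1])
print(ranked[0][0])  # → A
```

With the weights in hand, analysis resources (analyst time, additional review) can be allocated in proportion to each task's weight, which is the importance-based strategy argued for above.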
This system is being developed based on the measures and strategies considered in this paper; hence, the authors expect that the human performance can be evaluated in an integrated and effective way with the HUPESS. The HUPESS is briefly introduced here. The HUPESS is interfaced with the APR-1400 simulator installed in the dynamic mock-up, as shown in Fig. 4. The HUPESS acquires the simulator logging data during a test. The logging data include data representing the plant system events and status (e.g., status changes of controlled components, alarms and flags, and process variables/parameters) and data representing operator activities (e.g., display navigation,
alarm control, soft control, and CPS (computerized procedure system) control/navigation). The plant performance is evaluated with the logging data by the HUPESS during the test. The operators operate the plant system during the test and then complete the KSAX and NASA-TLX evaluations after the test. The SMEs observe the operators' activities during the test, checking those related to the personnel task performance, the team work, and the anthropometric/physiological factors, and complete the evaluations of the personnel task performance and the BARS based on these observations after the test. The HUPESS includes an eye tracking system (ETS) and an AV recording system: it acquires the eye tracking data from the ETS and processes them into the measures for the SA and workload evaluations, and the test is recorded by the AV system. The data observed, checked, evaluated, and recorded during the test can be further evaluated by time line analysis in an integrated way. In addition, the HUPESS provides useful functions such as various statistical analyses and convenient report generation. The HUPESS was designed to be effectively usable for the ISV in the advanced MCR of the APR-1400 through reviews by SMEs, including one process expert and two human factors experts. Consequently, the HUPESS is expected to serve as an effective tool for the ISV in the advanced MCRs of the Shin Kori 3 & 4 NPPs (APR-1400 type), which are under construction in South Korea.

V. SUMMARY AND CONCLUSIONS

The MMI design of the advanced MCR of the APR-1400 can be validated through performance-based tests to determine whether it acceptably supports safe operation of the plant. In this paper, plant performance, personnel task performance, situation awareness, workload, team work, and anthropometric/physiological factors are considered as factors for the human performance evaluation.
For the development of the measures for each of these factors, attention is paid to considerations and constraints such as the changed environment in an advanced MCR, the need for a practical and economic evaluation, and the suitability of the evaluation criteria. Measures generally used in various industries and empirically proven to be useful are adopted as the main measures with some modifications. In addition, complementary measures are developed to overcome some of the limitations associated with the main measures. The development of the measures is addressed based on theoretical and empirical background and on the regulatory guidelines for the ISV, such as NUREG-0711 and NUREG/CR-6393. Consequently, we conclude that the measures developed in this paper can be effectively used for the ISV in the advanced MCR of the APR-1400. Also, a computerized system for the human performance evaluation, called HUPESS, is briefly introduced. The HUPESS is being developed based on the measures developed and the strategies discussed in this paper, and it is expected to serve as an effective tool for the ISV in the advanced MCRs of the Shin Kori 3 & 4 NPPs (APR-1400 type), which are under construction in South Korea.

ACKNOWLEDGMENT

The authors would like to thank J. C. Ra and S. B. Jo of Korea Power Engineering Company (KOPEC) and Y. C. Shin and J. H. Kim of Korea Hydro and Nuclear Power (KHNP) for
their continuous support and valuable comments. The authors also express their gratitude to Prof. S. N. Byun of Kyunghee University, Prof. J. H. Park of Hankyong National University, and Dr. S. N. Choi of Korea Institute of Nuclear Safety (KINS) for their valuable advice, comments, and encouragement. REFERENCES [1] Functional Criteria for Emergency Response Facilities US Nuclear Regulatory Commission. Washington, DC, 1980, NUREG-0696. [2] Clarification of TMI Action Plan Requirements US Nuclear Regulatory Commission. Washington, DC, 1980, NUREG-0737. [3] J. M. O’Hara, W. S. Brown, P. M. Lewis, and J. J. Persensky, HumanSystem Interface Design Review Guidelines NUREG-0700, Rev.2, US NRC, 2002. [4] J. M. O’Hara, J. C. Higgins, J. J. Persensky, P. M. Lewis, and J. P. Bongarra, Human Factors Engineering Program Review Model 2004, NUREG-0711, Rev.2, US NRC. [5] M. Barriere, D. Bley, S. Cooper, J. Forester, A. Kolaczkowski, W. Luckas, G. Parry, A. Ramey-smith, C. Thompson, D. Whitehead, and J. Wreathall, Technical Basis and Implementation Guidelines for a Technique for Human Event Analysis (ATHEANA) 2000, Rev.01, NUREG1624, US NRC. [6] S. H. Chang, S. S. Choi, J. K. Park, G. Heo, and H. G. Kim, “Development of an advanced human-machine interface for next generation nuclear power plants,” Reliab. Eng. Syst. Safety, vol. 64, pp. 109–126, 1999. [7] I. S. Kim, “Computerized systems for on-line management of failures: A state-of-the-art discussion of alarm systems and diagnostic systems applied in the nuclear industry,” Reliab. Eng. Syst. Safety, vol. 44, pp. 279–295, 1994. [8] H. Yoshikawa, T. Nakagawa, Y. Nakatani, T. Furuta, and A. Hasegawa, “Development of an analysis support system for man-machine system design information,” Contr. Eng. Practice, vol. 5, no. 3, pp. 417–425, 1997. [9] H. Yoshikawa, “Human-machine interaction in nuclear power plants,” Nucl. Eng. Technol., vol. 37, no. 2, pp. 151–158, 2005. [10] S. J.
Cho et al., “The Evaluation of Suitability for the Design of Soft Control and Safety Console for APR1400,” Daejeon, Korea, 2003, KHNP, TR. A02NS04.S2003.EN8. [11] T. B. Sheridan, Telerobotics, Automation, and Human Supervisory Control. Cambridge, MA: MIT Press, 1992. [12] J. M. O’Hara, W. F. Stubler, J. C. Higgins, and W. S. Brown, Integrated System Validation: Methodology and Review Criteria 1997, NUREG/CR-6393, US NRC. [13] G. Andresen and A. Drøivoldsmo, Human Performance Assessment: Methods and Measures 2000, HPR-353, OECD Halden Reactor Project. [14] P. Ø. Braarud, Subjective Task Complexity in Control Room 2000, HWR-621, OECD Halden Reactor Project. [15] P. Ø. Braarud and H. Brendryen, Task Demand, Task Management, and Teamwork 2001, HWR-657, OECD Halden Reactor Project. [16] A. Drøivoldsmo et al., Continuous Measure of Situation Awareness and Workload 1998, HWR-539, OECD Halden Reactor Project. [17] M. Moracho, Plant Performance Assessment System (PPAS) for Crew Performance Evaluations. Lessons Learned from an Alarm Study Conducted in HAMMLAB 1998, HWR-504, OECD Halden Reactor Project. [18] G. Jr. Skraning, The Operator Performance Assessment System (OPAS) HWR-538, OECD Halden Reactor Project, 1998. [19] B. S. Sim et al., The Development of Human Factors Technologies: The Development of Human Factors Experimental Evaluation Techniques. Daejeon, Korea, 1996, KAERI/RR-1693. [20] T. L. Saaty, The Analytic Hierarchy Process. New York: McGraw-Hill, 1980. [21] E. Hollnagel, Cognitive Reliability and Error Analysis Method. Amsterdam, The Netherlands: Elsevier, 1998. [22] J. Kemeny, The Need for Change: The Legacy of TMI, Report of the President’s Commission on the Accident at Three Mile Island. New York: Pergamon, 1979. [23] M. J. Adams, Y. J. Tenney, and R. W. Pew, “Situation awareness and cognitive management of complex system,” Human Factors, vol. 37, no. 1, pp. 85–104, 1995. [24] F. T. Durso and S.
Gronlund, “Situation awareness,” in The Handbook of Applied Cognition, F. T. Durso, R. Nickerson, R. W. Schvaneveldt, S. T. Dumais, D. S. Lindsay, and M. T. H. Chi, Eds. New York: Wiley, 1999, pp. 284–314.
[25] M. R. Endsley and D. J. Garland, Eds., Situation Awareness: Analysis and Measurement. Mahwah, NJ: Erlbaum, 2001. [26] C. P. Gibson and A. J. Garrett, “Toward a future cockpit-the prototyping and pilot integration of the mission management aid (MMA),” presented at The Situational Awareness in Aerospace Operations, Copenhagen, Denmark, 1990, unpublished. [27] R. M. Taylor, “Situational Awareness Rating Technique (SART): The development of a tool for aircrew systems design,” presented at The Situational Awareness in Aerospace Operations, Copenhagen, Denmark, 1990, unpublished. [28] M. M. Wesler, W. P. Marshak, and M. M. Glumm, “Innovative measures of accuracy and situational awareness during landing navigation,” presented at The Human Factors and Ergonomics Society 42nd Annual Meeting, 1998, unpublished. [29] M. R. Endsley, “Toward a theory of situation awareness in dynamic systems,” Human Factors, vol. 37, no. 1, pp. 32–64, 1995. [30] D. H. Lee and H. C. Lee, “A review on measurement and applications of situation awareness for an evaluation of Korea next generation reactor operator performance,” IE Interface, vol. 13, no. 4, pp. 751–758, 2000. [31] R. E. Nisbett and T. D. Wilson, “Telling more than we can know: Verbal reports on mental processes,” Psychol. Rev., vol. 84, pp. 231–259, 1977. [32] M. R. Endsley, , M. R. Endsley and D. J. Garland, Eds., “Direct measurement of situation awareness: Validity and use of SAGAT,” in Situation Awareness Analysis and Measurement. Mahwah, NJ: Lawrence Erlbaum, 2000. [33] M. R. Endsley, , T. G. O. Brien and S. G. Charlton, Eds., “Situation awareness measurement in test and evaluation,” in Handbook of Human Factors Testing and Evaluation. Mahwah, NJ: Lawrence Erlbaum, 1996. [34] N. B. Sarter and D. D. Woods, “Situation awareness: A critical but illdefined phenomenon,” Int. J. Aviation Psychol., vol. 1, no. 1, pp. 45–57, 1991. [35] R. W. Pew, , M. R. Endsley and D. J.
Garland, Eds., “The state of situation awareness measurement: Heading toward the next century,” in Situation Awareness Analysis and Measurement. Mahwah, NJ: Lawrence Erlbaum, 2000. [36] M. R. Endsley, “A methodology for the objective measurement of situation awareness,” in Situational Awareness in Aerospace Operations, Neuilly-Sur-Seine, France, 1990, (AGARD-CP-478; pp. 1/1-1/9), NATO-AGARD. [37] M. R. Endsley, “The out-of-the-loop performance problem and level of control in automation,” Human Factors, vol. 37, no. 2, pp. 381–394, 1995. [38] S. G. Collier and K. Folleso, , D. J. Garland and M. R. Endsley, Eds., “SACRI: A measure of situation awareness for nuclear power plant control rooms,” in Experimental Analysis and Measurement of Situation Awareness. Daytona Beach, FL: Embri-Riddle Univ. Press, 1995, pp. 115–122. [39] D. N. Hogg, K. Fosllesø, F. S. Volden, and B. Torralba, “Development of a situation awareness measure to evaluate advanced alarm systems in nuclear power plant control rooms,” Ergonomics, vol. 38, no. 11, pp. 2394–2413, 1995. [40] M. L. Fracker and M. A. Vidulich, , Y. Queinnec and F. Daniellou, Eds., “Measurement of situation awareness: A brief review,” in Proc. 11th Congr. Int. Ergonomics Association Designing for Everyone, . London, U.K.: Taylor & Francis, 1991, pp. 795–797. [41] M. R. Endsley, “Measurement of situation awareness in dynamic systems,” Human Factors, vol. 37, no. 1, pp. 65–84, 1995. [42] G. F. Wilson, , M. R. Endsley and D. J. Garland, Eds., “Strategies for psychophysiological assessment of situation awareness,” in Situation Awareness Analysis and Measurement. Mahwah, NJ: Lawrence Erlbaum, 2000. [43] R. M. Taylor, “Situational awareness rating technique (SART): The development of a tool for aircrew systems design,” in Situational Awareness Aerospace Operations, Neuilly-Sur-Seine, France, 1990, pp. 3/1–3/17, AGARD-CP-478, NATO- AGARD. [44] C. D. Wickens and J. G. Hollands, Engineering Psychology and Human Performance, 3rd ed. 
Englewood Cliffs, NJ: Prentice Hall, 2000. [45] J. M. O’Hara, J. C. Higgins, W. F. Stubler, and J. Kramer, ComputerBased Procedure Systems: Technical Basis and Human Factors Review Guidance 2002, NUREG/CR-6634, US NRC. [46] M. C. Kim and P. H. Seong, “A computational model for knowledgedriven monitoring of nuclear power plant operators based on information theory,” Reliab. Eng. Syst. Safety, vol. 91, pp. 283–291, 2006. [47] J. S. Ha and P. H. Seong, “An experimental study: EEG analysis with eye fixation data during complex diagnostic tasks in nuclear power plants,” presented at the Int. Symp. Future I&C for NPPs (ISOFIC), Chungmu, Korea, 2005.
[48] C. D. Wickens, “Workload and situation awareness: An analogy of history and implications,” Insight, vol. 94, 1992. [49] N. Moray, Mental Workload: Its Theory and Measurement. New York: Plenum Press, 1979. [50] P. Hancock and N. Meshkati, in Human Mental Workload, NY, 1988, North-Holland. [51] R. D. O’Donnell and F. T. Eggemeier, “Workload assessment methodology,” in Handbook of Perception and Human Performance: Vol. II. Cognitive Processes and Performance, K. R. Boff, L. Kaufman, and J. Thomas, Eds. New York: Wiley, 1986. [52] D. A. Norman and D. G. Bobrow, “On data-limited and resource-limited process,” Cognit. Psychol., vol. 7, pp. 44–64, 1975. [53] R. Williges and W. W. Wierwille, “Behavioral measures of aircrew mental workload,” Human Factors, vol. 21, pp. 549–574, 1979. [54] S. G. Charlton, , S. G. Charlton and T. G. O. Brien, Eds., “Measurement of cognitive states in test and evaluation,” in Handbook of Human Factors Testing and Evaluation. Mahwah, NJ: Lawrence Erlbaum , 2002. [55] F. T. Eggemeier and G. F. Wilson, “Subjective and performance-based assessment of workload in multi-task environments,” in Multiple Task Performance, D. Damos, Ed. London, U.K.: Taylor & Francis, 1991. [56] S. Rubio, E. Diaz, J. Martin, and J. M. Puente, “Evaluation of subjective mental workload: A comparison of SWAT, NASA-TLX, and workload profile,” Appl. Psychol., vol. 53, pp. 61–86, 2004. [57] W. W. Wierwille, M. Rahimi, and J. G. Casali, “Evaluation of 16 measures of mental workload using a simulated flight task emphasizing mediational activity,” Human Factors, vol. 27, pp. 489–502, 1985. [58] G. Johannsen, N. Moray, R. Pew, J. Rasmussen, A. Sanders, and C. Wickens, “Final report of the experimental psychology group,” in Mental Workload: Its Theory and Measurement, N. Moray, Ed. New York: Plenum, 1979. [59] N. Moray, “Subjective mental workload,” Human Factors, vol. 24, pp. 25–40, 1982. [60] S. G. Hill, H. P. Iavecchia, J. C. Byers, A. C. Bittier, A. L. Zaklad, and R. E. 
Christ, “Comparison of Four Subjective Workload Rating Scales,” Human Factors, vol. 34, pp. 429–440, 1992. [61] B. Sterman and C. Mann, “Concepts and applications of EEG analysis in aviation performance evaluation,” Biol. Psychol., vol. 40, pp. 115–130, 1995. [62] A. F. Kramer, E. J. Sirevaag, and R. Braune, “A psychophysiological assessment of operator workload during simulated flight missions,” Human Factors, vol. 29, no. 2, pp. 145–160, 1987. [63] J. Brookings, G. F. Wilson, and C. Swain, “Psycho-physiological responses to changes in workload during simulated air traffic control,” Biol. Psychol., vol. 42, pp. 361–378, 1996. [64] K. A. Brookhuis and D. D. Waard, “The use of psychophysiology to assess driver status,” Ergonomics, vol. 36, pp. 1099–1110, 1993. [65] E. Donchin and M. G. H. Coles, “Is the P300 component a manifestation of cognitive updating?,” The Behavioral and Brain Science, vol. 11, pp. 357–427, 1988. [66] L. C. Boer and J. A. Veltman, “From workload assessment to system improvement,” presented at the The NATO Workshop on Technologies in Human Engineering Testing and Evaluation, Brussels, 1997, unpublished. [67] A. H. Roscoe, “Heart Rate Monitoring of Pilots during Steep Gradient Approaches,” Aviation, Space Environmental Med., vol. 46, pp. 1410–1415, 1975. [68] R. Rau, “Psychophysiological assessment of human reliability in a simulated complex system,” Biol. Psychol., vol. 42, pp. 287–300, 1996. [69] A. F. Kramer and T. Weber, , J. T. Cacioppo, Ed. et al., “Application of Psychophysiology to Human Factors,” in Handbook of Psychophysiology. Cambridge, U.K.: Cambridge Univ. Press, 2000, pp. 794–814. [70] P. G. A. M. Jorna, “Spectral analysis of heart rate and psychological state: A review of its validity as a workload index,” Biol. Psychol., vol. 34, pp. 237–257, 1992. [71] L. J. M. Mulder, “Measurement and analysis methods of heart rate and respiration for use in applied environments,” Biol. Psychol., vol. 34, pp. 205–236, 1992. [72] S. W. 
Porges and E. A. Byrne, “Research methods for the measurement of heart rate and respiration,” Biol. Psychol., vol. 34, pp. 93–130, 1992. [73] G. F. Wilson, “Applied use of cardiac and respiration measure: Practical considerations and precautions,” Biol. Psychol., vol. 34, pp. 163–178, 1992. [74] Y. Lin, W. J. Zhang, and L. G. Watson, “Using eye movement parameters for evaluating human-machine interface frameworks under normal control operation and fault detection situations,” Int. J. Human Computer Studies, vol. 59, pp. 837–873, 2003.
[75] J. A. Veltman and A. W. K. Gaillard, “Physiological indices of workload in a simulated flight task,” Biol. Psychol., vol. 42, pp. 323–342, 1996. [76] L. O. Bauer, R. Goldstein, and J. A. Stern, “Effects of information-processing demands on physiological response patterns,” Human Factors, vol. 29, pp. 219–234, 1987. [77] J. H. Goldberg and X. P. Kotval, “Eye movement-based evaluation of the computer interface,” in Advances in Occupational Ergonomics and Safety, S. K. Kumar, Ed. Amsterdam, The Netherlands: IOS Press, 1998. [78] C. H. Ha and P. H. Seong, “Investigation on Relationship between Information Flow Rate and Mental Workload of Accident Diagnosis Tasks in NPPs,” IEEE Trans. Nucl. Sci., vol. 53, no. 3, pp. 1450–1459, Jun. 2006. [79] , [Online]. Available: http://www.seeingmachines.com/ [80] , [Online]. Available: http://www.smarteye.se/home.html [81] R. Shively, V. Battiste, J. Matsumoto, D. Pepiton, M. Bortolussi, and S. Hart, “In flight evaluation of pilot workload measures for rotorcraft research,” in Proc. 4th Symp. Aviation Psychology, Columbus, OH, 1987, pp. 637–643. [82] V. Battiste and M. Bortolussi, “Transport pilot workload: A comparison of two subjective techniques,” in Proc. Human Factors Society 32nd Ann. Meeting, Santa Monica, CA, 1988, pp. 150–154. [83] M. Nataupsky and T. S. Abbott, “Comparison of workload measures on computer-generated primary flight displays,” in Proc Human Factors Society 31st Ann. Meeting, Santa Monica, CA, 1987, pp. 548–552. [84] P. S. Tsang and W. W. Johnson, “Cognitive demand in automation,” Aviation, Space, Experiment. Med., vol. 60, pp. 130–135, 1989. [85] A. V. Bittner, J. C. Byers, S. G. Hill, A. L. Zaklad, and R. E. Christ, “Generic workload ratings of a mobile air defense system (LOS-F-H),” in Proc. Human Factors Society 33rd Ann. Meeting, Santa Monica, CA, 1989, pp. 1476–1480. [86] S. G. Hill, J. C. Byers, A. L. Zaklad, and R. E. Christ, “Workload assessment of a mobile air defences system,” in Proc. 
Human Factors Society 32nd Ann. Meeting, Santa Monica, CA, 1988, pp. 1068–1072. [87] J. C. Byers, A. V. Bittner, S. G. Hill, A. L. Zaklad, and R. E. Christ, “Workload assessment of a remotely piloted vehicle (RPV) system,” in Proc. Human Factors Society 32nd Ann. Meeting, Santa Monica, CA, 1988, pp. 1145–1149. [88] A. Sebok, “Team Performance in Process Control: Influences of Interface Design and Staffing,” Ergonomics, vol. 43, no. 8, pp. 1210–1236, 2000. [89] S. N. Byun and S. N. Choi, “An evaluation of the operator mental workload of advanced control facilities in Korea next generation reactor,” J. Korean Inst. Indust. Eng., vol. 28, no. 2, pp. 178–186, 2002. [90] C. Plott, T. Engh, and V. Barnes, Technical Basis for Regulatory Guidance for Assessing Exemption Requests From the Nuclear Power Plant Licensed Operator Staffing Requirements Specified in 10 CFR 50.54 2004, NUREG/CR-6838, US NRC. [91] S. G. Hart and L. E. Staveland, “Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research,” in Human Mental Workload, P. A. Hancock and N. Meshkati, Eds. Amsterdam, The Netherlands: North-Holland, 1988. [92] J. A. Stern, L. C. Walrath, and R. Goldstein, “The endogenous eyeblink,” Psychophysiology, vol. 21, pp. 22–33, 1984. [93] Y. Tanaka and K. Yamaoka, “Blink activity and task difficulty,” Perceptual and Motor Skills, vol. 77, pp. 55–66, 1993. [94] J. H. Goldberg and X. P. Kotval, “Eye movement-based evaluation of the computer interface,” in Advances in Occupational Ergonomics and Safety, S. K. Kumar, Ed. Amsterdam, The Netherlands: IOS Press, 1998. [95] A. H. Bellenkes, C. D. Wickens, and A. F. Kramer, “Visual scanning and pilot expertise: the role of attentional flexibility and mental model development,” Aviation, Space, Environment. Med., vol. 68, no. 7, pp. 569–579, 1997. [96] E. M. Roth, R. J. Mumaw, and W. F. Stubler, “Human factors evaluation issues for advanced control rooms: A research agenda,” Proc. IEEE, pp. 254–265, 1993. [97] J.
M. O’Hara and R. E. Hall, “Advanced control rooms and crew performance issues: Implications for human reliability,,” IEEE Trans. Nucl. Sci., vol. 39, no. 4, pp. 919–923, Aug. 1992. [98] P. Ø. Braarud and G. Jr. Skraaning, “Insights from a benchmark integrated system validation of a modernized npp control room: Performance measurement and the comparison to the benchmark system,” in NPIC&HMIT 2006, Albuquerque, NM, Nov. 2006, pp. 12–16.