Performance Analysis on a Process Control Micro-World: An Approach to Mental Workload Assessment
M. DÍAZ and P. PONSA, Research Group on Knowledge Engineering (GREC), Technical University of Catalonia, Spain
I. DALMAU, Center of Research and Development on Ergonomics and Prevention (CEP), Technical University of Catalonia, Spain
Abstract
The aim of this paper is to explore the effect of instructions with different requirements on mental workload in a simulated process control task. The mental workload assessment procedure included two complementary measures: performance analysis and subjective perception. In an experimental laboratory study the task was carried out by a sample of 31 electronic engineering students, who were randomly assigned to three different conditions. They all performed a series of 20 trials on a simulated process control interface (micro-world) and then answered the NASA-TLX questionnaire. The subjects also answered the EPQ questionnaire in order to control for personality variability.
Introduction
Computer technology and its applications in the industrial domain keep growing and sometimes generate problems. One set of problems concerns the relationship between the user and interactive computer systems such as process control interfaces. Although computer-based workplaces tend to reduce physical workload, their cognitive requirements are more demanding. This change is most evident in control and supervision workplaces, where information exchange and decision making are the main activities. These cognitive demands result in operator mental workload. The resulting mental strain produces fatigue and, occasionally, performance alterations that eventually lead to system disturbances and accidents. Optimizing the interaction would serve to improve operator well-being and system effectiveness.

There are two main difficulties in mental workload research in control processes. First, there is no single, generally accepted procedure for cognitive workload assessment. Moreover, it is assumed that mental workload has to be assessed through several indicators, including subjective workload estimates. The second source of problems is the complexity of the behavior we are concerned with. Performance-based assessment procedures assume that mental workload has an assessable effect on execution, but this general statement is subject to important restrictions (Rubio and Díaz, 2000). Furthermore, dynamic systems control is a complex activity that can hardly be reduced to directly observable behavior (Díaz and Ponsa 2001). It seems necessary to define mental workload assessment procedures that combine the precision of laboratory experimentation with the representativeness of field research (Rasmussen, 1993). The simulated task in this experimental study, which belongs to a broader research project on work analysis (Díaz and Ponsa 2000; Díaz and Ponsa 2001), is the control of a dynamic interactive system called a micro-world.
The micro-world tends to reproduce the demands on perception, information processing and decision making that operators meet when interacting with actual control interfaces (Howe and Vicente, 1998). The state of the system during execution depends simultaneously upon the subject's behavior and upon its autonomous evolution in an object-event-action scheme (Brehmer et al, 1991). Some conditions of process control workplace design contribute especially to mental stress and its impairing effects, such as behavior alterations and fatigue. One of these conditions is the instructions given to the operator. The main aim of this paper is to explore how much different instructions demand of the operator's mental resources and how these different assignments affect mental workload. Another objective of this study is to neutralize the possible influence of personality variables. Differences in personality (according to the PEN system: psychoticism, extroversion, neuroticism) have been reported to result in different performance in perceptive, control and learning tasks (Eysenck and Eysenck, 1989).

Method
Essentially, this work consisted of analyzing the activity of 31 electronic engineering students on a simulated process control interface. Their ages ranged from 20 to 30 years and they were mostly men. Students were distributed randomly and equally among three groups, which corresponded to the three experimental conditions. They all performed a series of 20 trials on a simulated process control interface (micro-world) and then answered the NASA-TLX questionnaire. Finally they answered the EPQ-A questionnaire in order to control for personality variability.

The current version of the micro-world on which we are working is a hydraulic system with five open tanks connected by pipes of diverse diameters and controlled by on/off valves. The upper and the bottom tanks have the same capacity. In the initial state all the water is inside the uppermost tank and the other four are empty. The goal is to move the water from the uppermost tank to the bottommost one. A closed valve is shown in red and an open valve in green. The subject operates the binary valves by clicking them with the mouse.

The three different instructions were:
1. Moving water from the uppermost tank to the bottommost without overflow and as quickly as possible (Fast and carefully condition, "F&C")
2. Moving water from the top tank to the bottom tank without overflow (Carefully condition, "C")
3. Moving water from the top tank to the bottom tank as quickly as possible (Fast condition, "F")

The main hypotheses were that different requirements would result in different performance and in assessable differences in mental workload. We assumed that, in general, individuals try to meet the explicit requirements by adjusting their behavior to the goal or goals stated in the instructions. Subjects in the Fast condition would therefore be speed oriented and subjects in the Carefully condition accuracy oriented. With regard to mental workload, we expected that individuals in the Fast and carefully group would report higher mental workload. In this case conflicting goals would make subjects trade off speed against accuracy. In the micro-world, the actions that could increase speed (i.e. opening paths, letting the water fill the tanks) increase system instability and the risk of failure (overflowing). On the other hand, trying to be accurate implies a sacrifice of effectiveness.
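The trade-off between speed and overflow risk can be illustrated with a toy model. The sketch below is not the authors' simulator: the linear chain of tanks, the capacities, the flow rates and the two valve policies are all hypothetical choices made only to show how keeping every valve open shortens the task at the cost of overflow episodes.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMicroWorld:
    """Chain of open tanks joined by on/off valves (hypothetical layout)."""
    capacities: list  # tank capacities, top to bottom (arbitrary units)
    rates: list       # max flow per step through valve i (tank i -> tank i+1)
    levels: list = field(init=False)
    valves: list = field(init=False)
    overflows: int = 0
    steps: int = 0

    def __post_init__(self):
        # Initial state from the text: all the water is in the uppermost tank.
        self.levels = [self.capacities[0]] + [0] * (len(self.capacities) - 1)
        self.valves = [False] * len(self.rates)

    def step(self):
        # Process valves bottom-up so water advances one tank per step.
        for i in reversed(range(len(self.rates))):
            if not self.valves[i]:
                continue
            flow = min(self.rates[i], self.levels[i])
            self.levels[i] -= flow
            self.levels[i + 1] += flow
            if self.levels[i + 1] > self.capacities[i + 1]:
                # Excess water spills and counts as one overflow episode.
                self.levels[i + 1] = self.capacities[i + 1]
                self.overflows += 1
        self.steps += 1

    def upper_tanks_empty(self):
        return sum(self.levels[:-1]) == 0


def run(world, policy, max_steps=1000):
    """Apply a valve policy before every step until the upper tanks drain."""
    while not world.upper_tanks_empty() and world.steps < max_steps:
        world.valves = policy(world)
        world.step()
    return world


def fast_policy(world):
    # Speed oriented: every valve stays open all the time.
    return [True] * len(world.rates)


def careful_policy(world):
    # Accuracy oriented: open a valve only if the tank below has room.
    return [
        world.levels[i] > 0
        and world.levels[i + 1] + world.rates[i] <= world.capacities[i + 1]
        for i in range(len(world.rates))
    ]


def make_world():
    # Hypothetical sizes; only "five tanks, equal top and bottom" is from the text.
    return ToyMicroWorld(capacities=[100, 30, 30, 30, 100], rates=[20, 10, 10, 10])


fast = run(make_world(), fast_policy)
careful = run(make_world(), careful_policy)
print("fast:   ", fast.steps, "steps,", fast.overflows, "overflows")
print("careful:", careful.steps, "steps,", careful.overflows, "overflows")
```

In this toy setting the speed-oriented policy finishes in fewer steps but spills water, while the accuracy-oriented policy avoids overflow and takes longer, which is the kind of conflict the Fast and carefully instruction asks subjects to resolve.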
It was expected that subjects in the Fast and the Fast and carefully groups would experience more time strain and would therefore score higher on the NASA-TLX Temporal demand subscale.

Results

Instructions and performance
We plotted the distribution of the three experimental groups (F&C = *, C = +, F = o) in relation to execution. The x axis shows Total time (in seconds) and the y axis shows Overflow (number of overflowing episodes over the whole task) (Figure 1).
Figure 1: Execution distribution

As expected, the subjects in the Fast group show low values of Total time (all of them under 400 seconds) but more overflowing episodes (Table 1).
        Total time (s)     Overflow    Best trial (s)
F&C     400-410 + 451*     6-12        15.8-19.2
C       300-500 + 681*     3-16        14.8-18.3
F       335-396            5-19        15.8-20.7 + 28.7*
* Outlying cases
Table 1: Ranges of performance measures by group
The ANOVA shows that mean Total time is very close for subjects assigned to F&C (mean = 412.5 s) and C (mean = 437.6 s), and higher than for Fast subjects (mean = 362.3 s) (Table 2). This difference is not statistically significant (F = 2.74; p = 0.09). Regarding efficacy, measured here by the number of overflow episodes, subjects assigned to the Fast condition show more overflowing episodes (mean = 13.13) than the other two groups (means about 9.5). This difference is not statistically significant either (F = 1.52; p = 0.24) (Table 2). To sum up, subjects assigned to the Fast condition tend, as expected, toward a performance characterized by speed and poor accuracy, although the differences do not reach significance. Subjects assigned to F&C and to C differ neither in speed nor in accuracy.
          F&C                 C                   F
          Mean      Sd        Mean      Sd        Mean      Sd
Tt        412.50    19.13     437.56    102.34    362.32    23.34
Overf     9.83      2.14      9.44      4.13      13.13     6.20
Best T    412.50    19.13     437.56    102.34    362.32    23.34
Table 2: Summary of performance measures descriptives (Tt = Total time, Overf = Overflow, Best T = Best trial)
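A one-way ANOVA F statistic can be recovered from group means, standard deviations and group sizes alone. The following minimal sketch applies that computation to the Total time row of Table 2. The group sizes are not given next to the table, so the values in `ns` are an assumption chosen for illustration, and the standard deviations are assumed to be sample standard deviations (n - 1 denominator).

```python
def anova_from_summary(means, sds, ns):
    """One-way ANOVA F statistic from per-group means, SDs and sizes."""
    n_total = sum(ns)
    k = len(means)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_total
    # Between-groups sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # Within-groups sum of squares, rebuilt from the sample SDs.
    ss_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Total time row of Table 2, in the order F&C, C, F.
means = [412.5, 437.56, 362.32]
sds = [19.13, 102.34, 23.34]
# ASSUMPTION: illustrative group sizes, not stated with Table 2.
ns = [6, 9, 8]

f_stat, df1, df2 = anova_from_summary(means, sds, ns)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

Under these assumed group sizes the statistic falls close to the value reported in the text for Total time; with other sizes it would differ.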
We also studied the variable Best trial, which measures the time in seconds of the quickest execution of a single trial without overflowing. We explored two questions: at which moment in the series of trials this best execution appears, and to what extent instructions affect this variable.
F&C   Subject       6      5      2      8      7      1
      Best trial    15.8   17     18.1   18.3   18.3   19.2
      Order         18     17     15     6      9      17

C     Subject       11     12     13     19     14     17     16     18
      Best trial    14.8   16.1   16.9   16.7   17.1   17.2   18.2   18.3
      Order         19     18     15     18     20     6      17     10

F     Subject       26     24     28     21     29     30     25     27
      Best trial    15.8   16.1   16.8   17.2   17.4   18.4   20.7   28.7
      Order         14     13     19     6      16     4      4      3
Table 3: Ratings on Best trial for groups
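The Order row of Table 3 can be summarized with a few lines of code. This sketch simply averages the trial number at which each subject's best trial occurred and counts how many best trials fall within the first six trials; the variable names are ours.

```python
from statistics import mean

# "Order" row of Table 3: trial number (1 to 20) of each subject's best trial.
order = {
    "F&C": [18, 17, 15, 6, 9, 17],
    "C": [19, 18, 15, 18, 20, 6, 17, 10],
    "F": [14, 13, 19, 6, 16, 4, 4, 3],
}

# Mean position of the best trial within the series, per group.
mean_order = {group: mean(values) for group, values in order.items()}

# Share of all listed subjects whose best trial came within the first six trials.
all_orders = [o for values in order.values() for o in values]
early_share = sum(o <= 6 for o in all_orders) / len(all_orders)

for group, value in mean_order.items():
    print(f"{group}: mean Order = {value:.2f}")
print(f"best trial within first 6 trials: {early_share:.0%}")
```

The group means obtained this way agree, up to the last decimal shown, with the mean Order values quoted for the three groups.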
Regarding the moment at which the best trial appears, best trials occur mostly in the last trials (more than 50% are grouped in the last five trials), which we may consider a clear and foreseeable manifestation of a learning effect. More remarkable are the best trials placed at the beginning of the activity (more than 25% up to the 6th trial). The Fast group shows an earlier appearance of the best trial (mean Order = 9.8) than the Fast and carefully (mean = 13.6) and Carefully (mean = 15.3) groups (Table 3). While it is quite simple to decide which is the best trial and to rank the best trials, it is more difficult to decide who is the most efficient subject. The subjects who performed the quickest best trials are not consistently the same as those who performed best on the task taken as a whole, using global measures such as Total time or Overflow that refer to the 20 trials.

Instructions and mental workload
Overall scores are remarkably similar in the three groups (F&C = 11.4; C = 11.5; F = 11.6). Nevertheless, the profiles of mental demand across the four subscales selected are quite different, especially in the interaction dimensions (effort and frustration) (Figure 2).
Figure 2: TLX mental workload profiles of the different instruction conditions (x axis: Instructions, 1 = F&C, 2 = C, 3 = F; y axis: Mean, 0 to 70; series: OS, MEN, TEMP, EFF, FRUS)
In mental demand and temporal demand there are no significant differences between the groups assigned to the three instructions (F = 0.51, p = 0.60 and F = 0.42, p = 0.66 respectively), probably because of the shortness of the task. On the other hand, instructions seem to affect frustration (F = 4.95, p = 0.01) and, marginally, effort (F = 3.06, p = 0.06). The condition that shows the lowest level of effort is Carefully. In frustration, the Fast group shows the highest average scores (Table 4).
          F&C                 C                   F
          Mean      Sd        Mean      Sd        Mean      Sd
Os        11.43     2.54      11.55     2.91      11.62     3.22
Men       52.50     24.12     57.44     24.18     45.72     28.67
Temp      53.20     22.70     58.66     31.29     47.27     28.40
Eff       33.20     16.15     15.11     13.96     25.72     17.19
Frus      3.40      5.23      6.88      11.03     25.63     26.41
Table 4: Summary of NASA ratings (weighted scores) descriptives (Os = Overall score, Men = Mental demand, Temp = Temporal demand, Eff = Effort, Frus = Frustration)
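For readers unfamiliar with weighted NASA-TLX scores, the sketch below shows the standard weighting procedure: each of the six subscales is rated from 0 to 100, each receives a weight equal to the number of times it was chosen in the 15 pairwise comparisons, and the overall score is the weighted mean. The exact scoring variant used for Table 4 is not described in the text, so this is an illustration of the general technique, and the ratings and weights are hypothetical.

```python
SUBSCALES = (
    "mental", "physical", "temporal", "performance", "effort", "frustration",
)


def tlx_weighted(ratings, weights):
    """Return (weighted subscale scores, overall workload).

    ratings: 0-100 rating per subscale.
    weights: number of times each subscale was picked in the
             15 pairwise comparisons (0-5 each, summing to 15).
    """
    if sum(weights[s] for s in SUBSCALES) != 15:
        raise ValueError("weights must sum to 15")
    weighted = {s: ratings[s] * weights[s] / 15 for s in SUBSCALES}
    overall = sum(weighted.values())
    return weighted, overall


# Hypothetical subject, for illustration only.
ratings = {"mental": 70, "physical": 10, "temporal": 60,
           "performance": 40, "effort": 50, "frustration": 30}
weights = {"mental": 5, "physical": 0, "temporal": 4,
           "performance": 2, "effort": 3, "frustration": 1}

weighted, overall = tlx_weighted(ratings, weights)
print(f"overall workload = {overall:.1f}")
```

Because the weights differ between subjects, two people giving the same raw ratings can end up with different weighted subscale scores, which is one reason the subscale profiles can diverge while overall scores stay similar.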
Mental workload and performance
In the correlation analyses, Best trial is the performance measure that correlates most strongly with mental workload scores. The relationship is, as expected, negative: the faster the best execution, the higher the workload reported in the overall score (r = -.584), mental demand (r = -.448), temporal demand (r = -.423) and effort (r = -.310). The relationship with frustration, on the other hand, is positive (r = .527). The other performance measure related to the workload dimensions is Overflow. Again the relationship is, as expected, negative for the overall score (r = -.336), mental demand (r = -.385), temporal demand (r = -.216) and effort (r = -.274), and positive for frustration (r = .331): the more overflowing episodes, the higher the frustration ratings (Table 5). We can say that subjects who managed to obtain a low level of failures in the overall task and short times in their best trial experience higher mental workload, especially on the mental demand subscale, than those who obtained poorer results.
              Os       Men      Temp     Eff      Frus
Tt            .067     .294     .065     -.073    -.301
Overf         -.336    -.385    -.216    -.274    .331
Best Trial    -.584    -.448    -.423    -.310    .258
Table 5: Pearson correlation coefficients between NASA ratings and performance measures
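Each coefficient in Table 5 is a Pearson correlation between one performance measure and one NASA-TLX score across subjects. A minimal sketch of that computation follows; the per-subject values are hypothetical, since the paper does not list individual NASA-TLX scores.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


# HYPOTHETICAL data: best-trial time (s) and overall TLX score per subject.
best_trial = [15.8, 16.1, 17.2, 18.3, 20.7]
overall_tlx = [14.0, 13.5, 11.0, 10.5, 9.0]

r = pearson_r(best_trial, overall_tlx)
print(f"r = {r:.3f}")
```

A negative coefficient, as in this made-up sample, corresponds to the pattern described in the text: shorter best-trial times go together with higher reported workload.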
Instructions and personality
In this case the results show that the groups were homogeneous; there were no significant differences in the distribution of the students among the groups. Another notable finding was a correlation between the neuroticism dimension and the global NASA-TLX score (r = -.33).

Conclusions
To sum up, instructions seem to affect performance and to discriminate between the requirement for speed on the one hand and the requirements for speed and accuracy, or for accuracy alone, on the other. Subjects confronted only with time pressure obtain the best results in time and poorer results in accuracy. Subjects required to behave fast and accurately do not differ from the group with the accuracy demand, which seems spontaneously committed to an unstated time goal. With regard to mental workload, instructions affect effort and frustration and affect neither the overall mental workload index nor the mental demand and temporal demand subscales. On the other hand, it is probable that the particular task design and/or the time feedback provided after every trial counteracted the foreseeable instruction effect and homogenized performance between groups. Thus the three instruction conditions, even though different in their explicit content, are interpreted similarly in the task context. In view of these results, we think that further experimental designs in the micro-world should stress the differences between instructions and avoid the homogenizing effect of temporal feedback. With regard to the particularly scarce differences between groups in NASA-TLX scores, we point to the limitations of this scale for capturing the load in such a brief task. In the current version of the micro-world on which we are working, the main component of mental workload is the complexity of the decision making processes, and fatigue has only a slight influence.
In relation to performance analyses, we have not found a single satisfactory measure that integrates successful execution on one trial and on the whole task. Probably a wide set of different measures should be explored in order to select them according to their relevance to particular purposes (learning skills, concentration capability, resistance to fatigue, reliability, reaction time, etc.). Another complementary line of explanation for these results is that the speed or accuracy orientation induced by instructions would become more visible in a qualitative analysis focused on activity rather than on results such as execution time or failures. The operational mode approach may prove more pertinent for capturing differences in complex activities like dynamic systems control processes (Díaz and Ponsa 2000; Díaz and Ponsa 2001).
References
Brehmer, B., Leplat, J. & Rasmussen, J. (1991) Use of Simulation in the Study of Complex Decision Making. In: Brehmer, B., Leplat, J. & Rasmussen, J. (eds.) Distributed Decision Making: Cognitive Models for Cooperative Work. Chichester: Wiley & Sons.
Díaz, M. & Ponsa, P. (2000) Risk estimation on dynamic system control in simulated environment. In: Tomás, E., Remeseriro, C. & Fernández, J.A. (eds.) Work and Organizational Psychology and Human Resources: new approaches. Madrid, Spain: Biblioteca Nueva.
Díaz, M. & Ponsa, P. (2001) Mental Workload Assessment from Performance Analysis on a Simulated Process Control System. In: Proceedings of the International Conference on Computer-Aided Ergonomics and Safety (CAES'01), Maui, Hawaii, 29th July-3rd August. In press.
Eysenck, H.J. & Eysenck, S.B.G. (1989) EPQ. Cuestionario de personalidad para niños y adultos. Madrid: TEA Ediciones S.A.
Hockey, G. K., Briner, R. B., Tattersall, A. J. & Wiethoff, M. (1989) Assessing the impact of computer workload on operator stress: the role of system controllability. Ergonomics, 34 (11) 1401-1418.
Howe, D. E. & Vicente, K. J. (1998) Measures of operator performance in complex, dynamic microworlds: advancing the state of the art. Ergonomics, 41 (4) 485-500.
Jorgensen, A. H., Garde, A. H., Laursen, B. & Jensen, B. R. (1999) Applying the concept of mental workload to IT-work. In: Straker, L. & Pollock, C. (eds.) CybErg 1999.
Rasmussen, J. (1993) Analysis of Tasks, Activities and Work in the Field and in Laboratories. Le Travail Humain, tome 56, nº 2-3/1993, 133-155.
Rasmussen, J., Pejtersen, A. M. & Goodstein, L.P. (1994) Cognitive Systems Engineering. Chichester: Wiley & Sons.
Vicente, K. J. (1999) Cognitive work analysis. London: Lawrence Erlbaum Associates.