Subjective Fidelity Assessment of Rotorcraft Training Simulators
Emma Timson, Philip Perfect, Mark White*, Gareth D. Padfield
School of Engineering, University of Liverpool
Robert Erdos, Arthur W. Gubbels
Flight Research Laboratory, National Research Council, Canada
Abstract
This paper describes the potential applications of a Simulation Fidelity Rating (SFR) scale to support the subjective assessment of rotorcraft training simulators. The SFR scale has been used to examine pilot sensitivity, and subsequent task strategy adaptation, to simulator transport delay variation in the Precision Hover manoeuvre. The current FAA AC 120-63 and JAR-FSTD H simulator qualification standards require a transport delay of no more than 100ms to meet the most stringent Level D criteria. In the current study, the majority of the evaluation pilots indicated that transport delays of this order would compromise the utility of the simulator to provide training for the Precision Hover manoeuvre. For an additional 100ms transport delay, the pilots' SFRs ranged from Level 1 fidelity (simulation fit for purpose) to Level 3 fidelity (simulation not fit for purpose). The factors influencing the SFRs awarded and their impact on the application of the SFR scale are discussed in this paper. The methodology detailed in this paper has potential for more objective quantification of the utility of rotorcraft training simulators.
NOTATION
A/s       Attack points per second (1/s)
ATD       Additional Transport Delay (ms)
NA        Number of attack points (nd)
T         Total time (s)
XA        Pilot lateral cyclic control (in)
ΔΦ        Phase change (nd)
η         Control deflection (in)
η̄         Mean control deflection (%)
η̇pk       Peak lateral cyclic attack rate
φ         Roll attitude (°)
σn, σN    Cumulative, total RMS (nd)
θ         Pitch attitude (°)
τp        Phase delay (sec)
ωbw       Bandwidth frequency (rad/sec)
ACRONYMS
ADS       Aeronautical Design Standard
AG        Action Group
ASRA      Advanced Systems Research Aircraft
COF       Cut Off Frequency
FAA       Federal Aviation Administration (US)
FFS       Full Flight Simulator
FSTD      Flight Simulation Training Device
GARTEUR   Group for Aeronautical Research and Technology in Europe
HQR       Handling Qualities Rating
ICAO      International Civil Aviation Organisation
IWG       International Working Group
JAA       Joint Aviation Authorities
JAR       Joint Aviation Requirements
MTE       Mission Task Element
NASA      National Aeronautics and Space Administration (US)
NRC       National Research Council (Canada)
PH        Precision Hover
RAeS      Royal Aeronautical Society (UK)
RCAH      Rate Command Attitude Hold
RMS       Root Mean Squared
SFR       Simulation Fidelity Rating
STD       Synthetic Training Device
ToT       Transfer of Training
UoL       University of Liverpool
VMS       Vertical Motion Simulator

*Contact: [email protected]
INTRODUCTION
The use of piloted flight simulation for training as a substitute for real world flying reduces costs, increases safety and allows the operator to control external parameters such as environmental conditions and operational situations. Whilst Zero Flight Time Training is prevalent in the fixed-wing community, this is not the case in the rotary-wing community. However, some training providers are beginning to offer Zero Flight Time Training for rotorcraft. For example, HeliSim offers 100%
conversion training on the EC225 Full Flight Simulator [1]. To ensure the increased adoption of such tools for rotary-wing pilot training, well-defined assessment of the fidelity of rotary-wing simulators is essential. In the current Joint Aviation Authorities (JAA) and Federal Aviation Administration (FAA) rotary-wing simulator qualification standards [2], [3], simulator fidelity is assessed on a component level and, following a subjective evaluation, the simulator is qualified at a certain level of utility for training. These criteria may lead to the impression that a simulator qualified to the highest standard (Level D) is suitable for all training [4]. However, experienced pilots find instances where the simulator cueing or flight model is not representative of the actual flight experience, even in Level D simulators. Conversely, for some tasks where it is assumed Level D is required, the same training value may be achieved using a lower order device and simpler models. It is therefore suggested that a method for determining fitness for purpose on a task-by-task basis is necessary to ensure training resources are being economically and efficiently utilised.
A critical examination of the rotorcraft simulation fidelity standards in Europe, JAR-FSTD H [2], by GARTEUR HC-AG-12 reported that many of the rotary-wing simulator tolerances have been derived from their fixed-wing counterparts, JAR-FSTD A [6], and that the engineering science base behind these standards is lacking [7]. The GARTEUR AG-12 work also highlighted the need for the evaluation of overall fidelity of the integrated system of pilot and machine [7]. A further recommendation was that supporting data and analysis techniques are required to verify that adhering to the current criteria guarantees that a simulator is of sufficient quality for the required purpose.
In particular, the report recommends that the handling qualities metrics in ADS-33E-PRF [8], be utilised as fidelity metrics, to complement the current JAR-FSTD H metrics. It could be argued that differences in the assigned ADS-33E-PRF handling qualities, denoted by Handling Qualities Ratings (HQRs) between flight and simulation, suggest the simulator would not be appropriate for training because the performance attained and/or the compensation required is not the same in the simulator and flight. Although disparate HQRs might suggest the simulator is unfit for purpose, a match of HQRs between simulation and flight is not considered sufficient for fitness for purpose [9]. Previous research at NASA Ames [10] with a rate response type helicopter with Level 1 ADS-33E-PRF Handling Qualities [8] investigated the effect of varying simulator transport delay on
HQRs. Transport delay is defined as "the total Synthetic Training Device (STD) system processing time required for an input signal from a primary pilot flight control until motion system, visual system or instrument response. It does not include the characteristic delay of the helicopter to be simulated." [2]. The results showed that additional transport delays of 83ms (not including the simulator's inherent visual transport delay of 10ms and motion transport delay of 90ms) resulted in degradation of the average HQRs from Level 1 to Level 2 in the Vertical Motion Simulator for seven tasks. An additional 200ms resulted in an average Level 2 rating and a 383ms time delay resulted in consistent Level 3 HQRs being awarded for the same tasks. This would suggest that a simulator with an additional 83ms transport delay (above the characteristic delay of the baseline) would lead to compromised training utility. However, for an FSTD to be qualified to Level C or D the transport delay must be 100 ms or less after control movement, and for Levels A and B it must be 150 ms or less after control movement [2]. Therefore this simulation would still be considered acceptable by the Level D criteria. The focus of the current paper is to further investigate the effect of transport delay on simulation fidelity and the validity of the current JAR-FSTD H tolerance.
JAR-FSTD H [2] and FAA AC 120-63 [3] are two of a number of rotary-wing simulation standards set by various national authorities. This has led to the requirement for simulator operators with customers from multiple countries to qualify their simulators to several different standards. In response to this, a Royal Aeronautical Society (RAeS) led International Working Group (IWG) was initiated with the aim of consolidating the current qualification standards, including JAR-FSTD H and FAA AC 120-63, into a single comprehensive standard, reducing the current 27 simulator types to 7.
From this work, two new manuals for training simulator qualifications have been developed: ICAO 9625 ed. 3 Vol. I [11] for fixed-wing and ICAO 9625 ed. 3 Vol. II [12] for rotary-wing simulators. The ICAO documents address the need for a training-focused fidelity assessment of synthetic training devices and the need to assess fidelity on a task-by-task basis: "The process outcome defines levels of fidelity of simulation features required to support the training tasks associated with existing pilot licensing, qualification, rating or training types" [12]. A rotary-wing flight simulation evaluation handbook is to follow its fixed-wing counterpart [13] in the near future to aid the utilisation of ICAO 9625 ed. 3 Vol. II. However, it is beyond the scope of the IWG work to gather supporting data or to address the validity of
the standards. It is proposed that the methods described within this paper could be used alongside the ICAO documents for task-by-task fidelity assessment. The following two sections of this paper describe the current approaches to assessing simulator fidelity and a proposed new approach to subjective fidelity assessment. The facilities used in the study and the methodology are then briefly described. Results from the trials into the effects of transport delay on pilot perceived fidelity are then presented. This is followed by a discussion of the research to date, along with some conclusions and recommendations for further work.
MEASURING THE UTILITY OF SIMULATION FOR PILOT TRAINING
The current study proposes an approach for measuring the fidelity of the overall piloted simulation, as suggested by GARTEUR HC AG-12. Traditional metrics for quantifying fidelity are focused on the physical and functional similarity of the simulator systems to the real aircraft or "the degree to which a flight simulator matches the characteristics of the real airplane" [14]. This is often referred to as "component fidelity". Inherent limitations in any ground-based simulator will limit the realism of systems with even the highest level of component fidelity, undermining the correlation between this concept of fidelity and the associated utility of the simulator. An additional definition, "overall fidelity", has been proposed in previous work and is more concerned with the behaviour of the pilot in the simulator [15]. This overall fidelity can be defined as "the simulator's ability to induce the pilot trainee to output those behaviours known to be essential to control and operation of the actual aircraft in performance of a specific task" [16].

There are many suggested methods for measuring pilot behaviour, cognitive and/or physical. One of these is to measure the cut-off frequency [17]. Cut-off frequency is related to the crossover frequency [18] and can be defined as the frequency at which 50% of the cumulative power of the (frequency content of the) control input signal is reached, or 70.7% of the root mean square [17]:

$$f = \mathrm{COF} \quad \text{when} \quad \frac{\sigma_n}{\sigma_N} = 0.707 \qquad \text{(eq. 1)}$$

The cut-off frequency metric has been used in a number of flight vs. simulation comparisons and has shown good correlation with subjective pilot opinion of the fidelity of the simulation [19-21]. This paper will further investigate the correlations between subjective results and the cut-off frequency.

Previous work at the University of Liverpool (UoL) [15] has utilised another quantitative, time domain metric of pilot activity for fidelity assessment: the control attack [22]. This is defined as the ratio of the peak rate of change of a control deflection, $\dot{\eta}_{pk}$, to the magnitude of the control deflection, $\eta$, or:

$$\mathrm{Att} = \frac{\dot{\eta}_{pk}}{\eta} \qquad \text{(eq. 2)}$$

The attack metric is calculated for each discrete control input from the time histories of the control inputs as shown in Figure 1.

Figure 1 - Extraction of discrete control deflection for XA (axes: XA [inch] against Time [s])

There are a number of metrics derived from the values of control attack that then describe the control activity in a single number. These are defined as:
i. Attack per second, A/s - The average number of discrete control inputs the pilot makes per second.

$$A/s = \frac{N_A}{T} \qquad \text{(eq. 3)}$$

ii. Attack Rate, $\bar{\dot{\eta}}_{pk}$ - A measure of the rapidity of control movements.
iii. Mean Control Deflection, $\bar{\eta}$ - The average amplitude of individual control deflections.

$$\bar{\eta} = \frac{\eta_1 + \eta_2 + \eta_3 + \eta_4 + \dots + \eta_{N_A}}{N_A} \qquad \text{(eq. 4)}$$
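As a minimal sketch of how eq. 2 to eq. 4 might be evaluated, the following Python assumes that the discrete control inputs have already been extracted from the time history (as in Figure 1) and are available as pairs of peak rate and deflection. The `AttackPoint` type, the `attack_summary` function and the sample values are hypothetical and are not taken from the study; the attack rate is taken here as the mean of the peak rates, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class AttackPoint:
    """One discrete control input (hypothetical container)."""
    peak_rate: float   # peak rate of change of the control, e.g. in/s
    deflection: float  # magnitude of the control deflection, e.g. in

    @property
    def attack(self) -> float:
        # eq. 2: ratio of peak rate to deflection magnitude
        return self.peak_rate / self.deflection


def attack_summary(points, total_time_s):
    """Single-number control activity metrics for one run."""
    if not points or total_time_s <= 0:
        raise ValueError("need at least one attack point and a positive duration")
    n_a = len(points)
    return {
        # eq. 3: number of attack points divided by total time
        "attack_per_second": n_a / total_time_s,
        # assumed here to be the mean of the peak rates
        "mean_attack_rate": sum(abs(p.peak_rate) for p in points) / n_a,
        # eq. 4: mean amplitude of the individual deflections
        "mean_deflection": sum(abs(p.deflection) for p in points) / n_a,
    }


# Illustrative values only
points = [AttackPoint(0.2, 0.05), AttackPoint(0.4, 0.10), AttackPoint(0.3, 0.15)]
summary = attack_summary(points, total_time_s=6.0)
print(summary)
```

The deflections are kept in whatever units the inputs use; converting the mean deflection to a percentage of control travel would need the full travel of the inceptor, which is not given here.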
Although quantitative methods for assessing simulator fidelity are important, the ultimate determination of the utility of a simulator for training must be carried out by a pilot. This is noted in JAR-FSTD H; "Whereas functional tests verify the acceptability of the simulated helicopter systems and their integration, subjective tests verify the fitness of the FSTD in relation to training, checking and testing tasks" [2].
The current requirement for subjective assessment of rotorcraft simulators, Paragraph 3.2.4 of Section 2 of JAR-FSTD H, states: "When evaluating functions and subjective tests, the fidelity of simulation required for the highest level of qualification should be very close to the helicopter. However, for the lower levels of qualification the degree of fidelity may be reduced in accordance with the criteria contained (in the document)". Satisfying this requirement is left open to interpretation by the operator and qualifying body. This has led to a lack of structure and standardisation throughout the industry, with a multitude of different rating scales and subjective methods for evaluating simulator fidelity being used. A number of these rating scales, and their relative merits, are detailed in [23]. What is required for robust subjective assessment of simulation utility is a standardised, prescriptive subjective rating scale for the pilot's assessment of fidelity as part of the overall qualification process.

THE SIMULATION FIDELITY RATING (SFR) SCALE
A Simulation Fidelity Rating (SFR) Scale [23] has been developed with the intention that it could be used to complement current quantitative evaluation methods such as those in the ICAO 9625 documents [11]. The current research highlights the use of the SFR scale for rotary-wing training simulator qualification. However, it is proposed that the SFR scale could be used for qualification of simulators of all aircraft types and indeed other training simulators, such as medical simulations. If internationally accepted, such a scale would provide structure and standardisation to subjective testing. The scale has been designed to evaluate simulator utility in terms of fitness for purpose on a task-by-task basis. Using this approach, any level of device can be assessed to determine a "training envelope": a comprehensive determination of tasks for which that particular device can be used.
This complements the approach taken by the International Working Group (IWG) of the Royal Aeronautical Society (RAeS) with the ICAO 9625 documents. The SFR scale has been developed at the University of Liverpool (UoL)/NRC Ottawa, through consultation with test pilots and a series of simulation trials [23] [24]. The scale is shown in Figure 3.
The SFR scale structure is based on the Cooper-Harper Handling Qualities Rating (HQR) scale [8]. This format was selected because it mitigates ambiguity through a prescriptive decision-making process. It is also proposed that the simulator evaluations are conducted by test pilots, similar to handling qualities evaluations. A similar structure to the HQR scale is appealing as it is familiar to most test pilots. Akin to the HQR scale, the output from the SFR scale is the assignment of a rating which lies within 'levels'. In the case of the SFR scale, these levels define the training utility based on the simulator's perceived deficiencies, with Level 1 denoting the highest utility of the simulation for a given purpose (see Table 1). The individual ratings then represent the subjective judgement by the pilot on the extent of their adaptation of task strategy (negligible, minimal, moderate, considerable or excessive) and comparative task performance between the simulator and flight (equivalent, similar or dissimilar). The task strategy is not only the physical activity of the pilot on the controls but includes aspects such as control shaping, and the perception and utilisation of task cues. The comparative task performance is based on desired and adequate performance criteria [8]. It should be emphasized that, unlike the HQR scale, the SFR scale is not inherently ordinal; each SFR rating represents a combination of the task strategy adaptation and comparative performance descriptors. Further details of the development of the SFR scale are reported in [23]. If unsure, pilots are provided with performance feedback from the engineers so that an informed decision on comparative performance can be made. However, perception of task strategy adaptation is inherently more subjective. A questionnaire has been developed to complement the SFR scale and to aid the pilot in communicating the perceived strategy adaptation.
The same descriptors are used as in the SFR scale and the format allows the pilot to consider each axis individually. Because the SFR scale does not identify what the simulator deficiencies are, the questionnaire can be used to ascertain which aspects of the simulation underpin the rating. The primary use of the SFR scale is to gather fidelity ratings from pilots for subjective assessment of the simulator. However, it is also envisaged that the SFR scale could be used to help derive acceptance tolerances between flight and simulation for individual parameters and for the integrated system. Plotting quantitative metrics [9] as a function of SFR allows boundaries to be set for acceptable limits for specific tasks. For this approach to be successful,
a large number of test pilots would be required and the proposed metric would have to be validated. The practical implications of both applications and lessons learnt from this study are detailed in the discussion section of the paper.

Table 1 - Definitions of Fidelity Levels
Level 1 | Simulation training alone is sufficient to train the pilot to operational standards of performance. There is complete ToT from the simulator to the aircraft in this task.
Level 2 | Additional training in the aircraft would be required in order to achieve operational standards of performance in the aircraft. There is limited positive ToT from the simulator to the aircraft in this task.
Level 3 | Exposure to the simulator will confer inappropriate or dangerous behaviours to the pilot (negative ToT occurs), meaning that the simulator is not suitable for training to fly the aircraft in this task.
HELIFLIGHT-R
The simulator trials were conducted using the UoL's reconfigurable full motion simulator, HELIFLIGHT-R [25], shown in Figure 2. The simulated aircraft for this work is the National Research Council of Canada (NRC) Flight Research Laboratory's Bell 412 Advanced Systems Research Aircraft (ASRA) [26]. The Bell 412 ASRA is a fly-by-wire (FBW) aircraft which can be operated with various control laws and various stability characteristics, allowing it to be used as an in-flight simulator. In this paper, the Rate Command Attitude Hold (RCAH) flight control law was used. The flight models have been developed for real-time operation in HELIFLIGHT-R using Advanced Rotorcraft Technology's multi-body flight dynamics modelling environment, FLIGHTLAB [27].

Figure 2 - HELIFLIGHT-R Full Flight Simulator

PILOTED SIMULATOR FIDELITY TRIALS
The aim of the piloted trials reported herein was to determine the effect of transport delay variations on overall fidelity and fitness for purpose of a simulator for training the flying skills required for a Precision Hover (PH) manoeuvre. The effects of transport delay were isolated by comparing a baseline simulation (F-B412 RCAH) to a number of modified simulations where only the transport delay was varied. Each pilot was allowed up to four practice runs at the PH manoeuvre in the baseline configuration to ensure that a consistent technique had evolved. An HQR for the baseline configuration was then awarded by the pilot. The pilot was then informed that a 'modification had been made to the simulation' but was not told the nature of the change (whether to the flight model, motion/visual system, control feel etc.), and the manoeuvre was re-flown. The SFR was obtained after the first attempt at flying the modified simulation, as it was felt that the pilot's greatest sensitivity to variations in simulator deficiencies will be upon first exposure to a specific model/vehicle. The pilot was then asked to complete three more runs in the modified configuration to allow for adaptation of task strategy prior to awarding an HQR for the modified simulation. As well as subjective ratings, data from the simulation trials were analysed to compare performance and pilot strategy between the simulations.
The PH manoeuvre is a limited agility ADS-33E-PRF [8] mission task element (MTE) that involves transition to hover and hover maintenance phases. The course is shown in Figure 4, and the performance criteria are detailed in Table 2. The manoeuvre is initiated from the hover. From here, the aircraft translates at 45° relative to the heading of the rotorcraft at a ground speed of between 6 and 10 knots, at an altitude of less than 20 feet. The ground track should be such that the rotorcraft will arrive over the target hover point. The hover should be captured in one smooth manoeuvre following the initiation of deceleration: "it is not acceptable to accomplish most of the deceleration well before the hover point and then to 'creep up' to the final position" [8].
Figure 3 - The Simulator Fidelity Rating (SFR) Scale
The relationship between the heights of the pole and the hover board is such that, when over the target hover point and aligned with both the marker on the pole and the hover board, the rotorcraft will be at the reference height of 10 feet.
Table 2 - Precision Hover Performance Criteria
Criteria | Desired Perf. | Adequate Perf.
Attain stabilised hover within X seconds of initiation of deceleration | 5 | 8
Maintain a stabilised hover for at least X seconds | 30 | 30
Maintain the longitudinal and lateral position within ±X feet on the ground | 3 | 6
Maintain altitude within ±X feet | 2 | 4
Maintain heading within ±X° | 5 | 10
There shall be no objectionable oscillations during the transition to, or during the hover | Applies | Does not apply

Figure 4 - Precision Hover Course

PREDICTED FIDELITY ANALYSIS
The inherent transport delay in the HELIFLIGHT-R simulator is approximately 100ms. The values for the Additional Transport Delays (ATDs) in the model were ATD=50ms, ATD=100ms, ATD=200ms and ATD=300ms. This led to Total Transport Delays (TTD) of 100ms (baseline), 150ms, 200ms, 300ms and 400ms. For the purpose of this paper, ATD = TTD - 100 (ms).

The transport delay of a vehicle directly impacts the system bandwidth and phase delay. Bandwidth, ωbw, is a stability measure that defines the range of control input frequencies over which a pilot can apply closed-loop control without threatening the stability of the aircraft. The phase delay, τp, is a measure of the slope of the phase response beyond 180° phase, and is defined as

$$\tau_p = \frac{\Delta\Phi_{2\omega_{180}}}{57.3 \times 2\omega_{180}} \qquad \text{(eq. 5)}$$

where ΔΦ2ω180 is the phase change between ω180 and 2ω180 [9]. The bandwidths and phase delays of the various transport delay models were calculated using frequency sweeps in FLIGHTLAB. The pitch and roll results are shown against the ADS-33E-PRF tolerances for 'all other' MTEs (Figure 5 and Figure 6). The Precision Hover manoeuvre is a limited agility MTE and therefore the predicted fidelity is determined from the tolerances for 'all other' MTEs. Figure 5 and Figure 6 show that the baseline and ATD=100ms cases lie well within Level 1. The ATD=200ms case is borderline Level 1/Level 2 and the ATD=300ms case lies within Level 2 for both roll and pitch bandwidth. As per the earlier discussion, the change in HQ level for the ATD≥200ms cases would suggest a fidelity deficiency. It may also be the case that cumulative deficiencies in roll and pitch could lead to poorer assigned handling qualities [29].

As stated earlier, JAR-FSTD H defines the transport delay as "the total Synthetic Training Device (STD) system processing time required for an input signal from a primary pilot flight control until motion system, visual system or instrument response. It does not include the characteristic delay of the helicopter to be simulated." [2]. In the case of comparing simulations to a baseline simulation, the "helicopter to be simulated" is the baseline simulation and so the characteristic time delay of the helicopter to be simulated is 100ms in this case. The JAR tolerance on transport delay is therefore concerned with the additional transport delay rather than the total transport delay. It would therefore be predicted that the pilots will perceive a degradation in training utility for ATD≥100ms.
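The bookkeeping above can be illustrated with a short Python sketch. It evaluates ATD = TTD - 100 (ms) and eq. 5, and then shows what a pure time delay element alone would contribute to the phase delay if ω180 were held fixed. That last step is a simplifying assumption made here for illustration: in practice ω180 itself shifts as delay is added, which is why the paper relies on frequency sweeps. The function names and the value of ω180 are hypothetical.

```python
DEG_PER_RAD = 57.3  # rounded conversion factor, as used in eq. 5


def additional_transport_delay(ttd_ms, baseline_ms=100.0):
    """ATD = TTD - baseline, with the 100 ms baseline from the text."""
    return ttd_ms - baseline_ms


def phase_delay(delta_phi_deg, omega_180):
    """eq. 5: phase delay (s) from the phase change (deg) between
    omega_180 and 2*omega_180 (rad/s)."""
    return delta_phi_deg / (DEG_PER_RAD * 2.0 * omega_180)


def pure_delay_phase_change(delay_s, omega_180):
    """Extra phase lag (deg) that a pure delay adds between omega_180
    and 2*omega_180; a pure delay lags by DEG_PER_RAD * omega * delay."""
    return DEG_PER_RAD * delay_s * (2.0 * omega_180 - omega_180)


# Illustrative only: omega_180 = 4 rad/s is an assumed value
omega_180 = 4.0
for ttd_ms in (100, 150, 200, 300, 400):
    atd_ms = additional_transport_delay(ttd_ms)
    extra_tau_p = phase_delay(pure_delay_phase_change(atd_ms / 1000.0, omega_180), omega_180)
    print(f"TTD={ttd_ms} ms  ATD={atd_ms:.0f} ms  delay-only phase delay increment={extra_tau_p:.3f} s")
```

Under this fixed-ω180 assumption the increment is half the added delay regardless of the value chosen for ω180, since the ω180 terms cancel in eq. 5.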
Figure 5 - Pitch Hover Bandwidth, All Other MTEs

Figure 6 - Roll Hover Bandwidth, All Other MTEs

SUBJECTIVE RESULTS
The mean, maximum and minimum HQRs and SFRs awarded by the test pilots for varying levels of Additional Transport Delay (ATD) are shown in Figure 7 and Figure 8 below. The dashed lines in Figure 7 and Figure 8 represent the boundaries between the various levels of handling qualities and simulation fidelity, defined by the respective rating scales. As noted earlier, the SFR scale is not an ordinal scale and each rating represents a combination of task strategy adaptation and comparative performance. Therefore there is limited value in the average rating, but these serve to show the overall rating trend. The individual pilot ratings can be found in Table 3 and Table 4. As expected, HQRs and SFRs both increase as the transport delay increases. However, large spreads are seen from one pilot to another for both SFRs and HQRs.

Handling Qualities Ratings

Figure 7 - Precision Hover HQRs

Table 3 - Precision Hover HQRs
Additional Time Delay | Pilot A | Pilot B | Pilot C | Pilot D | Pilot E | Pilot F | Pilot G
0ms   | 2 | 3 | 3 | 3 | 3 | 4 | 4
50ms  | x | x | x | x | 4 | 4 | 4
100ms | 2 | 5 | 4 | 6 | 5 | 4 | 4
200ms | 4 | 5 | x | x | 6 | 6 | 7
300ms | 5 | x | 6 | 7 | x | x | 9
x = Not tested

Simulation Fidelity Ratings

Figure 8 - Precision Hover SFRs

Table 4 - Precision Hover SFRs
Additional Time Delay | Pilot A | Pilot B | Pilot C | Pilot D | Pilot E | Pilot F | Pilot G
50ms  | x | x | x | x | 2 | 2 | 4
100ms | 1 | 3 | 8 | 5 | 2 | 4 | 7
200ms | 4 | 9 | x | x | 9 | 9 | 7
300ms | 8 | x | 9 | 8 | x | x | 10
x = Not tested
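The mean, maximum and minimum ratings plotted in Figure 7 and Figure 8 can be recomputed from the individual ratings with a sketch such as the one below. The rating values are transcribed from Table 3 and Table 4 (pilot order A to G, `None` for "not tested"); the function names are hypothetical, and the Level 1 threshold of SFR 2 is an assumption inferred from the discussion of the results rather than a value stated in the tables.

```python
from statistics import mean

# Ratings per additional transport delay (ms), pilots A..G; None = not tested
HQR = {
    0:   [2, 3, 3, 3, 3, 4, 4],
    50:  [None, None, None, None, 4, 4, 4],
    100: [2, 5, 4, 6, 5, 4, 4],
    200: [4, 5, None, None, 6, 6, 7],
    300: [5, None, 6, 7, None, None, 9],
}
SFR = {
    50:  [None, None, None, None, 2, 2, 4],
    100: [1, 3, 8, 5, 2, 4, 7],
    200: [4, 9, None, None, 9, 9, 7],
    300: [8, None, 9, 8, None, None, 10],
}


def summarise(ratings_by_atd):
    """Mean, min and max of the tested ratings at each delay."""
    summary = {}
    for atd_ms, ratings in ratings_by_atd.items():
        tested = [r for r in ratings if r is not None]
        summary[atd_ms] = {
            "n": len(tested),
            "mean": mean(tested),
            "min": min(tested),
            "max": max(tested),
        }
    return summary


def count_at_or_below(ratings, threshold):
    """Number of tested ratings no worse than the threshold."""
    return sum(1 for r in ratings if r is not None and r <= threshold)


for name, table in (("HQR", HQR), ("SFR", SFR)):
    for atd_ms, stats in summarise(table).items():
        print(f"{name} ATD={atd_ms} ms: n={stats['n']} mean={stats['mean']:.2f} "
              f"min={stats['min']} max={stats['max']}")

# Assumed Level 1 boundary: SFR <= 2
level_1_at_100ms = count_at_or_below(SFR[100], 2)
```

Because the SFR scale is not ordinal, the mean is only a trend indicator here, as noted in the text; the minimum and maximum carry the more meaningful information about pilot-to-pilot spread.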
Pilot Comments From Questionnaire
The key points drawn from the simulation fidelity questionnaires are given in Table 5.

Table 5 - 100ms additional transport delay pilot comments
Pilot | ATD [ms] | Comments
A | 100 | No perceived differences
A | 200 | Low pass controls
A | 300 | Decel more difficult, longer to stabilise, lots of input filtering.
B | 100 | Aggression on the capture only, only point where fidelity is an issue. (SFR 3)
B | 200 | Struggled in the capture, response is unpredictable
C | 100 | The capture was more difficult, tried to back out of the loop for the hover. Generally more aggressive. Considerable adaptation in lateral and longitudinal cyclic and pedals. (SFR 8)
C | 300 | Unrepresentative performance driven by time to steady state. Hunting for attitudes and backing out of the loop in pitch and roll.
D | 100 | The lateral positioning during the stabilisation phase was degraded. A lot of left stick required (SFR 5)
D | 300 | Ridiculous time to stable, this and station keeping difficulty drove SFR, "would be scared if I was a student"
E | 50 | Something slightly different in lateral and longitudinal cyclic.
E | 100 | Slightly busier in lateral and longitudinal cyclic. (SFR 2)
E | 200 | Had to damp down inputs, loss of performance in all areas
F | 50 | Something different but unsure what
F | 100 | Lateral cyclic inputs were magnified by about 2. (SFR 4)
F | 200 | Couldn't be aggressive, lateral cyclic seemed the most different, filter applied to
G | 50 | A bit more activity in pitch and roll noticed, performance slightly degraded
G | 100 | The lateral cyclic was easily excited - had to back out. Adaptation in multiple axes. (SFR 7)
G | 200 | Reduced aggression, longitudinal station keeping most difficult, backing out of PIOs
G | 300 | Horrendous time delay noticed
DISCUSSION
The SFRs in Table 4 show that only 2 out of 7 pilots rated the ATD=100ms simulation in Level 1 Fidelity for the Precision Hover task. A simulation with this much additional transport delay compared to the helicopter to be simulated (in this case the baseline simulation) would be on the boundary of compliance with the JAR-FSTD H Level D transport delay criteria [2], which allow for zero flight time training in the simulator. The worst SFRs were in the Level 3 fidelity region (simulation not fit for purpose). This supports the findings of the GARTEUR AG-12 work that meeting the quantitative criteria does not necessarily guarantee a simulation that is fit for purpose. It is therefore proposed that the validity of this tolerance should be investigated. Although only the Precision Hover task has been investigated in the paper, it is very likely that the maximum allowable transport delay would be dependent on the task to be trained. For example, tracking tasks are much more bandwidth sensitive than more open-loop style manoeuvring.
There is significant variance observed in the HQRs from one pilot to another. A spread of two HQR points for an MTE is generally accepted as a natural variation amongst a group of pilots, but much larger
spreads are seen in this study. Figure 7 shows that Pilot A's HQRs agree with the predicted HQs in Figure 5 and Figure 6, whereas the HQRs from Pilot F and Pilot G are significantly higher. It is suggested that the spread in HQRs may be due to some pilots flying at a higher gain than required by the task to increase precision, inadvertently turning the task into a tracking task. This in turn would increase their workload and therefore their HQRs. Note that the ADS-33E-PRF bandwidth requirements are much more strict for target tracking and acquisition tasks (see Figure 9) and the assigned ratings from Pilot F and Pilot G are more in line with these predicted HQs.
Figure 9 - Roll Hover Bandwidth, Target Tracking and Acquisition

It is expected that an increase in the attack metrics would reflect an increase in workload, e.g. a higher attack rate suggests the pilot is using more rapid inputs. Figure 10 displays the various attack parameters from the 100ms additional time delay test point for all seven pilots and the corresponding awarded SFRs. This supports the suggestion, as pilots using a higher attack rate award higher HQRs for the same test point.

There is an even larger spread observed for the SFRs (Figure 8). The spread of SFRs impacts the conclusions that can be drawn in terms of proposing a tolerance on transport delays for the PH and similar manoeuvres. However, this does not mean that any ratings should be disregarded as anomalous. In the light of a rating spread, it is best practice to inspect each rating on a case-by-case basis [28]. There are several possible explanations for the better ratings from Pilot A. Either he used a sufficiently different control strategy that he was able to cope with the deficiencies better, or his interpretation of the briefing was different. The latter can lead to:
• differing interpretation of task performance requirements
• different level of selected task aggression
The pilots may have a different experience due to:
• variations in piloting technique
• variations in the environmental disturbances (controlled in this case)
• different levels of training/experience (inevitable)
Figure 10 - The Effect of Cyclic Control Activity in the Hover Maintenance Phase on HQRs for 100ms additional transport delay

The piloting technique taught to test pilots for flying with transport delay is to back out of the control loop (reduce frequency and magnitude of inputs) to avoid pilot induced oscillations. The pilot comments shown in Table 5 highlight that the majority of pilots noted backing out of the lateral cyclic control loop as the most prominent aspect of their task strategy adaptation for the 100ms additional transport delay case. It would be expected that a reduction in lateral cyclic (XA) cut-off frequency would reflect this adaptation. Figure 11 shows the lateral cyclic cut-off frequency for the 100ms additional transport delay case (as a
percentage of the baseline value) against the level of adaptation awarded by each of the pilots via the SFR scale. There does appear to be a trend between the extent to which the pilot backs out of the control loop and the perceived adaptation for most pilots. This suggests that their SFRs, although variable, are valid. This correlation also strengthens the proposal for cut-off frequency as a quantitative metric for assessment of overall fidelity [17]. While pilots consistently noted backing out of the lateral control loop, this was not the case for the longitudinal cyclic and a trend in reducing longitudinal cut-off frequency was not seen. This suggests the precision hover strategy is dominated by the lateral cyclic strategy rather than the longitudinal. This may be due to the cueing environment as the pilot is more strongly cued in the lateral axis than he is in the longitudinal axis for the PH task.
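The cut-off frequency comparison described above can be illustrated with a minimal sketch. It assumes the half-power definition, in which the cut-off frequency is the frequency at which the cumulative RMS σn first reaches 0.707 of the total RMS σN; the exact processing used in the trials is not given here. The function names, the naive DFT and the synthetic two-tone "control" signals are all hypothetical and are not trial data.

```python
import cmath
import math


def cutoff_frequency(signal, dt, ratio=0.707):
    """Frequency (rad/s) at which the cumulative RMS of the mean-removed
    signal first reaches `ratio` times its total RMS (assumed definition)."""
    n = len(signal)
    if n < 4:
        raise ValueError("signal too short")
    mean = sum(signal) / n
    x = [s - mean for s in signal]
    power = []
    for k in range(1, n // 2 + 1):
        # Naive O(n^2) DFT coefficient for bin k (adequate for short records).
        coeff = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        # One-sided spectrum: every bin except an even-length Nyquist bin is doubled.
        weight = 1.0 if (n % 2 == 0 and k == n // 2) else 2.0
        power.append(weight * abs(coeff) ** 2)
    total = sum(power)
    if total == 0.0:
        raise ValueError("signal has no variation")
    cumulative = 0.0
    for k, p in enumerate(power, start=1):
        cumulative += p
        if math.sqrt(cumulative / total) >= ratio:
            return 2.0 * math.pi * k / (n * dt)
    return 2.0 * math.pi * (n // 2) / (n * dt)


def percent_of_baseline(test_signal, baseline_signal, dt):
    """Cut-off frequency of the test case as a percentage of the baseline."""
    return 100.0 * cutoff_frequency(test_signal, dt) / cutoff_frequency(baseline_signal, dt)


def two_tone(low_amp, high_amp, n=200, dt=0.05):
    """Hypothetical control trace: 0.5 Hz and 2 Hz sine components."""
    return [
        low_amp * math.sin(2.0 * math.pi * 0.5 * i * dt)
        + high_amp * math.sin(2.0 * math.pi * 2.0 * i * dt)
        for i in range(n)
    ]


if __name__ == "__main__":
    baseline = two_tone(0.5, 1.0)    # mostly high-frequency activity
    backed_out = two_tone(1.0, 0.5)  # pilot has backed out of the loop
    print(f"{percent_of_baseline(backed_out, baseline, 0.05):.0f}% of baseline")
```

In this synthetic case the "backed out" trace carries most of its power at the lower frequency, so its cut-off frequency falls to roughly a quarter of the baseline value, which is the kind of reduction that would be plotted on the horizontal axis of a figure such as Figure 11.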
Figure 11 - Relationship between reduction of lateral cyclic cut-off frequency and adaptation awarded (ATD=100ms only)
Despite the general trend, there is one clear anomaly observed in Figure 11. The negligible adaptation awarded by Pilot A correlates to a much more marked reduction in cut-off frequency than the trend would suggest. Pilot A may not have correctly perceived his level of adaptation in this case. Pilot A is a test pilot who regularly flies the Bell 412 ASRA in the Precision Hover manoeuvre. He also often flies the aircraft with high levels of transport delay to instruct trainee test pilots on how to identify and adapt their control strategy to deal with this type of vehicle deficiency. For this reason it is suggested that through such experience, Pilot A has become highly proficient at the task and has evolved a strategy for flying the PH that inherently compensates for transport delay. Pilot A noted that "the hover task was undemanding and so could be executed with a low level of control activity. By retaining that strategy the effect of time delay did not become evident, but the strategy was not adapted in response to the delay". This may account for his good HQRs and SFRs compared to other pilots and it may well be that his experience of flying with transport delay is not typical of most pilots.
With Pilot A's ratings removed, a significant spread still remains in the SFRs observed for the 100ms time delay case (SFR 2 to SFR 8). These variations in pilot sensitivity may be related to the fundamental piloting strategy and task aggression used by the pilots. The Precision Hover manoeuvre requires the pilot to transition to the hover point with a ground speed between 6 and 10 kts and decelerate in a single smooth motion. If a pilot decelerates to the hover more slowly and/or uses a two stage deceleration then the task aggression is reduced, thereby reducing the extent to which the aircraft dynamics, and hence the effects of the transport delay, are excited. Similarly, during the hover maintenance, the pilot is required to maintain lateral and vertical position within ±3ft. In any of the RCAH variants, the attitude hold augmentation allows the pilot to maintain accurate plan position with very little workload in the absence of environmental disturbances. However, if the pilot engages with the controls, trying to further increase positional accuracy (beyond what is required by the task definition) then the effect of the transport delay will become more apparent. Note that in these cases, the pilot has not executed the task as required, suggesting poor task definition or insufficient briefing.
To determine whether the variation in pilot selected aggression was responsible for the spread in SFRs, the peak ground speed during the transition was measured for each 100ms additional transport delay run. The results are plotted against the SFR awarded in Figure 12.
It is clear that the test point that resulted in an SFR 4 (from Pilot F) included a transition that was much faster than required by the task (Table 2). However, the highest and lowest SFRs awarded came from runs with lower speeds, suggesting that there is no strong correlation between selected transition ground speed and SFR awarded. The variations may also have been due to the piloting strategy and task aggression used during the hover maintenance phase of the manoeuvre, as was the case for the HQRs. Figure 13 displays the various attack parameters from the 100ms additional time delay test point for all seven pilots and the corresponding awarded SFRs. There is a positive correlation between the size and rapidity of the cyclic control inputs and the SFR awarded. The equivalent plots for 50ms, 200ms and 300ms can be seen in Appendix A and show the same correlations, but with a reduced range of SFRs awarded. This correlation supports the hypothesis that, in the presence of transport delay, a pilot using larger, more rapid inputs generally awards a higher (poorer) SFR due to increased excitation of unfavourable oscillations.
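To make the attack parameters concrete, the following is a minimal sketch of how they might be extracted from a sampled control trace. It assumes that each discrete control movement (a run of samples over which the control rate keeps the same sign) counts as one attack point if its peak rate exceeds a threshold. The threshold, the function name and the dictionary keys are hypothetical, and the definition used in [22] may differ in detail.

```python
def attack_metrics(eta, dt, rate_threshold=0.1):
    """Attack metrics for a control trace.

    eta: control position samples (e.g. % of full travel)
    dt: sample period (s)
    rate_threshold: assumed minimum peak rate for a movement to count
    """
    if len(eta) < 2:
        raise ValueError("need at least two samples")
    # Finite-difference control rate between consecutive samples.
    rates = [(eta[i + 1] - eta[i]) / dt for i in range(len(eta) - 1)]
    attacks = []  # (peak rate, net deflection) for each discrete movement
    start = 0
    for i in range(1, len(rates) + 1):
        # A movement ends when the rate changes sign, stops, or the record ends.
        ended = i == len(rates) or rates[i] * rates[start] <= 0.0
        if ended:
            peak = max(abs(r) for r in rates[start:i])
            if peak >= rate_threshold:
                attacks.append((peak, abs(eta[i] - eta[start])))
            start = i
    total_time = dt * (len(eta) - 1)
    count = len(attacks)
    return {
        "number_of_attacks": count,
        "attacks_per_second": count / total_time,
        "mean_peak_rate": sum(p for p, _ in attacks) / count if count else 0.0,
        "mean_deflection": sum(d for _, d in attacks) / count if count else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical trace: one input to the right, a pause, one input back.
    trace = [0.0, 0.0, 1.0, 2.0, 2.0, 2.0, 1.0, 0.0, 0.0]
    print(attack_metrics(trace, dt=0.5))
```

Under these assumptions, a pilot who stays tightly in the loop would show more attack points per second, higher peak rates and larger mean deflections than one who has backed out, which is the comparison plotted against the SFRs in Figure 13.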
Figure 12 - Effect of Ground Speed on SFR (100ms) [plot annotation: faster than the 10kts ground speed limit]
Again, there appears to be one anomaly in Figure 13. The pilot that awarded an SFR 8 appears to be using smaller, less rapid inputs than would be expected by the trend. The SFR 8 was awarded by Pilot C and was the poorest rating awarded for 100ms of additional time delay. SFR 8 denotes "considerable or less" adaptation with "similar performance not attainable", as per the SFR scale (Figure 3).
Figure 13 - The Effect of Cyclic Control Activity in the Hover Maintenance Phase on SFRs for 100ms additional transport delay.
Pilot C awarded a HQR=3 for the baseline simulation and a HQR=4 for the 100ms additional time delay, indicating desired performance was achieved in both cases. The guidelines for the SFR scale would declare this as equivalent performance achieved [23]: "Equivalent performance: The same level of task performance (desired, adequate) is achieved for all defined parameters in simulator and flight." Figure 14 shows the performance attained in the baseline simulation and 100ms additional time delay case. Pilot C achieved adequate performance in both the baseline and 100 ms cases due to altitude and longitudinal position errors. The lateral position and heading were desired in both simulations and therefore meet the criteria for equivalent performance. It appears that Pilot C did not correctly perceive his (comparative) performance and a change in this awarded level would lead to a change in fidelity level awarded (SFR 8 to SFR 6). Consequently, this SFR should be rejected as an anomalous result. It is important that all results are inspected, not just the apparently anomalous results. Appendix B includes the performance plots for the remaining six pilots. They suggest that pilots F and G may have also been lenient with their comparative performance (Figures B5 and B6), although not to the extent that they should be deemed as anomalous.
Figure 14 - Pilot C 100ms Time Delay PH Comparative Performance
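The equivalent performance check applied to Pilot C above can be sketched as a simple comparison of performance levels, parameter by parameter. This is an illustration only: the tolerance values and error figures below are hypothetical placeholders, not the task standards or the measured data from the trial, and the function names are invented for this sketch.

```python
def performance_level(max_error, desired_tol, adequate_tol):
    """Classify the largest excursion of one parameter."""
    if max_error <= desired_tol:
        return "desired"
    if max_error <= adequate_tol:
        return "adequate"
    return "beyond adequate"


def equivalent_performance(reference_errors, test_errors, tolerances):
    """Return (is_equivalent, levels): equivalent only if every parameter
    reaches the same performance level in the reference and test cases."""
    levels = {}
    for name, (desired_tol, adequate_tol) in tolerances.items():
        levels[name] = (
            performance_level(reference_errors[name], desired_tol, adequate_tol),
            performance_level(test_errors[name], desired_tol, adequate_tol),
        )
    return all(ref == test for ref, test in levels.values()), levels


if __name__ == "__main__":
    # Placeholder (desired, adequate) tolerances; not the trial's task standards.
    tolerances = {
        "altitude": (2.0, 4.0),
        "longitudinal": (3.0, 6.0),
        "lateral": (3.0, 6.0),
        "heading": (5.0, 10.0),
    }
    # Illustrative maximum excursions for a baseline run and a delayed run.
    baseline = {"altitude": 3.1, "longitudinal": 4.2, "lateral": 1.5, "heading": 2.0}
    delayed = {"altitude": 3.6, "longitudinal": 5.0, "lateral": 2.4, "heading": 3.5}
    print(equivalent_performance(baseline, delayed, tolerances))
```

With these placeholder values, altitude and longitudinal position are adequate in both runs and lateral position and heading are desired in both, so the check reports equivalent performance, mirroring the pattern described for Pilot C.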
After analysis of the pilot ratings, a number of questions relating to the industrial application of the SFR scale are posed:
1. What type of assessment pilot is required?
2. What additional material and training is required?
3. How will the training tasks be defined?
4. How many evaluation pilots are required?
5. What would be the formal process for the detection and rejection of anomalous ratings?
6. How will this methodology fit in with current evaluation methods?
While these questions have not yet been fully answered, the following comments aim to prompt further discussion and research.
Questions 1&2: The evaluation pilot's ability to assess their performance and task strategy is imperative for meaningful ratings. It is proposed that the pilot should also be properly trained in the use of the HQR and SFR scales. It is suggested that test pilots are the most suitable for this fidelity assessment method. However, the evaluation pilot would ideally have recently flown the aircraft to be simulated in the specific tasks to be assessed. This requirement may ultimately compromise the number of appropriate pilots, as it may be expected that experienced test pilots with familiarity on type on the specific tasks to be assessed may be in short supply. A further complication to piloted assessment is what may be termed "vested interest". Any pilot with considerable previous experience on type, who has "invested" effort in the development of type-specific compensation strategies may be ill-disposed toward any vehicle that doesn't require the skills he has to offer. Conversely, pilots associated with the training establishment who are biased to successful qualification of the simulation may overlook simulation deficiencies to ensure this outcome.
Question 3: As stated in the SFR scale section of this paper, the assessment is to be conducted on a task-by-task basis for a specific purpose. This requires very clear definition of the intended purpose of the simulation and of the task to be assessed. If a different task is flown to that intended, the SFRs will not reflect the simulation's utility for training that task. Similarly, if the intended purpose of the simulation is misinterpreted then the SFR may be compromised. The large scatter in the results shown in this paper suggests that the current methodology for fidelity assessment using the SFR scale may require further refinement. The development of guidelines is the focus of ongoing work to address this. These guidelines will include notes on task definition and intended purpose definition as well as anomaly investigation and rejection of SFRs.
Questions 4&5: In industrial practice, a decision would have to be made in light of case by case analysis of the SFRs. It is clear that it would not be appropriate to side with the most flattering SFR, although it is possible this may be tempting when pushing for simulator qualification. Due to the non-ordinal nature of the SFR scale, it is also suggested that the average SFR would not be meaningful. It is proposed that the poorest SFR should be taken after the removal of anomalous ratings. The justification for this reasoning is that if all pilots are to be trained using the simulation, it only takes one pilot having a different experience in the aircraft than in the simulation to lead to the use of an inappropriate strategy, and possibly task failure or even worse. This is in line with the findings of Hoh for HQRs [28]. If this approach is taken for the current study, the defining SFRs would be as shown in Table 6 (with Pilot C's ATD=100ms anomaly removed). These SFRs indicate that a simulation with a 50ms Additional Transport Delay (ATD) would have some utility for training the skills required for the Precision Hover manoeuvre in the baseline simulation. ATD≥100ms would render the simulation not fit for purpose as negative transfer of training may occur.
Table 6 - Worst Awarded SFRs
Additional Transport Delay [ms]    50    100    200    300
SFR                                4     7      9      10
Whilst using the worst rating is a robust approach for simulator evaluation, this may become impractical for manufacturers and customers pushing for simulator qualification. If one pilot gives a very poor rating compared to the majority of pilots, one can imagine an urge to disregard the poor rating as anomalous. A systematic approach to anomaly detection using quantitative metrics and pilot comments, as outlined in this paper, should therefore be further developed and implemented in any practical use of the SFR scale.
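The proposed decision rule (take the poorest SFR once justified anomalies have been removed) can be sketched as follows. The banding of SFRs into fidelity levels used here (SFR 1-2 Level 1, 3-6 Level 2, 7-9 Level 3, 10 Level 4) is an assumption based on the SFR scale of [23] and should be checked against that reference. The pilot labels and ratings are hypothetical and are not the ratings from this study.

```python
def fidelity_level(sfr):
    """Map an SFR (1-10) to a fidelity level, using the assumed banding."""
    if not 1 <= sfr <= 10:
        raise ValueError("SFR must be between 1 and 10")
    if sfr <= 2:
        return 1
    if sfr <= 6:
        return 2
    if sfr <= 9:
        return 3
    return 4


def defining_sfr(ratings, anomalous=()):
    """Poorest (numerically highest) SFR after removing ratings that have
    been rejected as anomalous following case by case investigation."""
    kept = {pilot: sfr for pilot, sfr in ratings.items() if pilot not in anomalous}
    if not kept:
        raise ValueError("no ratings left after anomaly removal")
    return max(kept.values())


if __name__ == "__main__":
    # Hypothetical ratings for one test point.
    ratings = {"P1": 2, "P2": 4, "P3": 8, "P4": 7, "P5": 5}
    worst = defining_sfr(ratings, anomalous={"P3"})
    print(worst, fidelity_level(worst))
```

The maximum is used rather than the mean because, as argued above, an averaged SFR would not be meaningful. The `anomalous` argument is deliberately explicit so that a rating can only be excluded by a recorded decision, not silently.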
CONCLUDING REMARKS
This paper has reported an assessment methodology for subjective simulation fidelity. It is proposed that such a method could be used to complement the current simulator fidelity qualification standards as well as provide evidence and justification for quantitative fidelity tolerances. The paper has detailed simulator trials where the effect of transport delay on the training utility for a Precision Hover (PH) manoeuvre has been investigated using the HELIFLIGHT-R facility at UoL. Initial results from piloted trials have been presented to highlight influencing factors, emerging trends and the relationship between quantitative metrics and subjective ratings given by evaluation pilots.
For flight simulators, a transport delay of 100 ms is acceptable for certification to Level D, which allows the training of all manoeuvres. However, this study has shown that ATD=100ms (TTD=200ms in this case) would compromise transfer of training to the baseline simulation (TTD=100ms) for the Precision Hover manoeuvre.
The methodology for using the SFR scale to determine fidelity tolerances has been described. Pilot briefing and task definition are critical for this methodology. The way in which pilots perceive their task adaptation has a significant effect on the SFRs awarded. This can be the difference between a rating that deems the simulator fit for purpose or not.
It has been found that pilots that use higher frequency (higher gain) strategies tend to be more sensitive to transport delays in the Precision Hover manoeuvre. The control attack rate and mean control deflection metrics have reflected this and have shown good correlation with the SFRs and HQRs. This supports the proposal for attack metrics as measures of pilot compensation.
A need for care when analysing and interpreting subjective fidelity ratings has been highlighted. Suspected anomalous test points must be analysed on a case by case basis and ratings must not be disregarded without justification.
An increase in simulation transport delay causes most pilots to back out of the control loop. This change in strategy can be seen quantitatively as a percentage reduction in lateral cyclic cut-off frequency which correlates well with the extent of adaptation reported, showing that cut-off frequency is sensitive to bandwidth deficiencies. As a result, cut-off frequency is proposed as a metric for quantitative fidelity assessment. This quantitative metric can also be used to investigate anomalous SFRs in light of a large spread for a given test point.
ACKNOWLEDGEMENTS
The research reported in this paper has been part-funded by the UK EPSRC through grant EP/G002932/1 and the US Army International Technology Center (UK) (reference W911NF-11-10002). The authors would like to thank all of the test pilots involved in the work so far, without whom this work would not be possible, especially Andrew Berryman and Stephan Carignan for their continuing support in this work. The contributions of test pilots Lt Cdr Lee Evans (Royal Navy), Lt Cdr John Holder (Royal Navy), Lt Christopher Knowles (Royal Navy) and Flt Lt Russ Cripps (Royal Air Force) of the UK Rotary Wing Test and Evaluation Squadron are also gratefully acknowledged. The use of the Bell 412 ASRA facility is gratefully acknowledged, along with the support of the Flight Research Laboratory at the NRC, Ottawa. The authors would also like to thank all who attended the 'Simulation Fidelity Workshop' (http://www.flightlab.liv.ac.uk/fidelity) at the 67th AHS forum in May 2011; comments and suggestions made at this event have contributed greatly to the research effort.
REFERENCES
1. http://www.flightglobal.com/news/articles/civilsimulators-special-ec225-simulator-flight-teststartling-realism-325614/
2. Anon., 'JAR-FSTD H, Helicopter Flight Simulation Training Devices', Joint Aviation Authority, 2008.
3. Anon., 'AC 120-63, Helicopter Simulation Qualification', Federal Aviation Authority, 1994.
4. Rolfe, J.M. and Caro, P.W., 'Determining the training effectiveness of flight simulators: Some basic issues and practical developments', Applied Ergonomics, Vol. 13, No. 4, 1982, pp. 243-250.
5. Advani, S., 'International Committee for Aviation Training in Extended Envelopes Draft Master Plan', Royal Aeronautical Society, July 2009.
6. Anon., 'JAR-FSTD-A - Aeroplane Flight Simulation Training Devices', Joint Aviation Authority, 2008.
7. Padfield, G.D., Casolaro, D., Hamers, M., Pavel, M., Roth, G. and Taghizad, A., 'Validation Criteria for Helicopter Real-Time Simulation Models: Sketches from the Work of GARTEUR HC-AG12', European Rotorcraft Forum, 2004.
8. Anon., 'ADS-33E-PRF, Handling Qualities Requirements for Military Rotorcraft', U.S. Army, 2000.
9. White, M. D., Perfect, P., Padfield, G., 'Progress in the Development of Unified Fidelity Metrics for Rotorcraft Flight Simulators', 66th American Helicopter Society Forum, Phoenix, Arizona, US, 11-13 May 2010.
10. Mitchell, D. G., Hoh, R. H., Atencio, A. Jr., 'Ground Based Simulation Evaluation of the Effects of Time Delays and Motion on Rotorcraft Handling Qualities', Aeroflightdynamics Directorate Report ADA256 921, 1992.
11. Anon., 'Manual of Criteria for the Qualification of Flight Simulation Devices, Volume I – Aeroplanes', International Civil Aviation Organization, ICAO 9625 Third Edition, 2009.
12. Anon., 'Manual of Criteria for the Qualification of Flight Simulation Devices, Volume II – Helicopter', International Civil Aviation Organization, ICAO 9625 Third Edition - Draft, 2011.
13. Anon., 'Aeroplane Flight Simulation Training Device Evaluation Handbook', International Civil Aviation Organization, ICAO 9625 Third Edition - Draft, 2011.
14. Rehmann, A.J., 'A Handbook for Flight Simulation Fidelity Requirements for Human Factors Research', DOT/FAA/CT-TN95/46, 1995.
15. Perfect, P. et al, 'Integrating Predicted and Perceived Fidelity for Flight Simulators', Proceedings of the 36th European Rotorcraft Forum, Paris, France, September 2010.
16. Heffley, R. K., et al, 'Determination of Motion and Visual System Requirements for Flight Training Simulators', Systems Technology Inc., Technical Report No. 546, August 1981.
17. Tischler, M. B., Remple, R. K., 'Aircraft and Rotorcraft System Identification - Engineering Methods with Flight Test Examples', American Institute of Aeronautics and Astronautics Education Series, 2006, p182.
18. McRuer, D., Krendel, E., 'Mathematical Models of Human Pilot Behaviour', Systems Technology Inc., January 1974.
19. Atencio, A., Jr., 'Fidelity Assessment of a UH-60A Simulation on the NASA Ames Vertical Motion Simulator', NASA TM 104016, USAATCOM TR 93-A-005, 1993.
20. Blanken, C. L., and Pausder, H. J., 'Investigation of the Effects of Bandwidth and Time Delay on Helicopter Roll Axis Handling Qualities', Journal of the American Helicopter Society, Vol. 39, No. 3, 1994, pp. 24-33.
21. Lusardi, J., Blanken, C., and Tischler, M. B., 'Piloted Evaluation of a UH-60 Mixer Equivalent Turbulence Simulation Model', Proceedings of the 49th American Helicopter Society Forum, May 2003.
22. Padfield, G.D., Charlton, M.T., Jones, J.P., Bradley, R., 'Where does the workload go when pilots attack manoeuvres?', 20th European Rotorcraft Forum, Amsterdam, The Netherlands, Sept 1994.
23. Perfect, P., Timson, E., White, M. D., Padfield, G., 'A Rating Scale for Subjective Assessment of Simulator Fidelity', 37th European Rotorcraft Forum, Milan, Italy, 13-15th May, 2011.
24. Timson, E., Perfect, P., White, M.D., Padfield, G.D. and Erdos, R., 'Pilot Sensitivity to Flight Model Dynamics in Flight Simulation', Proceedings from the 37th European Rotorcraft Forum, Gallarate, Italy, September 2011.
25. White, M. D., Perfect, P., Padfield, G.D., Gubbels, A. W. and Berryman, A. C., 'Acceptance testing of a rotorcraft flight simulator for research and teaching: the importance of unified metrics', Proceedings of the 35th European Rotorcraft Forum, Hamburg, Germany, September 2009.
26. Gubbels, A.W., Ellis, D.K., 'NRC Bell 412 ASRA FBW Systems Description in ATA100 Format', Institute for Aerospace Research, National Research Council Canada, Report LTR-FR-163, 2000.
27. DuVal, R.W., 'A Real-Time Multi-Body Dynamics Architecture for Rotorcraft Simulation', The Challenge of Realistic Rotorcraft Simulation, RAeS Conference, London, November 2001.
28. Hoh, R. H., 'Lessons Learned Concerning the Interpretation of Subjective Handling Qualities Pilot Rating Data', Proceedings of the AIAA Atmospheric Flight Mechanics Conference, Portland, OR, August 20-22nd, 1990.
29. Blanken, C., Hoh, R., Key, D., 'Test Guide for ADS-33E-PRF', US Army Special Report AMR-AF-08-07, April 2008.
Appendix A - Correlations of task strategy with SFR awarded
Figure A1 - The Effect of Cyclic Control Activity in the Hover Maintenance Phase on SFRs for 50ms additional transport delay.
Figure A2 - The Effect of Cyclic Control Activity in the Hover Maintenance Phase on SFRs for 200ms additional transport delay.
Figure A3 - The Effect of Cyclic Control Activity in the Hover Maintenance Phase on SFRs for 300ms additional transport delay.
Appendix B - 100ms Comparative Performance
Figure B1 - Pilot A - Equivalent Performance
Figure B2 - Pilot B - Equivalent Performance
Figure B3 - Pilot D - Similar Performance
Figure B4 - Pilot E - Equivalent Performance
Figure B5 - Pilot F - Equivalent Performance
Figure B6 - Pilot G - Similar Performance