Sensor-offset HMD perception and performance

James E. Melzer (a) and Kirk Moffitt (b)

(a) Rockwell Collins Optronics, 2752 Loker Ave. West, Carlsbad, CA 92010-9731
(b) La Quinta, CA

ABSTRACT

The perceptual and performance effects of viewing HMD sensor-offset video were investigated in a series of small studies and demonstrations. A sensor-offset simulator was developed with three sensor positions relative to left-eye viewing: inline and forward, temporal and level, and high and centered. Several manual tasks were used to test the effect of sensor offset: card sorting, blind pointing and open-eye pointing. An obstacle course task was also used, followed by a more careful look at avoiding specific obstacles. Once the arm and hand were within the sensor field of view, the user demonstrated the ability to readily move to the target regardless of the sensor offset. A model of sensor offset was developed to account for these results.

Keywords: sensor offset, HMD, perception, performance

1. PROBLEM

The easiest approach to designing a helmet-mounted sensor/display system is to bolt displays and sensors onto the helmet. For example, a sensor/display module can be suspended in front of the viewing eye, with the sensor facing outward, in a manner typical of night vision goggles (NVG). This approach, however, results in a very forward center of gravity, with the potential to cause neck strain for the soldier. A more ambitious design approach is to integrate displays and sensors into the helmet so as to minimize bulk and protrusions and to optimize weight and balance in an attractive package. But this integrated approach offsets the sensor with respect to the wearer's normal line of sight. An example of the integrated approach is the Soldier Mobility and Rifle Targeting System (SMaRTS), shown in Figure 1, where the sensor is above eye level and head-centered.

Figure 1. The Soldier Mobility and Rifle Targeting System (SMaRTS), with the two imaging sensors (visible on top, long-wave infrared on the bottom) located high and in the middle of the head, and the digital HMD worn over the right eye.

Our current interest is in an integrated sensor/display system that is monocular, with a moderate field of view (FOV) and unity magnification. A necessary trade-off with this design is that the sensor package must be mounted in a location that is not directly in line with the user's eye. Data on the perceptual and performance effects of sensor offset are very limited, and there are no engineering guidelines.



James E. Melzer is Manager of Research and Technology: phone 1-760-438-9255: [email protected]
Kirk Moffitt is a human factors consultant: phone 1-760-360-0204: [email protected]



Head- and Helmet-Mounted Displays XII: Design and Applications, edited by Randall W. Brown, Colin E. Reese, Peter L. Marasco, Thomas H. Harding, Proc. of SPIE Vol. 6557, 65570G (2007) · 0277-786X/07/$18 · doi: 10.1117/12.721156

2. BACKGROUND

2.1 Displaced vision

A well-developed literature describes the effects of displaced vision on manual tasks at near distances that can be characterized as vision-intensive [1, 2]. The typical methodology uses one or two prisms to displace the visual scene, and usually minimizes viewing of the arms or hands. Pointing or reaching initially overshoots the target in the direction of the displaced image. This is followed by gradual adaptation to the displacement, though adaptation is not always complete. Following removal of the prism displacement, a temporary negative aftereffect is reported in which pointing and reaching errors are in the opposite direction.

2.2 Angular error

Reports of human-performance problems with HMD offset-sensor systems can sometimes be attributed to angular error (i.e., the sensor line of sight not parallel to the display line of sight). An informal test was conducted at Rockwell Collins Optronics (RCO) in 2005 using a simulated SMaRTS system. Several of the relevant sensor/display offset conditions were evaluated in terms of walking and reaching, as well as general perceptions of height and tilt. The simulator consisted of an RCO monocular display and a forehead-mounted daylight camera with a horizontal FOV of approximately 32°. The camera was slewed, tilted and rotated while the display remained in front of the right eye. Five subjects were tested, and their behavior and observations were in general agreement.

Initial testing with the camera and display both aligned straight ahead (i.e., boresighted to each other) resulted in minimal problems. Participants were able to walk across the room, grasp objects, and move in straight lines. As expected, movement was slower and slightly more hesitant than with naked-eye viewing, likely due to the restricted field of view.

With the camera slewed to the side by approximately 10°, each participant was asked to walk across the room to grab a door handle, and then back to grab an object sitting on a bench. Walking was noticeably slowed and hesitant, and started in the direction opposite the camera direction. Walking across the room from another direction toward a doorway resulted in an arced path for all participants. The distance was approximately 20 feet, and the arc was about 2 feet off-line at the halfway point. The endpoint was within the doorway.

Tilting the camera up approximately 10° simulated the effect of looking down at the display. After walking back and forth across the room, participants said they felt "tall" and as if they were "on stilts." Another observation was that the room appeared to tip forward, giving the impression of walking downhill.

Rotating the camera approximately 10° was disturbing and not easily corrected. Given the instruction to regain gravitational upright by tilting the head, participants were initially unsure whether to move the head left or right, and even large head tilts never seemed to make the image upright. Some of this can be explained by the difference in center of rotation between the HMD and the head: head tilt rotates about the base of the neck and describes a wide arc. To complicate matters, most head tilting is accompanied by head rotation, and this head motion stimulates the vestibular system and sense of balance, along with both vertical and small rotational eye movements.
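As a rough plausibility check of the arced path, the observation is consistent with a simple pursuit model in which the walker continually steps toward the goal's apparent direction, which the slewed camera rotates by roughly 10°. The sketch below is ours, not part of the original test; the step length and steering strategy are assumptions.

```python
import math

# Toy pursuit model of walking with a ~10 deg slewed camera (our assumption:
# the walker steps toward the goal's APPARENT bearing, which the camera
# rotates by the slew angle).
slew = math.radians(10.0)   # camera slew angle
goal_x, goal_y = 20.0, 0.0  # doorway ~20 ft straight ahead
x, y = 0.0, 0.0
step_ft = 0.5               # assumed step length
max_dev = 0.0
while math.hypot(goal_x - x, goal_y - y) > step_ft:
    bearing = math.atan2(goal_y - y, goal_x - x)  # true bearing to the goal
    heading = bearing + slew                      # walker steers off by the slew
    x += step_ft * math.cos(heading)
    y += step_ft * math.sin(heading)
    max_dev = max(max_dev, abs(y))
print(f"max lateral deviation ~ {max_dev:.1f} ft")
# Prints a deviation on the order of the ~2 ft observed at mid-path, with the
# path still converging on the doorway, as reported.
```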
The results of this testing indicate that a digital sensor and its imaging display need to be boresighted to reduce perceptual and locomotion effects. Although no tests were conducted to establish a numerical alignment tolerance, we recommend a maximum angular error of 0.5° between the two. Note that the effects associated with an angular error between sensor and display should not be confused with sensor offset. The remainder of this report assumes the two are boresighted with no angular error.

2.3 Early sensor-offset studies

A 1998 study of the effect of offset binocular cameras on eye-hand coordination used cameras positioned 165 mm forward and 62 mm upward, with the image seen on a head-mounted display [3]. Measures of performance on a pegboard task showed a significant cost. There was adaptation over time, though performance never returned to the baseline level. Negative aftereffects were also observed after removal of the apparatus. Only the one camera position was tested.

What about manual tasks that involve distances greater than arm's length? One study measured the effects of several stereo NVG configurations on grenade-tossing performance to a target at a distance of 20 feet [4]. Compared to the control condition, in which the NVG objectives were separated by the nominal distance of the eyes, a hyperstereo configuration with twice the nominal lateral separation significantly degraded tossing performance.


The degradation was attributed to exaggerated stereo. When the NVG objectives were vertically displaced, but with the same lateral separation as the eyes, no performance degradation was found. While the imaging apparatus studied by these researchers was binocular, the result provides some indication that a simple vertical offset does not affect a medium-distance task involving distance judgment.

What about walking and driving with offset vision? Walking and driving involve picking up flow-field information in the visual periphery and the direction of waypoints, rather than explicit size and distance computations. To approach an object, we make it expand in our field of view and make the ground flow backward. The kinesthetic feedback from our feet on the ground simplifies the act of walking; this sensation of grounding precludes the need to directly observe our feet. During these activities, we are generally looking forward. One researcher used himself as a subject in extended testing of a displaced camera system similar to that used by Biocca and Rolland [3, 5]. He wore the head-mounted apparatus for several days, and found that walking around his building, up and down stairs, and through doorways was not a problem.

2.4 FOV effects

What will affect mobility is the limited FOV of the sensor/display system. FOVs of 12° and 40° have been shown to result in significant errors in a navigation task, with some degradation still present at a larger FOV of 90° [6]. Performance degradation has also been reported on search and maze tasks with FOVs from 48° up to 112° [7]. Peripheral vision is important to self-movement not because of retinal organization, but because that is where the highest rate of optic flow occurs. If vision is limited to the sensor/display FOV, increased head movement and slower locomotion are expected.
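The role of peripheral flow can be made concrete with the standard expression for translational optic flow: a scene point at distance r and eccentricity theta from the heading sweeps across the visual field at angular rate (v/r)·sin(theta). A short sketch with assumed walking values:

```python
import math

# Translational optic flow: angular speed of a scene point at distance r_m
# and eccentricity theta_deg from the heading, for an observer moving at v_mps.
def flow_deg_per_s(v_mps: float, r_m: float, theta_deg: float) -> float:
    return math.degrees((v_mps / r_m) * math.sin(math.radians(theta_deg)))

v, r = 1.4, 2.0  # assumed: ~1.4 m/s walking speed, scene points ~2 m away
for theta in (8, 16, 45, 90):
    print(f"eccentricity {theta:2d} deg: flow ~ {flow_deg_per_s(v, r, theta):4.1f} deg/s")
# At the 16 deg edge of a 32 deg-wide display the flow is ~11 deg/s, versus
# ~40 deg/s at 90 deg eccentricity: the fastest flow lies outside the sensor FOV.
```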

3. SIMULATOR

The available data provide little design guidance for the location of offset sensors on an HMD system. We devised a plan to use helmet-mounted cameras and a head-mounted display to test the effects of sensor offset. Manual-coordination and mobility tasks were used in testing with small numbers of subjects.

3.1 Sensor-offset apparatus

[Figure 2 block diagram: the three Watec cameras (front, side and top) are helmet-mounted; the eMagin HMD is head-mounted; the multi-video switching unit (rotary switch, 9 V battery), AverMedia PCMCIA video interface, Dell laptop PC and eMagin controller are backpack-mounted.]


A simulator was constructed using an eMagin HMD and three miniature monochrome daylight cameras. Night-vision sensors were not used due to cost, weight and complexity. The sensor-offset simulator is diagrammed in Figure 2. The three basic components are the helmet and cameras, the eMagin binocular HMD, and the backpack-mounted video switcher, video interface, laptop PC and eMagin controller. The Watec cameras are 537 x 597 pixel, >380-line monochrome systems with a 6 mm focal-length lens and an approximate horizontal FOV of 32°.

Figure 2. Block diagram of the sensor-offset simulator.
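The quoted camera FOV is consistent with the lens focal length. A quick check, assuming a 1/4-inch-format imager (about 3.6 mm active width; the sensor format is not stated in the paper):

```python
import math

# Horizontal FOV from focal length and sensor width: FOV = 2 * atan(w / 2f).
f_mm = 6.0             # stated 6 mm lens
sensor_width_mm = 3.6  # assumed 1/4-inch-format sensor width
hfov_deg = 2 * math.degrees(math.atan(sensor_width_mm / (2 * f_mm)))
print(f"horizontal FOV ~ {hfov_deg:.1f} deg")  # ~33 deg, matching the quoted ~32 deg
```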

The eMagin HMD is an 800 x 600 pixel SVGA binocular display with a horizontal FOV of approximately 32°. Viewing was left-eye-only, with the right eyepiece covered. The helmet was a large-size bicycle helmet with a camera platform on the front left and counterbalancing weights on the back right. The weight of the helmet system was approximately one kilogram. The helmet assembly and HMD are shown in Figure 3, and the backpack is added in Figure 4. The camera offsets are given in Table 1. All outside vision was shielded with a black drape attached to the helmet with Velcro and tied around the waist. The complete package (HMD, helmet and cameras, black drape, and backpack with electronics) is shown in Figure 5.

Table 1. Camera positions relative to the left eye

  Offset        Forward        High & Centered   Side
  Lateral       0              3 cm nasal        12 cm temporal
  Vertical      0              15 cm high        0
  Longitudinal  12 cm forward  7 cm forward      0
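Spreading the 800 x 600 display across roughly 32° also bounds the acuity available through the simulator, which is worth noting alongside the 20/30 screening criterion used later. A back-of-envelope estimate (our calculation, not from the paper):

```python
# Angular resolution of an 800-pixel-wide image spanning ~32 degrees.
hfov_deg, h_pixels = 32.0, 800
arcmin_per_pixel = hfov_deg * 60.0 / h_pixels
print(f"{arcmin_per_pixel:.1f} arcmin/pixel")  # 2.4 arcmin per pixel
# 20/20 vision resolves ~1 arcmin, so displayed acuity is capped near 20/50,
# well below the subjects' screened 20/30 naked-eye acuity.
print(f"Snellen equivalent ~ 20/{20 * arcmin_per_pixel:.0f}")
```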

Figure 3. Sensor-offset simulator ensemble, including helmet, three cameras, mounting platform, power-switching control, video connectors, counterweights and eMagin HMD. Note the locations of the three cameras (circled).

Figure 4. Sensor-offset simulator ensemble with the backpack used in the obstacle-course and avoidance studies. The backpack held a laptop computer, the video switch unit, and the eMagin control unit.


Figure 5. Sensor-offset simulator ensemble with the black drape that limits visibility to the 32° x 24° camera video.

4. INVESTIGATIONS

4.1 General procedure and observations

Subjects were affiliated with RCO, and were verified to have at least 20/30 left-eye acuity using a vision wall chart at a distance of 10 meters. Eye dominance was assessed for the initial study, and each subject had an unambiguous dominant eye; no eye-dominance perceptual or performance effects were noted. The cameras were prefocused for near or far depending on the task, and each camera was boresighted to the HMD for all tasks. For each task, subjects were first measured without the simulator to establish a naked-eye baseline. Within each task, the order of sensor offsets was randomly selected for each subject. Relative to the naked-eye baseline, all camera conditions, with their 32° x 24° FOV, slowed movement. The most noticeable perception was a downward slant of the floor and a minification of distant objects with the top-mounted camera.

4.2 Manual tasks

The first test was card sorting, a simple task in which the cards from one suit were laid out at the clock positions on a piece of felt on a table. The subject, using one hand, made a pile in the center starting with the "2" and ending with the "King." The time for this task was recorded for three repetitions, after which each of the three subjects rated the effort required on a scale of 1 (no effort) to 10 (extreme effort). A photo of this task is shown in Figure 6. Median times for the card-sorting task are shown in Table 2 for each of the three subjects. The difference between the baseline and sensor times reflects the cost of limiting the FOV plus offsetting vision. The times for the forward sensor were less than for the side and top sensors for all three subjects. Workload estimates were inconsistent and did not correspond to response times.


Figure 6. Card-sorting task.

Figure 7. This photo represents both the blind- and open-eye pointing tasks.

Table 2. Median card-sorting times (seconds)

  Subject   Baseline   Front   Side   Top
  S1        9          25      36     32
  S2        9          20      25     21
  S3        11         27      30     31

The small differences in Table 2, combined with the inconclusive workload data, led us to develop another study of manual performance and sensor offset, one that separated the perceptual and performance aspects of a manual task. In this pointing task, subjects stood on a line 120 cm from a wall, then stepped forward and pointed at an "X" target at a height of 150 cm. For the first part of the study, subjects were instructed to look at the target, close their eyes, and step forward and place their index finger on the "X" target; the experimenter promptly noted the finger position on the sheet of paper. This "blind" pointing task captures the perceptual component of where the target appears, with no visual guidance of the hand and finger. The pointing task is shown in Figure 7. Three trials were run for each sensor condition, and the centroid of the resulting triangle of points was used as the summary statistic. Figure 8 shows the results. They correspond to the prism-displacement studies, in that the apparent target position is opposite to the sensor position: the top sensor results in the lowest perceived target, and the (left) side sensor results in the target appearing to the right. The control or baseline condition with naked-eye vision always produced the most accurate performance.

We next asked subjects to point to the same "X" target with their eyes open. Each trial started with the experimenter removing a card that hid the target "X." Since the end result was the index finger on the target, we took video recordings of each trial and noted the time from a go signal to finger-on-target, plus any apparent strategies. The data for the open-eye pointing task for a representative subject are shown in Figure 9. The control or baseline condition showed the quickest response. We expected that response times would improve and level off with trials. This did not generally occur, and may be due to a tendency to stab at the target sheet and then drag the index finger to the target in the first few trials, but then to start guiding the finger to the target, the net result being little difference in pointing time over the nine trials. The hand and finger started each trial outside the 32° x 24° FOV. Based on the video evidence, we speculate that the two strategies were to stab with the finger into the FOV and then drag it onto the target, or to move the finger into the FOV and then visually guide it to the target, with both taking about the same amount of time. No sensor-position effect can be discerned from the data from the four subjects.
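The direction of the blind-pointing errors follows from simple parallax if the viewer attributes the camera's viewpoint to the eye: a camera offset shifts the target's image by atan(offset/distance), and re-projecting from the eye displaces the apparent target by roughly the full offset, opposite the camera. A sketch of this geometry (our model; the paper reports directions, not predicted magnitudes):

```python
import math

d_cm = 120.0  # approximate eye-to-wall distance at the moment of viewing

def parallax(offset_cm: float) -> tuple[float, float]:
    """Angular shift seen by the offset camera, and the resulting shift of the
    target's apparent position in the wall plane (simple re-projection)."""
    angle_deg = math.degrees(math.atan2(offset_cm, d_cm))
    shift_cm = d_cm * math.tan(math.radians(angle_deg))  # equals offset_cm here
    return angle_deg, shift_cm

for label, offset in (("side camera, 12 cm temporal", 12.0),
                      ("top camera, 15 cm high", 15.0)):
    a, s = parallax(offset)
    print(f"{label}: ~{a:.1f} deg, target appears ~{s:.0f} cm opposite the camera")
# Predicts the target appearing to the right with the (left) side camera and
# low with the top camera -- the directions observed in Figure 8.
```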


Figure 8. Pointing performance for the blind-pointing task for four subjects (TP, TO, KM and JM): four scatter-plot panels, one per subject, with axis scales of approximately ±20, showing centroid pointing positions for the Control, Front, Side and Top conditions.

Figure 9. Representative data for the open-eye pointing task for one subject (TO): pointing time across trials 1 through 9 for the Control, Front, Side and Top conditions.


4.3 Mobile tasks

We first asked subjects to describe their perceptions of the experimental room, in terms such as the floor slanting or objects looking distorted. A common response with the top-mounted camera was that the floor looked slanted downward, with objects at 5 to 10 meters looking small. No consistent perceptual effects were noted with the front or side mounts.

We then tested the effects of sensor offset on mobility by constructing a simple obstacle course. Subjects briskly walked through a course defined by cardboard boxes, stepping over two one-foot-high-and-deep boxes and ducking under five-foot entryways. Subjects also had to avoid several tall boxes on the left and right, and execute a hairpin left-hand turn. The entire course was approximately 50 feet in length. Subjects were instructed to move briskly but not to purposely knock over boxes. Figure 10 shows two views of the course.
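A rough geometric account of the downhill-slant percept with the top-mounted camera (our sketch, with an assumed eye height): the camera rides about 15 cm above the eye, so every ground point is seen at a slightly steeper depression angle than the eye expects, and the mismatch grows for nearer points, which can read as a floor tilted away:

```python
import math

eye_height_m = 1.70                 # assumed standing eye height
cam_height_m = eye_height_m + 0.15  # top camera ~15 cm higher (Table 1)
for d in (2.0, 5.0, 10.0):
    cam_dep = math.degrees(math.atan2(cam_height_m, d))
    eye_dep = math.degrees(math.atan2(eye_height_m, d))
    print(f"ground point at {d:4.1f} m: seen {cam_dep - eye_dep:.1f} deg "
          f"lower than the eye expects")
# ~2.4 deg extra depression at 2 m, shrinking to ~0.8 deg at 10 m: a distance-
# dependent mismatch consistent with a perceived downward floor slant.
```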


Figure 10. Obstacle course constructed of stacked boxes in a U-shaped, 50-foot course.

Completion times for the obstacle-course task were 10 seconds for the baseline naked-eye condition, and between 18 and 42 seconds for the three sensor offsets. No sensor-offset trends were evident, and workload ratings likewise showed no trends. Subjects hit a number of boxes while stepping over, going around and ducking under obstacles. We think that subjects felt with their arms and hands and readily kicked the boxes to make their way through the course.

We decided to follow up with a closer look at components of this task. We recorded video of two subjects stepping over a one-foot-high-and-wide box, then circling around and passing close by six-foot-high stacked boxes on the left, and then circling around, walking toward these boxes and stopping at a distance of one foot (chest to box). This sequence was repeated for each camera offset. Subjects did not reach out to touch any boxes. Representative video frames are shown in Figure 11, and the results of this demonstration are shown in Table 4. As with the other studies in this investigation, the naked-eye control condition was associated with superior performance. The cost of the 32° x 24° FOV HMD view was misjudged distances and occasional collisions with obstacles; arms and hands were not used to reach out and feel the obstacles. Both subjects maintained a relatively large clearance when passing an obstacle on the left with the left-side-mounted camera. Similarly, the approach distance was overestimated with the front-mounted camera. Both of these findings correspond to the camera offset, and demonstrate that specific sensor-offset effects are more readily seen with isolated, simple tasks than with complex tasks.
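The overestimated approach distance with the forward camera has a simple candidate explanation (ours, consistent in sign though not magnitude with Table 4): a camera 12 cm in front of the eye sees every object 12 cm closer than the eye does, so a subject judging distance from the image should stop about 12 cm early:

```python
# Predicted stopping error if distance is judged from a camera mounted
# 12 cm in front of the eye (Table 1, Forward condition).
camera_forward_cm = 12.0
instructed_cm = 30.5    # "stop at one foot" (chest to box)
predicted_error_cm = camera_forward_cm  # subject stops when the CAMERA is 30.5 cm away
print(f"predicted overshoot ~ +{predicted_error_cm:.0f} cm; observed +60 and +29 cm")
# The sign matches the data; the larger observed errors suggest added caution
# on top of the geometric bias.
```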


Figure 11. Video frame sequences (1/10 second between frames) of the stepping-over, passing-by and approaching tasks.


Table 4. Stepping over, passing by, and approaching obstacles (values in cm)

                          Sensor offset
  Subject   Task         Control   Forward   Side   Top
  KM        Step-over    10        Hit       Hit    30
  KM        Pass-by      10        20        40     30
  KM        Approach*    +1        +60       +40    +40
  JM        Step-over    20        30        30     20
  JM        Pass-by      5         Hit       40     Hit
  JM        Approach*    -11       +29       +20    +29

  * Relative to the instruction to stop at a distance of one foot.

5. SUMMARY AND MODEL

Task performance was degraded and workload estimates were higher for all sensor positions, for both manual and mobile tasks, relative to naked-eye vision. The likely cause of this global effect is the limited vision through the 32° x 24° HMD FOV.

If a sensor/display system has an angular error, performance will be dramatically affected. The current study used only straight-ahead, aligned sensors, and only left-eye monocular imagery. The subjects presented a mix of left and right eye dominance, and of left- and right-handedness. There were no comments or concerns about not seeing with both eyes, or even about which eye was used for viewing.

In agreement with the prism-displacement literature, pointing without real-time visual feedback errs opposite to the sensor-offset position. Once the arm and hand are visible within the sensor FOV, the user can readily reach a target position, regardless of the sensor-offset position: the user can either stab at the apparent target location and then drag the hand to the target, or make a reaching motion and then guide the hand to the target. The distinction between the hand not being visible (obscured or outside the sensor FOV) and visible (within the sensor FOV) is critical to understanding reaching and pointing performance.

The current study was limited in testing only a small number of subjects for a limited number of trials. The apparatus had no look-around vision, which forced total visual reliance on sensor imagery. And the tasks imposed minimal stress on the subjects, unlike many tasks that would be encountered in the real world. Table 5 presents a simple model of a sensor-offset helmet-display system as it relates to perception and performance.

Table 5. Sensor-offset model

  System configuration        Perception and performance                            Evidence
  Misaligned sensor/display   Walking in a curved path to a waypoint; distracting   Informal testing at RCO
                              slant and rotation effects
  Aligned sensor              Noticeable slant with a high sensor; misjudged        Current study; literature
                              closeness of right-side objects when walking;
                              reaching opposite to the sensor position with no
                              visual guidance
  Blind pointing              Pointing opposite to the sensor position              Current study; large literature on
                                                                                    prism displacement
  Hands and feet visible      No large sensor-offset performance or effort          Current study
                              differences reported for manual or mobile tasks
  Left-eye display            No evidence of eye awareness or dominant-eye          Current study; literature [8]
                              effects; monocular versus binocular viewing not
                              investigated
  Moderate 32° FOV            Large decrements in performance and increases in      Current study; literature
                              reported effort relative to naked-eye vision
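The qualitative model in Table 5 can be collapsed into a few lines of code. The sketch below is our illustrative encoding of the paper's central distinction, not software from the study; signs give direction only, not magnitude:

```python
# Toy encoding of the sensor-offset model: blind pointing errs opposite the
# offset; once the hand is visible in the sensor FOV, the offset stops mattering.
def blind_pointing_bias(offset_right_cm: float,
                        offset_up_cm: float,
                        hand_visible: bool) -> tuple[float, float]:
    if hand_visible:
        return (0.0, 0.0)  # visually guided reach: stab-and-drag or guide
    return (-offset_right_cm, -offset_up_cm)  # direction opposite the offset

# Side camera 12 cm temporal (leftward of the left eye) -> error to the right;
# top camera 15 cm high -> error downward, as in the blind-pointing data.
print(blind_pointing_bias(-12.0, 0.0, hand_visible=False))  # (12.0, -0.0)
print(blind_pointing_bias(0.0, 15.0, hand_visible=False))   # (-0.0, -15.0)
```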

REFERENCES

1. J. Pelz, M. Hayhoe, and R. Loeber, "The coordination of eye, head, and hand movements in a natural task," Experimental Brain Research, 139, 266-277 (2001).
2. R. B. Welch, "Adaptation of space perception," in K. R. Boff, L. Kaufman and J. P. Thomas (eds.), Handbook of Perception and Human Performance, Volume I. New York: Wiley (1986).
3. F. A. Biocca and J. P. Rolland, "Virtual eyes can rearrange your body: Adaptation to visual displacement in see-through head-mounted displays," Presence, 7, 262-277 (1998).
4. V. G. CuQlock-Knopp, K. P. Myles, F. J. Malkin, and E. Bender, The Effects of Viewpoint Offsets of Night Vision Goggles on Human Performance in a Simulated Grenade Throwing Task, ARL-TR-2401. Aberdeen Proving Ground, MD: Army Research Laboratory (2001).
5. S. Mann, "Fundamental issues in mediated reality, WearComp, and camera-based augmented reality," in W. Barfield and T. Caudell (eds.), Fundamentals of Wearable Computers and Augmented Reality. Mahwah, NJ: Erlbaum (2001).
6. P. L. Alfano and G. F. Michel, "Restricting the field of view: perceptual and performance effects," Perceptual and Motor Skills, 70(1), 35-45 (1990).
7. K. W. Arthur, Effects of Field of View on Performance with Head-Mounted Displays. Doctoral dissertation, University of North Carolina at Chapel Hill (2000).
8. A. P. Mapp, H. Ono, and R. Barbeito, "What does the dominant eye dominate? A brief and somewhat contentious review," Perception & Psychophysics, 65, 310-317 (2003).
