Recognition of Pointing and Calling for Remote Operatives Management

Masahiro Iwasaki and Kaori Fujinami
Department of Computer and Information Sciences, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei, Tokyo, Japan
[email protected],
[email protected]
ABSTRACT
We investigate a new facilitated assessment technique for the enforcement of pointing and calling, an activity by which workers maintain occupational safety and correctness. Our proposed system enables early detection of incorrect enforcement of pointing and calling by recognizing the performance of the activity. We implemented a prototype system that records the occurrence of the performance, the place where it was performed, and the time of performance. The recognition accuracy showed the feasibility of recognition with a self-contained wearable computer system. However, the results also revealed issues that need to be addressed before practical use.
Author Keywords
Occupational Safety; Ubiquitous Computing

ACM Classification Keywords
J.7. Computers in Other Systems: Industrial control; C.3. Special-Purpose and Application-Based Systems: Signal processing systems

INTRODUCTION
The most common causes of industrial accidents are attributed to human error, mostly mistakes and carelessness associated with psychological factors [1]. Pointing and calling is an activity used to avoid such human errors at workplaces. The activity consists of a pointing gesture toward the target object with an outstretched arm, and vocalization by calling out at important points in the work. Pointing and calling raises the consciousness level of workers and increases the accuracy and safety of work [1]. However, in recent years, safety managers have reported a number of accidents attributed to workers forgetting to perform pointing and calling correctly, and that these problems are caused by the lack of periodic assessment of its enforcement. In the context of ubiquitous computing, industrial health and safety is an important but largely unexplored area. Kortuem et al. adopted wireless sensor nodes to monitor workers' exposure to vibration and to test compliance with legal health and safety regulations [3], showing the feasibility of remote monitoring of workers. To realize remote assessment of pointing and calling, we need to clarify the feasibility of recognizing its performance. In this paper, we describe a prototype system and its preliminary evaluation.

POINTING AND CALLING RECOGNITION METHOD
The effect of pointing and calling can be separated into four components: it actively concentrates one's awareness on an object and clears consciousness by pointing; it increases the reliability of confirmation through visual, auditory, and kinesthetic stimuli; it raises the activity level of the cerebral neocortex through the kinesthetic stimulus of the muscles used during pointing and vocalizing; and it prevents operational error by enforcing pointing and calling between perception and reaction [2]. To recognize the correctness of pointing and calling, the occurrence of the behaviors required to accomplish these four effects should be recognized. Therefore, our system attempts to recognize the pointing gesture, the pointing direction, the gaze direction, and vocalization. Moreover, recognizing the location where pointing and calling is performed is necessary to judge the correctness of enforcement.
Pointing Gesture
The pointing activity is an arm motion, which can be treated as a gesture. Therefore, we applied Dynamic Time Warping (DTW) [4], which has been successfully utilized for gesture recognition in existing work. Data from a three-axis accelerometer and a three-axis gyroscope attached to the wrist are fed into the DTW process. The template was selected from 25 training samples as the sample with the smallest average DTW distance to all the other samples. The final decision is made by thresholding the DTW distance.
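Although the paper includes no code, the template selection and thresholding steps can be sketched as follows. This is a minimal Python illustration rather than the actual Android implementation; the function names, the window segmentation, and the threshold value are assumptions.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic DTW [4] between two multivariate sequences a (n x d) and b (m x d)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])  # local Euclidean cost over the 6 sensor axes
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def select_template(samples):
    """Pick, from the 25 training samples, the one with the smallest
    average DTW distance to all the other samples."""
    scores = [np.mean([dtw_distance(s, t) for j, t in enumerate(samples) if j != i])
              for i, s in enumerate(samples)]
    return samples[int(np.argmin(scores))]

def is_pointing_gesture(window, template, threshold):
    """Final decision by thresholding the DTW distance (the threshold is an assumption)."""
    return dtw_distance(window, template) <= threshold
```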
Gaze and Pointing Direction
Gaze involuntarily moves to the pointed object during pointing and calling [2]. Also, the pointing direction roughly coincides with the extension of the line between an eye and the tip of the pointing finger. Therefore, the gaze and pointing direction can be recognized by calculating the positional relations between the eyes and the pointing finger. The relations are calculated from the data of a magnetic sensor and an accelerometer attached to the wrist, together with physical information such as the length of the arms.
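As an illustration of this geometric idea, the sketch below derives a world-frame forearm direction from the wrist accelerometer and magnetic sensor using the standard tilt-compensated orientation construction, then extends it by the preset arm length to approximate the eye-to-fingertip line. The sensor-axis convention, the shoulder position, and the function names are assumptions, not the paper's exact formulation.

```python
import numpy as np

def pointing_direction(accel, mag):
    """Estimate a world-frame forearm direction from wrist accelerometer and
    magnetometer readings. Assumes the arm is held still, so the accelerometer
    approximates the reaction to gravity (pointing up), and that the sensor
    x-axis runs along the forearm."""
    up = accel / np.linalg.norm(accel)
    east = np.cross(mag, up)
    east /= np.linalg.norm(east)
    north = np.cross(up, east)
    R = np.vstack([north, east, up])   # rows: world axes expressed in the sensor frame
    forearm_sensor = np.array([1.0, 0.0, 0.0])
    return R @ forearm_sensor          # forearm direction in (north, east, up) coordinates

def fingertip_position(shoulder, direction, arm_length):
    """Fingertip of the outstretched arm; the line from the eye through this
    point approximates both the pointing and the gaze direction."""
    return shoulder + arm_length * (direction / np.linalg.norm(direction))
```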
Vocalization
To maximize the effects of pointing and calling, conscious vocalization, louder than usual chatting (60 decibels), is required when calling out. Consequently, to recognize conscious vocalization, the sound volume needs to exceed 60 decibels. A Bluetooth headset mounted on the ear is utilized for audio data collection. The raw audio data are converted into decibel values, and simple moving averages of the values are then calculated for smoothing.
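A minimal sketch of this pipeline follows, assuming raw PCM frames from the headset. Note that mapping PCM amplitude to absolute sound pressure levels requires microphone calibration, so the reference level and window length below are assumptions; only the 60 dB figure comes from the text above.

```python
import numpy as np

PCM_REF = 1.0       # reference amplitude; absolute dB SPL would require calibration
WINDOW = 10         # moving-average length in frames (assumption)
THRESHOLD_DB = 60.0 # conscious vocalization must exceed usual chatting (~60 dB)

def frames_to_db(frames):
    """Convert raw PCM audio frames to RMS levels in decibels."""
    levels = []
    for f in frames:
        samples = np.asarray(f, dtype=float)
        rms = np.sqrt(np.mean(samples ** 2))
        levels.append(20.0 * np.log10(max(rms, 1e-12) / PCM_REF))
    return np.array(levels)

def is_conscious_vocalization(frames):
    """Smooth the decibel series with a simple moving average, then threshold."""
    db = frames_to_db(frames)
    smoothed = np.convolve(db, np.ones(WINDOW) / WINDOW, mode="valid")
    return bool(np.any(smoothed >= THRESHOLD_DB))
```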
Location
The location where pointing and calling is performed is identified to confirm that the enforcement is done correctly. To recognize the place, we utilize a fiducial marker, which is cheap and easy to install. The video stream from a camera on an Android device captures the scene in front of the user; if a fiducial marker appears in the images, the user is standing in front of the marker. The fiducial marker recognition is realized with the OpenCV library on the mobile terminal.
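The paper does not name the marker system; as an illustration, the sketch below uses ArUco markers via OpenCV's aruco contrib module (OpenCV 4.0-4.6 exposes detectMarkers as a module function; 4.7+ uses an ArucoDetector object). The dictionary choice and the marker-to-place mapping are assumptions.

```python
import cv2

# ArUco markers and the 4x4 dictionary are assumptions beyond "fiducial marker".
DICTIONARY = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)

def detect_place(frame, marker_to_place):
    """Return the place name if a known fiducial marker is visible in the frame."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    corners, ids, _rejected = cv2.aruco.detectMarkers(gray, DICTIONARY)
    if ids is None:
        return None  # no marker: the user is not in front of a registered place
    for marker_id in ids.flatten():
        place = marker_to_place.get(int(marker_id))
        if place is not None:
            return place
    return None
```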
Prototype System
These recognitions are processed on an Android-based phone terminal (Galaxy S II). Data are gathered from a three-axis accelerometer, three-axis gyroscope, and three-axis magnetic sensor (WAA-010), a camera (built into the Android phone), and a Bluetooth headset (LBT-HS050C2) attached to the user's body (see Figure 1).
Figure 1. System architecture.
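The paper does not specify how the four recognition results are merged into one assessment record. As a hypothetical sketch, the rule below requires all four to hold simultaneously and then logs the occurrence, place, and time, matching the record described in the abstract.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class PointingAndCallingRecord:
    """One record of a recognized performance: occurrence, place, and time."""
    place: str
    timestamp: datetime

def fuse(gesture_ok: bool, direction_ok: bool, vocalization_ok: bool,
         place: Optional[str]) -> Optional[PointingAndCallingRecord]:
    """Hypothetical fusion rule: count a performance only when the gesture,
    the gaze/pointing direction, and the vocalization are all recognized
    while a registered marker (place) is visible."""
    if gesture_ok and direction_ok and vocalization_ok and place is not None:
        return PointingAndCallingRecord(place=place, timestamp=datetime.now())
    return None
```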
EXPERIMENTS
To clarify the feasibility of recognizing the performance of pointing and calling, we observed the recognition accuracy in simulated safety inspections with the proposed system. Safety inspections were simulated on three target objects. The test subjects were twelve people, eleven men and one woman, all college students. The procedure was as follows. First, a tester gave an instruction on correct pointing and calling, following the guideline [1]. Second, three-axis accelerometer and three-axis gyroscope data were acquired from each subject while pointing 25 times at the object in front of them; these data were used to acquire the template. Then, the physical information of the subjects was measured and preset. Finally, the subjects conducted safety inspections on the simulated objects.

RESULTS
As shown in Table 1, the recognition accuracy for the pointing gesture was 74.1%; however, half of the subjects had a rate of 100.0%. We observed that, for the subjects with unrecognized results, the motion of the gesture changed between the template acquisition phase and the recognition phase. For the gaze and pointing direction, magnetic sensor errors affected the recognition results. Concerning vocalization, the diverse voice volumes when calling out influenced the recognition results.

                              Recognition rate
Pointing gesture                         74.1%
Gaze and pointing direction              89.9%
Vocalization                             82.4%
Location                                100.0%

Table 1. Recognition accuracy.

CONCLUSION AND FUTURE WORK
We proposed a pointing and calling recognition system for facilitated assessment of pointing and calling enforcement. The four recognizers are combined to recognize pointing and calling. The evaluation of the system suggests the feasibility of recognizing pointing and calling with wearable sensors. We also identified issues for future work. For pointing gesture recognition, we will conduct a more thorough investigation of the behaviors of workers; the accuracy might be improved by eliminating other behaviors, such as loading and unloading boxes, walking, and hand shaking, in the recognition process. As for vocalization, we will implement voice recognition, which has been used successfully on Android phones and iPhones, to help eliminate false recognition. Regarding location recognition, the terminal may be turned over; we will therefore implement automatic switching between the front and back cameras.
REFERENCES
1. Concept of "Zero-accident Total Participation Campaign". http://www.jniosh.go.jp/icpro/jicosh-old/english/zerosai/eng/index.html.
2. Haga, S. Effect of finger pointing on eye movement. The Japanese Journal of Ergonomics (2007), 140-141.
3. Kortuem, G., Alford, D., Ball, L., Busby, J., Davies, N., Efstratiou, C., Finney, J., White, M., and Kinder, K. Sensor networks or smart artifacts? An exploration of organizational issues of an industrial health and safety monitoring system. In Proc. UbiComp 2007, 465-482.
4. Liu, J., Zhong, L., Wickramasuriya, J., and Vasudevan, V. uWave: Accelerometer-based personalized gesture recognition and its applications. Pervasive and Mobile Computing (2009), 657-675.