2014 5th International Conference on Information and Communication Systems (ICICS)
A Compact Portable Object Tracking System

Karam Abughalieh, Waleed Qadi, Karam Melkon, Boulos Fakes, Belal Sababha, Amjed Al-mousa
King Abdullah II Faculty for Engineering, Princess Sumaya University for Technology, Amman 11941, Jordan
[email protected]

Abstract - Computer vision and object tracking are becoming increasingly important, with a wide variety of applications in daily life. Most available tracking systems are not compact enough to be mounted on small ground or aerial robots, and most are relatively expensive. The tracking system presented in this work is compact and low cost, which makes it suitable for weight-sensitive applications. The proposed system is a commercial off-the-shelf Android-based mobile device mounted on a pan/tilt gimbal. The system utilizes the camera and the processor of the mobile device to capture and process video frames. The tracking algorithm is a newly modified algorithm that combines three well-known algorithms: SURF, CAMShift and Lucas-Kanade. Each of these algorithms is deployed at a different stage of the tracking process, which yields a reliable real-time tracking system. The new tracking algorithm was developed using OpenCV within an Android environment. An indoor lab experiment showed that the system was able to track a 3 cm x 5 cm object placed 50 cm away from the system and moving at a speed of 133 cm/s.
Keywords—computer vision, object tracking, tracking algorithms.

I. INTRODUCTION

In this paper we present a compact tracking system that uses an Android-based smartphone to capture and analyze images. The system uses a hybrid of multiple image processing algorithms to improve both the lock-acquisition time and the tracking. Section II discusses existing tracking systems and tracking algorithms. Section III presents the design of our system, covering both the hardware and the software. Section IV describes the experimental setup used to test the algorithm and discusses the measured results. Finally, the paper is concluded in Section V.

II. BACKGROUND

Video tracking technology has been around for a while and is developing at a fast pace. This section includes two sub-sections: the first reviews commercial off-the-shelf tracking systems, while the second gives a brief background on the object tracking algorithms used in this work.

A. Commercial Off-The-Shelf (COTS) Tracking Systems

Moog QuickSet offers a video tracking system (Model 708625) [1] designed to track boats, aircraft, vehicles and people, with an algorithm tuned for tracking such objects in the sky or on water. This capability comes at the expense of extra complexity, weight and cost: the system uses two high-end cameras, one thermal and one visible, mounted on a QuickSet pan/tilt base. Both cameras are connected to a PC through a video tracking server/power unit and are controlled via a rugged joystick attached to the PC. Fig. 1 illustrates the system [1].
Fig. 1. Moog Quickset Video Tracking System
Another system is the RV2 video capture and tracking system, shown in Fig. 2. It provides digital video recording and real-time tracking [2], and includes features such as auto exposure, auto gain and auto white balance, which are costly to include in a tracking system. Even though it has a relatively small footprint, it is still not compact enough to be mounted on an aerial robot. To meet the compactness, portability and cost constraints, we aimed for a compact, flexible, highly modifiable and low-cost system. Our video tracking system consists of small, inexpensive components: a smartphone for the processing, a low-cost microcontroller, and an open-source library (OpenCV [9]) for implementing the computer vision algorithms. Combined, these provide the desired performance under the aforementioned constraints.
Fig. 2. RV2 Video Capture and Tracking System
B. Tracking Algorithms

To perform video tracking, sequential video frames are processed by a tracking algorithm that analyzes the movement of the object of interest between frames. A visual tracking system has to satisfy two major requirements: identifying the object and its location, and filtering out any undesired elements. Several tracking algorithms have been developed, each with strengths and weaknesses depending on the use case. SURF (Speeded Up Robust Features) [3] is categorized as a feature matching algorithm and is one of the most reliable algorithms in this category [4]. SURF selects interest points of an image and then builds local features based on histograms of gradient-like local operators. However, SURF consumes a lot of system resources, which makes it unsuitable on its own for real-time tracking on resource-limited hardware. CAMShift (Continuously Adaptive Mean Shift) [5] is categorized as a kernel-based tracking algorithm [6]. It matches frames based on the similarity of their color histograms. The algorithm consumes few system resources, which makes it suitable for real-time tracking. On the other hand, CAMShift cannot be used for shape detection, and changes in lighting may cause it to lose track of the desired object.
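To make the feature-extraction stage concrete, the following minimal sketch shows SURF keypoint and descriptor extraction in OpenCV. It assumes OpenCV 2.4.x built with the nonfree module (the paper does not state its OpenCV version, and the file name target.png is a placeholder); class names differ in OpenCV 3+.

    #include <vector>
    #include <opencv2/core/core.hpp>
    #include <opencv2/highgui/highgui.hpp>
    #include <opencv2/nonfree/features2d.hpp>

    int main() {
        // Load the reference image of the object to be tracked (grayscale).
        cv::Mat object = cv::imread("target.png", CV_LOAD_IMAGE_GRAYSCALE);

        // Detect interest points; the Hessian threshold (400 here) trades
        // keypoint count against keypoint strength.
        cv::SurfFeatureDetector detector(400);
        std::vector<cv::KeyPoint> keypoints;
        detector.detect(object, keypoints);

        // Compute the local descriptors that are matched against frames later.
        cv::SurfDescriptorExtractor extractor;
        cv::Mat descriptors;
        extractor.compute(object, keypoints, descriptors);
        return 0;
    }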
Another well-known algorithm is Lucas-Kanade, an optical flow algorithm [7]. It tracks object edges between frames. The Lucas-Kanade algorithm does not consume many resources on low-detail backgrounds; however, keeping a lock on the object against a highly detailed background requires additional error computation and filtering.
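As an illustration (not the authors' code), a pyramidal Lucas-Kanade step in OpenCV can be sketched as follows; the points to track would typically come from a corner detector such as cv::goodFeaturesToTrack:

    #include <vector>
    #include <opencv2/core/core.hpp>
    #include <opencv2/video/tracking.hpp>

    // Track a set of points from the previous frame to the current one.
    std::vector<cv::Point2f> trackPoints(const cv::Mat& prevGray,
                                         const cv::Mat& currGray,
                                         const std::vector<cv::Point2f>& prevPts) {
        std::vector<cv::Point2f> currPts;
        std::vector<unsigned char> status;  // 1 if the flow for a point was found
        std::vector<float> err;
        cv::calcOpticalFlowPyrLK(prevGray, currGray, prevPts, currPts,
                                 status, err);
        // Keep only the points that were tracked successfully.
        std::vector<cv::Point2f> good;
        for (size_t i = 0; i < currPts.size(); ++i)
            if (status[i]) good.push_back(currPts[i]);
        return good;
    }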
III. SYSTEM DESIGN

A. Hardware Design

Fig. 3 illustrates the high-level hardware design. Three functions integrate to form the overall tracking system: sensing, processing and actuation. In the presented design, the sensing and processing functions are performed by a commercial Android-based smartphone (Samsung Galaxy S2, GT-I9100). The phone's 8-megapixel camera is the main input device of the system; it captures live video of the object being tracked. A camera with that resolution provides enough key points for our algorithms to keep the object locked without losing it.

Fig. 3. High Level Hardware Design
With a frame rate of 30 frames per second, the system can track a mid-speed moving object. The phone's processor, an ARM Cortex-A9 dual core clocked at 1.2 GHz, handles all of the image processing needed for the tracking system. The actuation function is implemented via an Arduino Uno [8] microcontroller communicating serially with the phone. The microcontroller controls a mechanical gimbal composed of two servo motors that move the entire tracking system setup. Fig. 4 illustrates the pan/tilt gimbal setup, while Fig. 5 shows the electrical circuit design.
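To sketch the actuation side (illustrative only; the paper does not show its firmware), an Arduino Uno program along the following lines could drive the two gimbal servos. The ASCII framing of the (X, Y) offsets, the assumed +/-400-pixel offset range, and the pin assignments are all assumptions:

    #include <Servo.h>

    Servo panServo;   // rotates the setup about the fixed base
    Servo tiltServo;  // tilts the phone within the gimbal

    void setup() {
        Serial.begin(9600);
        panServo.attach(9);    // assumed wiring
        tiltServo.attach(10);
    }

    void loop() {
        if (Serial.available() > 0) {
            long x = Serial.parseInt();  // horizontal offset from frame center
            long y = Serial.parseInt();  // vertical offset from frame center
            // Scale each offset to a 0-180 degree servo angle; the Servo
            // library converts the angle into the PWM signal for the motor.
            int pan  = map(constrain(x, -400, 400), -400, 400, 0, 180);
            int tilt = map(constrain(y, -400, 400), -400, 400, 0, 180);
            panServo.write(pan);
            tiltServo.write(tilt);
        }
    }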
Fig. 4. Mechanical Gimbal Design
Fig. 5. Electrical Circuit Design
As shown in Fig. 4, one of the servo motors of the gimbal rotates the entire setup about a fixed base, while the other servo tilts the smartphone within the gimbal.

B. Software Design

The algorithm is designed to accurately detect a specified object and track it in real time with minimal resources. It combines three algorithms in order to overcome the disadvantages each has on its own. The flowchart in Fig. 6 illustrates the combined algorithm. SURF is used specifically for acquiring the lock on the target: compared to the other algorithms it is fast at acquiring the lock, but it does not perform well at keeping track of an object. That is why Lucas-Kanade and CAMShift are used. The CAMShift algorithm uses the mean color value provided by Lucas-Kanade to localize the object and provide its edges, while Lucas-Kanade tracks the sharp edges provided by CAMShift in order to update the object's color mean value.
Fig. 6. Algorithm Flow Chart
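The control flow of Fig. 6 can be summarized in the skeleton below (a sketch, not the authors' implementation); acquireLockWithSURF, camShiftStep and lucasKanadeUpdate are hypothetical stand-ins for the stages detailed in the numbered steps that follow:

    #include <opencv2/opencv.hpp>

    // Placeholder stages; real bodies are sketched in the steps below.
    bool acquireLockWithSURF(const cv::Mat&, cv::Rect&) { return false; } // stub
    void camShiftStep(const cv::Mat&, cv::Rect&) {}                       // stub
    void lucasKanadeUpdate(const cv::Mat&, const cv::Rect&) {}            // stub

    void trackingLoop(cv::VideoCapture& cap) {
        cv::Mat frame;
        cv::Rect roi;
        bool locked = false;
        while (cap.read(frame)) {
            if (!locked) {
                // SURF is expensive, so it runs only until the first good match.
                locked = acquireLockWithSURF(frame, roi);
            } else {
                // CAMShift localizes the object cheaply in every frame, while
                // Lucas-Kanade refreshes the color statistics CAMShift relies on.
                camShiftStep(frame, roi);
                lucasKanadeUpdate(frame, roi);
            }
        }
    }

    int main() {
        cv::VideoCapture cap(0);  // default camera
        if (cap.isOpened()) trackingLoop(cap);
        return 0;
    }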
The following steps detail the system workflow and show how the three algorithms, SURF, CAMShift and Lucas-Kanade, are combined to produce the new tracking algorithm.

1) A specific object is provided to the algorithm as an image. SURF detects the features of the desired object in the image and extracts the object's key points, which are later matched against incoming frames. Fig. 7 shows SURF extracting the key points from the object's image.

2) SURF detects features and extracts key points from each frame until it acquires a good match with respect to the key points extracted from the image provided earlier. Using the good matches, the algorithm localizes the object in the frame using homography and draws the bounding rectangle, as seen in Fig. 8 (a code sketch of this step follows the figure).
Fig. 7. SURF extracting the object’s key points
Fig. 8. Match localizing the object in the frame
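A rough OpenCV sketch of this matching-and-homography step is shown below (illustrative, not the authors' code); the distance-based filtering that selects the "good matches" is omitted for brevity:

    #include <vector>
    #include <opencv2/core/core.hpp>
    #include <opencv2/imgproc/imgproc.hpp>
    #include <opencv2/calib3d/calib3d.hpp>
    #include <opencv2/features2d/features2d.hpp>

    // Given SURF descriptors of the reference object and of the current frame,
    // estimate the object's bounding rectangle in the frame.
    cv::Rect localizeObject(const cv::Mat& objDesc,
                            const std::vector<cv::KeyPoint>& objKp,
                            const cv::Mat& frameDesc,
                            const std::vector<cv::KeyPoint>& frameKp,
                            const cv::Size& objSize) {
        // Match descriptors; FLANN is a common choice for SURF's float vectors.
        cv::FlannBasedMatcher matcher;
        std::vector<cv::DMatch> matches;
        matcher.match(objDesc, frameDesc, matches);

        // Collect the matched point pairs from both images.
        std::vector<cv::Point2f> objPts, framePts;
        for (size_t i = 0; i < matches.size(); ++i) {
            objPts.push_back(objKp[matches[i].queryIdx].pt);
            framePts.push_back(frameKp[matches[i].trainIdx].pt);
        }

        // The homography maps the object image onto the frame; projecting the
        // object's corners yields the bounding rectangle.
        cv::Mat H = cv::findHomography(objPts, framePts, CV_RANSAC);
        std::vector<cv::Point2f> corners(4), projected;
        corners[0] = cv::Point2f(0, 0);
        corners[1] = cv::Point2f((float)objSize.width, 0);
        corners[2] = cv::Point2f((float)objSize.width, (float)objSize.height);
        corners[3] = cv::Point2f(0, (float)objSize.height);
        cv::perspectiveTransform(corners, projected, H);
        return cv::boundingRect(projected);
    }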
Using the dimensions and position of the bounding rectangle, an ROI (Region of Interest) is extracted from the current frame. A mask is created based on the extracted ROI; it isolates the object, creates an image out of it, and covers all of the undesired elements in the frame. This step is illustrated in Fig. 9.
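In OpenCV terms, the masking step could be implemented along these lines (an assumption; the paper does not show the actual code):

    #include <opencv2/core/core.hpp>

    // Build a mask that keeps only the tracked object's bounding rectangle,
    // hiding the rest of the frame from the later histogram computation.
    cv::Mat isolateObject(const cv::Mat& frame, const cv::Rect& roi) {
        cv::Mat mask = cv::Mat::zeros(frame.size(), CV_8UC1);
        mask(roi).setTo(cv::Scalar(255));  // white inside the ROI
        cv::Mat isolated;
        frame.copyTo(isolated, mask);      // black everywhere else
        return isolated;
    }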
Fig. 9. Creating a mask based on the extracted ROI
The isolated image is converted into the HSV color space and used to calculate the object's histogram, as shown in Fig. 10. The histogram is used to calculate the mean value of the colors in the isolated image, and this mean value is then used by the CAMShift algorithm to maintain the lock on the object.
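The histogram and CAMShift machinery can be sketched as follows (illustrative; the 30-bin hue histogram, the BGR input format and the termination criteria are assumptions):

    #include <opencv2/core/core.hpp>
    #include <opencv2/imgproc/imgproc.hpp>
    #include <opencv2/video/tracking.hpp>

    // Compute the hue histogram of the isolated object (already in HSV), then
    // use it to keep the CAMShift window locked on the object in a new frame.
    cv::RotatedRect camShiftLocalize(const cv::Mat& frame,      // BGR frame
                                     const cv::Mat& objectHsv,  // isolated object
                                     cv::Rect& window) {
        // Hue histogram of the object (one channel, 30 bins here).
        int histSize = 30, channels[] = {0};
        float hueRange[] = {0, 180};
        const float* ranges[] = {hueRange};
        cv::Mat hist;
        cv::calcHist(&objectHsv, 1, channels, cv::Mat(), hist, 1,
                     &histSize, ranges);
        cv::normalize(hist, hist, 0, 255, cv::NORM_MINMAX);

        // Back-project the histogram onto the current frame and run CAMShift.
        cv::Mat frameHsv, backProj;
        cv::cvtColor(frame, frameHsv, CV_BGR2HSV);
        cv::calcBackProject(&frameHsv, 1, channels, hist, backProj, ranges);
        return cv::CamShift(backProj, window,
                            cv::TermCriteria(cv::TermCriteria::EPS |
                                             cv::TermCriteria::COUNT, 10, 1));
    }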
Fig. 10. Object's Histogram
3) The CAMShift algorithm uses the mean value and the position of the ROI to keep localizing the object in each frame and to find the sharp edges of the object, which are passed to the Lucas-Kanade algorithm.

4) The Lucas-Kanade algorithm works in parallel with CAMShift and uses the sharp edges defined by CAMShift to update the object's color mean value in each frame, so that CAMShift can use the updated value in the next frame.
Fig. 11. Lucas Kanade working in parallel with the CAMShift algorithm
5) In each frame, the offset (X, Y) between the center of the ROI and the center of the frame is calculated; see Fig. 11. The X and Y values found in this step are sent to the microcontroller through a serial link.

6) The Arduino microcontroller receives the X and Y values and scales each of them to an angle between 0 and 180 degrees. These angles are then converted to Pulse Width Modulation (PWM) signals that control the rotation angles of the two servo motors of the gimbal.
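The on-phone implementation is Java/OpenCV on Android, but the arithmetic of step 5 is simple enough to sketch in C++; the ASCII wire format here matches the assumption made in the earlier Arduino sketch and is not specified by the paper:

    #include <cstdio>
    #include <opencv2/core/core.hpp>

    // Compute the (X, Y) offset between the ROI center and the frame center,
    // and format it for the serial link to the microcontroller.
    void sendOffsets(const cv::Rect& roi, const cv::Size& frameSize) {
        int x = (roi.x + roi.width / 2) - frameSize.width / 2;
        int y = (roi.y + roi.height / 2) - frameSize.height / 2;
        char msg[32];
        std::snprintf(msg, sizeof(msg), "%d %d\n", x, y);
        // msg would now be written to the USB/serial connection; printed here.
        std::printf("%s", msg);
    }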
IV. EXPERIMENTAL SETUP AND RESULTS

In order to compare the newly developed algorithm with existing algorithms, a test setup was created in the physics lab. A tilted air-cushion bar is used to slide the desired object along its top; see Fig. 12. The tracking system is placed 50 cm away from the setup. As the object slides down, the algorithm's ability to keep track of it is monitored. Multiple tests with different speeds were performed. The setup limits the maximum speed to 4.7 km/h.
Fig. 12. Tilted bar with an air cushion
A. Speed Experiment Results

Multiple tests were conducted to compare the new algorithm against CAMShift and Lucas-Kanade in tracking objects at various velocities. In each test, conducted after each algorithm had acquired the lock on the target, the speed of the object was increased by allowing it to travel further down the bar. The results are shown in Table 4.1 below.
TABLE 4.1: SPEED EXPERIMENT RESULTS

Distance (cm)   Avg Speed (cm/s)   New Algorithm   CAMShift   Lucas-Kanade
40              67                 Pass            Pass       Pass
90              81                 Pass            Pass       Pass
100             105                Pass            Pass       Fail
130             132                Pass            Pass       Fail
It is clear from the table that the Lucas-Kanade algorithm lost track of the object in both the 100 cm and 130 cm tests. Meanwhile, since the new algorithm mostly relies on CAMShift to track the object after acquiring the lock, both were able to track the object successfully in all cases; however, CAMShift cannot acquire a lock on an object on its own.

B. Lock Time Experiment Results

Another experiment was conducted to measure how quickly each algorithm acquires the lock on the object. The desired object is placed in a well-lit area on a stable surface, and the time from the moment each algorithm is started until it acquires the lock on the target is recorded. Four algorithms were compared: SURF, BRISK [10], SIFT [11] and FAST [12]. The results of the experiment are shown in Table 4.2. Even though SURF consumes a lot of system resources, the results show that it is the fastest at acquiring the lock among the compared, less resource-intensive algorithms.

TABLE 4.2: ACQUIRING LOCK TIME COMPARISON
Algorithm   Time to lock on (s)
SURF        0.52
BRISK       3.635
SIFT        4.761
FAST        2.492
V. CONCLUSION

In this paper we have presented a compact object tracking system. The system utilizes the image processing and computing power of a commercial smartphone along with servo motors and an Arduino microcontroller. It uses a new algorithm to acquire a lock on the target object and keep track of it: SURF acquires the lock, and a hybrid of Lucas-Kanade and CAMShift then maintains the track. As shown, the new algorithm achieves the best lock-acquisition time because it uses SURF, and it keeps track of the object efficiently as it accelerates because CAMShift and Lucas-Kanade handle that stage.
REFERENCES

[1] MOOG QuickSet, "Video Tracking System," http://www.quickset.com/literature/Space_Defense/QuickSet/Video_Tracking_System.pdf
[2] Tucker Davis Technologies, "Video Capture & Tracking System," http://www.tdt.com/products/RV2.html
[3] H. Bay et al., "Speeded-up robust features (SURF)," Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346-359, 2008.
[4] L. Juan and O. Gwun, "A comparison of SIFT, PCA-SIFT and SURF," International Journal of Image Processing (IJIP), vol. 3, no. 4, pp. 143-152, 2009.
[5] J. G. Allen, R. Y. D. Xu, and J. S. Jin, "Object tracking using CamShift algorithm and multiple quantized feature spaces," in Proceedings of the Pan-Sydney Area Workshop on Visual Information Processing, Australian Computer Society, 2004.
[6] P. P. Dash, D. Patra, S. K. Mishra, and J. Sethi, "Kernel based object tracking using color histogram technique," International Journal of Electronics and Electrical Engineering, vol. 2, no. 4, April 2012.
[7] S. Baker and I. Matthews, "Lucas-Kanade 20 years on: A unifying framework," International Journal of Computer Vision, vol. 56, no. 3, pp. 221-255, 2004.
[8] Arduino Uno official website: arduino.cc/en/Main/arduinoBoardUno
[9] OpenCV official website: http://opencv.org/
[10] S. Leutenegger, M. Chli, and R. Y. Siegwart, "BRISK: Binary robust invariant scalable keypoints," in Computer Vision (ICCV), 2011 IEEE International Conference on, IEEE, 2011.
[11] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.
[12] E. Rosten and T. Drummond, "Machine learning for high-speed corner detection," in Computer Vision - ECCV 2006, Springer Berlin Heidelberg, 2006, pp. 430-443.