VISUAL TRAJECTORY ESTIMATION DEVICE (VTED)

Prepared by:

KHOTSOFALANG NQHOAKI (NQHKHO001)
Department of Electrical Engineering
University of Cape Town

Prepared for:

DR SIMON WINBERG
Radar Remote Sensing Group
Department of Electrical Engineering
University of Cape Town

October 2015

Submitted to the Department of Electrical Engineering at the University of Cape Town in partial fulfilment of the academic requirements for a Bachelor of Science degree in MECHATRONICS ENGINEERING

Key Words: VISUAL TRACKING, TRAJECTORY ESTIMATION, PROJECTILE PREDICTION, ADAPTIVE FILTERS

Declaration

1. I know that plagiarism is wrong. Plagiarism is to use another's work and pretend that it is one's own.
2. I have used the IEEE convention for citation and referencing. Each contribution to, and quotation in, this final year project report from the work(s) of other people has been attributed and has been cited and referenced.
3. This final year project report is my own work.
4. I have not allowed, and will not allow, anyone to copy my work with the intention of passing it off as their own work or part thereof.

Name:

KHOTSOFALANG NQHOAKI

Signature:

Date:


14 October 2015


Acknowledgements

I would like to express my gratitude to my supervisor, Dr Winberg, for allowing me to pursue this project and for his help throughout its course; to my fellow classmates, who assisted me during my experiments; to my family for their support; and to everyone who contributed directly or indirectly to making this project a success.


Abstract

The Visual Trajectory Estimation Device (VTED) is an amalgamation of visual object tracking algorithms and prediction modelling used to estimate the trajectories of projectiles. The PlayStation Eye camera captures visual data, which is then processed on a computer to predict and estimate projectile trajectories. The VTED is a single-camera object tracking system.

Image processing techniques are used in this project for object detection. The system then tracks the detected objects, whose trajectories are plotted on a graph. A Least Mean Squares adaptive filter is used to predict the projectile trajectories, and a convergence analysis of the filter is carried out in order to improve the trajectory predictions. The VTED can be used in sports scene analysis as well as other scenes where projectile motion needs to be analysed.


Table of Contents

VISUAL TRAJECTORY ESTIMATION DEVICE (VTED)
Prepared by:
Department of Electrical Engineering
Prepared for:
Key Words: VISUAL TRACKING, TRAJECTORY ESTIMATION, PROJECTILE PREDICTION, ADAPTIVE FILTERS
Declaration
Acknowledgements
Abstract
Table of Contents
List of Figures
1. Introduction
   1.1 Background to the study
   1.2 Objectives of this study
       1.2.1 Problems to be investigated
       1.2.2 Purpose of the study
   1.3 Scope and Limitations
   1.4 Plan of development
2. Literature review
   2.1 Introduction
   2.2 Image Processing
       2.2.1 Smoothing images
       2.2.2 Conversion between Colour Space
       2.2.3 Background Subtraction
   2.3 Visual Object Tracking
       2.3.1 The working mechanism of Visual Object Trackers
       2.3.2 Feature Descriptors
       2.3.3 Object Tracking Algorithms
   2.4 Projectile Motion and Trajectory Prediction
       2.4.1 Projectile motion
       2.4.2 Recursive Least Squares Filters
       2.4.3 Least Mean Squares Filter
   2.5 Performance Evaluation of VTEDs
   2.6 Software Tools And Hardware
       2.6.1 OpenCV
       2.6.2 Numpy and Scipy
       2.6.3 CUDA
       2.6.4 OpenCL
       2.6.5 PlayStation Eye
   2.7 Section Remarks
3. System Design Methodology
   3.1 Preliminary Analysis
   3.2 Requirements definition
   3.3 System Design
   3.4 Development
   3.5 Integration and Testing
   3.6 Deployment
   3.7 Section remarks
4. VTED design
   4.1 Video capture
   4.2 Pre-processing
   4.3 Object detection
       4.3.1 Detection by color features
       4.3.2 Detection by Geometric features
       4.3.3 Detection by background subtraction and Blob extraction
   4.4 Plotting the trajectory paths
   4.5 Trajectory Parameters Estimation and Prediction stage
   4.6 Performance Testing
   4.7 Section remarks
5. Experimental design
   5.1 Objectives of the experiment
   5.2 Scope of the experiment
   5.3 Experimental setup and apparatus
   5.4 Experimental methodology
   5.5 Section remarks
6. Presentation of Results
   6.1 Camera communication and video capture
   6.2 Pre-processing and Object Detection
       6.2.1 Detection By colour
       6.2.2 Detection By geometrical Features
       6.2.3 Detection by Background subtraction methods and Blob Extraction
       6.2.4 Object detection performance
   6.3 Establishing the Coordinate system
   6.4 Predicting Spherical objects Trajectories
       6.4.1 Effects of varying speeds on projectile predictions
       6.4.2 Effects of varying launch angles on projectiles
   6.5 Predicting Non spherical objects trajectories
   6.6 Convergence of the adaptive filter
7. Discussions and Analysis
   7.1 Camera communication and video capture
   7.2 Pre-processing and Object Detection
       7.2.1 Pre-Processing
       7.2.2 Object detection
   7.3 Establishment of the coordinate system
   7.4 Predicting Spherical objects Trajectories
   7.5 Predicting Non spherical objects trajectories
   7.6 Adaptive filter learning convergence
   7.7 General performance of the VTED
8. Conclusions
9. Recommendations
10. References
11. Appendices


List of Figures

Figure 1 Plan of Development
Figure 2 applying different filters
Figure 3 converting RGB to HSV
Figure 4 converting to grayscale
Figure 5 Top-original frames, bottom-Background subtraction results from [10]
Figure 6 Visual tracking flowchart adapted from [6]
Figure 7 projectile curve adapted from [20]
Figure 8 RLS Filter adapted from Sahoo [26]
Figure 9 System Flow Chart
Figure 10 System overview
Figure 11 Pre-Processing Pipeline
Figure 12 Gaussian Blur and HSV
Figure 13 Pre-Processing summary
Figure 14 A build up from the Pre-processing flow to object detection
Figure 15 Experimental Setup Launcher
Figure 16 Experimental Setup Camera view
Figure 17 Computing Hardware specifications
Figure 18 No drivers Caption
Figure 19 Device manager before installation
Figure 20 Device Manager after installation
Figure 21 Cheese Photobooth capture
Figure 22 Detecting Pink ball
Figure 23 Detection of a Red Pen
Figure 24 Detection of Pink ball from geometrical features
Figure 25 Detected Centre points of pink ball
Figure 26 Detection of Blue Ball as it enters Camera FOV
Figure 27 Object Detection By background subtraction and Blob extraction
Figure 28 background subtraction applied to static scene with stationary ball
Figure 29 Coordinate system
Figure 30 Testing the ball motion for Coordinate transform along x-axis
Figure 31 Results from moving the ball along the camera x-axis on the test rig
Figure 32 Moving ball along ruler on the x axis along test rig
Figure 33 Sample of Frames during launch
Figure 34 Trajectories for high speed launch
Figure 35 Projectile Trajectories for reduced Speeds launch
Figure 36 Projectile Trajectories for low speed launch
Figure 37 Trajectories for Projectile Launch angle 20 degrees
Figure 38 Trajectories for Projectile Launch angle 40 degrees
Figure 39 Trajectories for Projectile Launch angle 60 degrees
Figure 40 Pencil Launch sample frame sequences
Figure 41 Trajectory paths for a red pen
Figure 42 Learning Curve for 5000 iterations with 60 tap estimates
Figure 43 Learning curve with 5000 iterations and 120 tap estimates
Figure 44 Learning curve for 5000 iterations with 180 tap estimates
Figure 45 Learning curve with 5000 iterations and 200 tap estimates

List of Tables

Table 1 summary of state-of-the-art feature descriptors as in [6,7]
Table 2 results from [15]
Table 3 PlayStation Eye specifications
Table 4 Table of HSV boundaries
Table 5 Time taken to detect object of interest


1. Introduction

In this section of the report the Visual Trajectory Estimation Device (VTED) is introduced. The section begins with the motivation that led to the development of the VTED. The objectives of this study are then defined, together with the scope constraints that limit this research's investigation. Finally, we give a pipeline for the plan of development, which outlines what the reader can expect throughout the rest of the document.

1.1 Background to the study

Computer vision has undergone a revolution over the past decade: the power of cameras has increased tremendously, and the desire to mimic the human eye has driven the rapid development of computer vision algorithms. Visual Tracking (VT) is one of the core applications of computer vision; it is often vital to track features in scenes and extract information from them. Real-time scenes often involve high-speed motion; in sporting activities, for instance, it is often necessary to track player motion in soccer as well as ball trajectories. The VTED comes in handy in applications such as these: using visual information we can calculate the trajectory of projectiles and make estimations and predictions based on such data. Computer vision applications normally need intensive image processing; visual information must be processed so that it is easier to extract the parameters relevant to the application of interest. Such processing algorithms include blob extraction, cropping, filtering and background subtraction, all of which will be included in the development of the VTED. The Sony PlayStation Eye, a camera used by PlayStation consoles for gesture recognition, will be used by the VTED for acquiring the visual data to be processed; the choice of this device will be explained throughout the rest of this document.


1.2 Objectives of this study

The major goal of this investigation is to develop the VTED. Here we define a set of objectives that need to be addressed in order to achieve that goal.

1.2.1 Problems to be investigated

Below is the set of questions this investigation addresses:

- How close can we come to making projectile trajectory predictions using a set of algorithms that are part of the VTED?
- How efficient is the device in terms of speed and accuracy?
- Is the hardware chosen for development suited for this sort of application?
- Will the developed VTED perform well in terms of the performance metrics used?

This set of questions covers the problem to be investigated; each question will be addressed over the course of the investigation.

1.2.2 Purpose of the study

The purpose of this study is to develop the VTED, which requires a set of steps. The chosen camera must be able to communicate with the user platform: there must be communication between a processor and the camera in order to carry out calculations and manipulate visual data. The host processor must also be armed with algorithms that process this visual data, and the results must be presented in a user-friendly way. Based on the above, the study needs to address:

- Effective communication of the PlayStation Eye with the host platform
- Development of effective algorithms for the VTED
- Performance analysis of the developed algorithms
- Performance analysis of the chosen hardware in the implementation of the VTED
- Limitations of the developed VTED and improvements that could be made

This study will, however, have limitations; hence the scope of the investigation is defined below.

1.3 Scope and Limitations

The development of the VTED will be done on a desktop PC, with the PlayStation Eye camera as the capture device; other camera models will not be investigated. Since the PlayStation Eye is intended to be cross-platform, this investigation will address communication with the camera on both Windows and Linux platforms; other operating systems will not be investigated. Ubuntu 14.04 and Windows 7 will be the main development platforms. The objects of interest in this investigation are spherical objects, and the focus will be more on tracking algorithms than on object detection methods. Prediction algorithms will be limited to real-time scenes, and the tracking algorithms developed will be based on available state-of-the-art methods. The experimentation done to test the VTED will be based on a controlled environment; clutter and illumination variance will, however, be investigated in order to assess how they affect the predictions.

1.4 Plan of development

Below we present the pipeline followed in this project: how the whole report is structured and how the different sections follow one another.

Figure 1 Plan of Development



2. Literature review

2.1 Introduction

In this section of the paper we explore the state-of-the-art literature concerning visual trajectory estimation devices. We begin by building the knowledge base from the literature and use the funnel method to critique it efficiently. The section starts with an overview of visual trajectory estimation devices, which are mostly used in robotics; we then build upon the important methods that make up VTEDs, and algorithms that analyse their effectiveness and efficiency are discussed.

A general visual object tracking device is composed of four main stages: camera calibration, an object detection algorithm, an object tracking algorithm and a trajectory prediction algorithm [1]. These are normally used in sporting activities such as badminton, soccer and table tennis, where fast motion is anticipated and various computer vision and image processing techniques are therefore employed for better results [2], [3], [4]. The problem is split into image processing, visual object tracking, and trajectory estimation and prediction, whose literature is discussed below.

2.2 Image Processing

Before applying computer vision algorithms it is often necessary to pre-process the images acquired through the capture device, which may be a video camera. These pre-processing techniques are essential in order to effectively eliminate information that is not of interest before we even get to manipulating the data. They include filtering for noise removal, image resizing, smoothing, segmentation and many more.

2.2.1 Smoothing images

Smoothing is an image processing technique which is often applied to images to reduce noise. There are plenty of smoothing algorithms in image processing, but our particular interest is in the median filter and smoothing with a Gaussian kernel. With the median filter, a windowing matrix is defined and the median of all the pixels in the window is calculated; the pixel at the centre of the window is replaced by this median value, and the same approach applies for the mean filter [5]. The Gaussian filter utilizes a 2D Gaussian kernel with which the image is convolved. The 2D Gaussian kernel is defined as [6]:

G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right)    (1)

The image I is convolved with this kernel to obtain the smoothed image. Figure 2 shows an original picture and how it looks after applying the different smoothing algorithms.

Figure 2 applying different filters
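As a rough sketch of how this smoothing step might be carried out with OpenCV's Python interface (the file name and kernel sizes below are illustrative assumptions, not values used in this project):

```python
# Sketch, assuming OpenCV's Python bindings; kernel sizes and file names are placeholders.
import cv2

frame = cv2.imread("frame.png")              # any captured frame
median = cv2.medianBlur(frame, 5)            # 5x5 median filter
gaussian = cv2.GaussianBlur(frame, (5, 5), 1.5)  # 5x5 Gaussian kernel, sigma = 1.5
cv2.imwrite("median.png", median)
cv2.imwrite("gaussian.png", gaussian)
```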

2.2.2 Conversion between Colour Space

Images are made of small elements called pixels. These elements are represented by numbers, and these numbers are often manipulated through matrix computations for image processing. In order to perform some computations efficiently it is often vital to switch between the relevant colour spaces. The colour spaces we are particularly interested in are RGB (Red, Green, Blue), HSL (Hue, Saturation, Lightness) and grayscale. We will often need to switch between these colour spaces, hence we define their conversion as illustrated by [7]. To convert from RGB to HSL we use the relations:

S = \max(R, G, B) - \min(R, G, B)    (2)

H = \begin{cases} \dfrac{G - B}{S} & \text{if } R = \max(R, G, B) \\[4pt] 2 + \dfrac{B - R}{S} & \text{if } G = \max(R, G, B) \\[4pt] 4 + \dfrac{R - G}{S} & \text{otherwise} \\[4pt] \text{undefined} & \text{if } S = 0 \end{cases}    (3)

L = 0.3R + 0.59G + 0.11B    (4)

To convert from HSL to RGB, the above set of equations is inverted. Figure 3 is an example of how an image looks when converted between the two colour spaces.


Figure 3 converting RGB to HSV

To convert an image to grayscale there are three well-known methods found in most open source image editing packages, namely lightness, averaging and luminosity [8]. These are calculated through the relations:

Lightness: (max(R, G, B) + min(R, G, B)) / 2    (5)

Average: (R + G + B) / 3    (6)

Luminosity: 0.21R + 0.72G + 0.07B    (7)

Figure 4 is the result of converting from RGB to different grayscale domains [9].

Figure 4 converting to grayscale
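A small sketch of these conversions, assuming OpenCV and NumPy and a placeholder input file; the built-in cv2.cvtColor calls and manual implementations of equations (5)-(7) are shown side by side:

```python
# Sketch of colour space conversions; input file name is a placeholder.
import cv2
import numpy as np

frame = cv2.imread("frame.png")                  # OpenCV loads images as BGR
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)     # BGR -> HSV
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # built-in grayscale conversion

# Manual grayscale conversions following equations (5)-(7).
b, g, r = [frame[:, :, i].astype(np.float32) for i in range(3)]
lightness = (np.maximum(np.maximum(r, g), b) + np.minimum(np.minimum(r, g), b)) / 2
average = (r + g + b) / 3
luminosity = 0.21 * r + 0.72 * g + 0.07 * b
```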

2.2.3 Background Subtraction

In scenes containing many objects that are not of interest to the computer vision application, it is often necessary to apply background subtraction. Various methods exist, but the one of particular interest in this application is the method by [10], in which a Gaussian Mixture-based background/foreground segmentation algorithm is used; this algorithm is implemented in OpenCV as BackgroundSubtractorMOG [11]. Results from the paper in [10] are presented in Figure 5.


Figure 5 Top-original frames, bottom-Background subtraction results from [10]
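A minimal sketch of applying such a subtractor to a live stream with OpenCV's Python bindings; the constructor name varies between OpenCV releases (cv2.BackgroundSubtractorMOG in 2.x, cv2.createBackgroundSubtractorMOG2 in later versions), and the camera index and parameters below are assumptions:

```python
# Sketch: Gaussian-mixture background subtraction on a video stream.
import cv2

cap = cv2.VideoCapture(0)                                # camera index 0 is an assumption
subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=25)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    fg_mask = subtractor.apply(frame)                    # white pixels mark moving foreground
    cv2.imshow("foreground mask", fg_mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```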

2.3 Visual Object Tracking

Visual object tracking is the idea of using object descriptors to locate an object in a scene and to estimate the trajectory of the object in the image plane as it moves [12]. This is very challenging because of the data loss experienced when 3D real-world data is projected onto a 2D image. Illumination changes, noise and background clutter are some of the challenges facing visual object tracking. In addition, there are factors that must be considered when solving the problems faced by visual trackers. Yang et al. in [13] propose that the following factors must be considered:

- Robustness: under all possible conditions the tracking algorithm must still be able to perform well on its region of interest.
- Adaptivity: should the object of interest experience changes such as illumination variance, there must be an adaptation algorithm ensuring the changes do not affect the tracking algorithm.
- Real-time processing: the algorithms used must be sufficient to track fast-moving objects in real-time events such as those experienced in sporting activities.

2.3.1 The working mechanism of Visual Object Trackers

Below we outline how visual object tracking works, as adapted from Yang et al. in [13]. We then expand on each block with the help of other authors' views on robust and well-performing methods. Figure 6 is a flow chart of how these trackers work:


Figure 6 Visual tracking flowchart adapted from [6]

The above general flow chart may be altered for a specific tracking application. The general scheme consists of a video capture device from which we extract object descriptors and integrate context information to help in tracking decisions; these decisions are then used to update the model, whose errors and modifications are incorporated back into the video analysis framework to robustly track the target of interest [13].

2.3.2 Feature Descriptors

A feature is a distinct characteristic of an object that may partially or fully describe it. In image processing, feature descriptors are normally used in image search algorithms, which are widely used in forensics, social networks and even biometrics [14]. Feature descriptors are also very important in computer vision algorithms for object identification; features need to be unique so that objects of interest are easily distinguishable from the background and when occlusions occur [13]. The state-of-the-art feature descriptors are briefly summarized below, after which we mention shape detection algorithms based on feature descriptors, leading to the bottom of the funnel: ball detection based on features. More details on feature descriptors can be found in [14].

Gradient features: these use shape/contours to represent objects; some methods utilize a statistical sum of the gradients.

Colour features: colour spaces are used to describe object features; colour histograms and colour moments are also used.

Texture features: smoothness and other intensity variations of a surface are used to describe features.

Spatio-temporal features: characteristic motion and shape are captured in video frames and provide a relatively independent representation of events with respect to their spatio-temporal shifts.

Multiple feature fusion: a combination of any available feature descriptors, depending on the area of application.

Table 1 summary of state-of-the-art feature descriptors as in [6,7]

i. Shape detection based on Features

The above feature descriptors can be used to detect various geometric shapes; they can either be combined or used individually to achieve robustness. We focus here on coloured geometric shapes, which means the descriptors of interest are gradient-based and/or colour-based. Since the scope of this research is towards balls as circular shapes, we explore circle detection algorithms in detail and how feature descriptors help us achieve circle detection. In 1962 Paul Hough introduced a method famously known as the Hough Transform [15], which can be used to detect various geometric shapes. However, due to its computational complexity and storage requirements, various alternatives have been invented, including modified versions of the Hough transform [16].
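For reference, OpenCV exposes a gradient-based variant of the circular Hough transform; the following sketch (with illustrative, untuned parameters and the cv2.HOUGH_GRADIENT flag of OpenCV 3.x rather than the 2.x constant) shows how it might be called:

```python
# Sketch: circle detection with OpenCV's built-in Hough transform; parameters are placeholders.
import cv2
import numpy as np

frame = cv2.imread("frame.png")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
gray = cv2.medianBlur(gray, 5)                   # smoothing reduces false circle detections

circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1, minDist=40,
                           param1=100, param2=30, minRadius=5, maxRadius=80)
if circles is not None:
    for x, y, r in np.round(circles[0]).astype(int):
        cv2.circle(frame, (x, y), r, (0, 255, 0), 2)    # circle outline
        cv2.circle(frame, (x, y), 2, (0, 0, 255), 3)    # detected centre
```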

The most relevant recent publication is by Chen et al. [17], who propose a method in which the video frame is first pre-processed to eliminate noise. The image is then discretized and its resolution reduced for fast computation, after which two segmentation methods are used and the radius and centre of the circle are calculated. The first stage of the method proposed by Chen et al. is similar to Canny edge detection [11], after which the circle detection is implemented. In [17] a feature-point neighbourhood segmentation method is used to work out the centroid of curves. The image matrix is scanned by columns and rows; feature points are classified into one array, and the distance between such feature points must be close to a set threshold value. After scanning, the coordinates of different circular curves lie in different arrays, and segmentation has thus been achieved. The centroid of a circle, since it is a basic geometric shape, is calculated in [17] using the equations below:


x_0 = \frac{1}{n} \sum_{i=1}^{n} x_i    (8)

y_0 = \frac{1}{n} \sum_{i=1}^{n} y_i    (9)

Here x_i and y_i are the horizontal and vertical coordinate values on the curve, n is the number of feature points available on the curve, and x_0 and y_0 are the calculated centroid coordinates. Chen et al. in [17] note that the variance of the distances from the centroid is zero because of the geometry of a circle, although this value might change due to irregularities of the circle encountered in real-time processes. After computing the centroid, Chen et al. [17] calculate the centre and radius of the circle using the straight-line Hough method: geometrically, the centre of the circle lies where the longest equal chords (the diameters) meet. The method is to create vertical and horizontal scan lines, find the top and bottom points where each line intersects the circle, and calculate the length of the horizontal chord; the same is done vertically. The two greatest values from the horizontal and vertical searches are taken, and the centre is their intersection; the radius is half the diameter. To verify the centre point, another search may be done at 45 degrees to the horizontal.
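A trivial NumPy sketch of the centroid computation in equations (8) and (9), where `points` is a hypothetical (n, 2) array of feature-point coordinates:

```python
# Sketch: centroid of a set of feature points as in equations (8) and (9).
import numpy as np

def centroid(points):
    """Return (x0, y0), the mean of the feature-point coordinates."""
    x0 = points[:, 0].mean()
    y0 = points[:, 1].mean()
    return float(x0), float(y0)
```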

Table 2 results from [15]

The results in Table 2 show that the method developed in [17] is fast enough for real-time applications; it is faster and less computationally intensive than the ordinary Hough transform. We now look at other methods that have been used in sporting activities in order to evaluate which methods are worth implementing. Another method was developed by Scaramuzza et al. in [18], called the "pixel to pixel" algorithm. It works very differently from the Hough-based methods: it acts directly on the binary edge map, where a pixel is assigned 1 if it is an edge point and 0 otherwise. After obtaining the raw image edge map, a windowing matrix of order (2k+1)·(2k+1) is defined, the image dimensions are acquired, and an accumulator space with those dimensions is created with all values set to 0. We then find the next edge point by considering points within the defined windowing matrix; the edge points inside the window are counted, and their two extremes, call them A and B, are found. If A and B are not aligned with the centre of the windowing matrix, a line perpendicular to the vector AB is found. The next step is to determine the direction of concavity from the second derivative along this perpendicular and to plot the line in the accumulator space towards that direction. We iterate until the accumulator space is filled, after which a smoothing filter is applied; a global maximum is found whose coordinates give the circle location. The method described in [18], however, depends on the choice of the variable k which defines the windowing matrix. It can nevertheless be robust if combined with other fast methods in parallel or for voting purposes.

2.3.3 Object Tracking Algorithms

Once the region of interest has been confirmed, the next step is to track the object's motion. In our case we are most interested in tracking circular objects, and high-speed tracking algorithms need to be explored since VTEDs work on fast real-time data. This section covers the rest of the framework proposed by Yang et al. in [13].

i. Kalman filter

The Kalman filter is a fusion algorithm based on the fact that the product of two Gaussian distributions is a Gaussian distribution; it is more than 50 years old and is still one of the best tracking algorithms in the literature [19]. The Kalman filter combines a measurement update and a prediction update to achieve tracking [20]. It defines a state transition equation as follows:

x_t = F_t x_{t-1} + B_t u_t + w_t    (10)

where x_t is the state containing the term of interest at time t and u_t is the input perturbation, which according to Deori in [20] is zero since we are not interested in controlling anything in visual tracking but only in the predictive part of the algorithm at zero input. The variable w_t is the Gaussian process noise vector and F_t is the state transition matrix. We also define the measurement equation, which helps in updating the state; according to Faragher in [19], the best estimate of the current state is obtained by combining our knowledge from the prediction and the measurement. The measurement equation is defined as:

z_t = H_t x_t + v_t    (11)

where z_t is the measurement vector at time t, H_t is the measurement matrix mapping the state vectors into the measurement domain, and v_t is the Gaussian measurement noise with covariance R_t. We also define two other variables: the Kalman gain K_t and the covariance matrix P_t, which fully describes the Gaussian functions by providing both their variances and covariances. The derivation of the Kalman filter is beyond the scope of this paper but is described from first principles by Faragher in [19]. Below we provide the final equations without derivation. For the prediction stage:

x_{t|t-1} = F_t x_{t-1|t-1}    (12)

P_{t|t-1} = F_t P_{t-1|t-1} F_t^{T}    (13)

For the measurement update we have:

x_{t|t} = x_{t|t-1} + K_t (z_t - H_t x_{t|t-1})    (14)

P_{t|t} = P_{t|t-1} - K_t H_t P_{t|t-1}    (15)

K_t = P_{t|t-1} H_t^{T} (H_t P_{t|t-1} H_t^{T} + R_t)^{-1}

At this stage we also acknowledge the existence of other versions of the Kalman filter, including the Extended Kalman filter, which is a nonlinear version [21]. Due to the linear nature of the Kalman filter, other alternatives need to be reviewed; the next section discusses the particle filter.
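As an illustration only, one predict/update cycle of equations (12)-(15) could be written in NumPy as below; the process-noise covariance Q added to the covariance prediction is an assumption (a term the equations above omit), and all matrices are assumed to be supplied by the caller:

```python
# Sketch: one predict/update cycle of a linear Kalman filter.
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """Return the updated state estimate x and covariance P given measurement z."""
    # Prediction (equations 12-13, with process noise covariance Q added)
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Measurement update (equations 14-15)
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)   # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = P_pred - K @ H @ P_pred
    return x_new, P_new
```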

ii. Particle filter

The particle filter is a statistical method used in tracking; it tries to avoid the Gaussian assumptions of the Kalman filter and is a numerical method for approximating the nonlinear Bayesian filtering problem [22]. We shall elaborate on the details of the algorithm as described by Gustafsson in [22], but first give an overview. Suppose we want to track a feature: we need to know something related to that feature, there should be something we can measure that is related to it, and there should be a relationship between the feature and the object's motion; a non-mathematical explanation of this prior information is given in detail in [23]. For a particle filter, the conditional state density at time t is represented by a set of samples called particles, each with a sampling probability. A common sampling scheme is used to generate new samples: random samples are selected, the prediction is made by generating new samples, and the correction is made by computing new sampling probabilities based on the measurements [20]. We use Gustafsson's notation [22] to describe the algorithm mathematically. The number of particles N is chosen, along with the proposal distribution q(x_{k+1} | x_{1:k}, y_{k+1}). The weights are initialized by generating x_1^i ~ p_{x_0}, i = 1, ..., N, and letting w_{1|0}^i = 1/N. Iterating over k = 1, 2, 3, ..., the measurement update is:

w_{k|k}^{i} = \frac{1}{c_k} \, w_{k|k-1}^{i} \, p(y_k \mid x_k^{i})

From the above it is clear that c_k normalizes the sampling probabilities, and it is defined as the sum of the previous weights multiplied by the corresponding likelihoods. The filtering density and mean can then be calculated using standard statistical definitions, after which predictions can be made based on the proposal distribution. This is the basic particle filter with no modifications, as described by Gustafsson in [22]; it may need many modifications depending on the application of interest.
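A minimal bootstrap particle filter step in this spirit might look as follows; `propagate` and `likelihood` are hypothetical user-supplied motion and measurement models, and multinomial resampling is used, which is only one of several possible schemes:

```python
# Sketch: one iteration of a bootstrap particle filter.
import numpy as np

def particle_filter_step(particles, weights, measurement, propagate, likelihood):
    """Propagate N particles, reweight them by the measurement likelihood and resample."""
    particles = propagate(particles)                      # draw from the motion model
    weights = weights * likelihood(measurement, particles)
    weights = weights / weights.sum()                     # c_k: normalise the weights
    n = len(weights)
    indices = np.random.choice(n, size=n, p=weights)      # multinomial resampling
    return particles[indices], np.full(n, 1.0 / n)
```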

iii. Online based Tracking

Appearance variations in tracked objects may cause problems for tracking algorithms, and various methods have been proposed for online object tracking [24]. Online visual tracking estimates the target state forward in time without future observations [25]. The general approach is to update the object model online as the target region changes; a learning algorithm is therefore needed to detect the changes and help the tracking algorithm adapt [13].

iv. Remarks on object tracking algorithms

The problem Kalman filters face is that the state variables are assumed to be Gaussian; for systems whose state variables do not follow the Gaussian distribution, the Kalman filter will give very poor estimates, and other methods may be necessary [20]. This can be avoided by using the particle filter, whose basics have been explained in [22]. It may also be necessary in real-time applications to modify the parameters of the tracked object, and online learning-based trackers will be important for such feature updates. All these algorithms should be used with care depending on the need.

2.4 Projectile Motion and Trajectory Prediction

While the tracking algorithms are implemented, we also need to make predictions about the projectile trajectories based on the acquired parameters; in this section we describe these methods based on the literature.

2.4.1 Projectile motion

Projectile motion is a form of motion in which an object thrown upwards follows a parabolic trajectory as the influence of gravity accelerates it downwards. Below is a diagram showing this form of motion.

Figure 7 projectile curve adapted from [20]

Figure 7, adapted from [26], shows this kind of parabolic trajectory. The motion happens in both the x and y directions; there is also a z component, which is irrelevant here since it normally stays constant, and this assumption will be used for this investigation.

Equations can be developed from the above curve; a full derivation is given in Appendix A, and in this section we present the useful results as given in [26]. Since we will be dealing with projectile parameters, determining the velocity vector is very important. It is defined as:

\begin{bmatrix} V_x \\ V_y \end{bmatrix} = V_0 \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix} - \begin{bmatrix} 0 \\ g t \end{bmatrix}    (16)

where V_0 is the initial velocity, g is the acceleration due to gravity and t is the instantaneous time of flight. The displacement vector can be found by integrating the velocity vector with respect to time. To make estimations about projectiles in real time, this velocity vector comes in handy, since we can easily find the initial velocity using computer vision algorithms, which helps us make the necessary predictions.
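As a simple illustration, equation (16) can be integrated in closed form to sample a predicted trajectory from an estimated launch speed and angle; the values below are placeholders, not measurements from this project:

```python
# Sketch: sampling the projectile model of equation (16).
import numpy as np

G = 9.81  # m/s^2

def predict_trajectory(v0, theta_deg, t_end=2.0, steps=100):
    """Return arrays of (x, y) positions obtained by integrating the velocity vector."""
    t = np.linspace(0.0, t_end, steps)
    theta = np.radians(theta_deg)
    x = v0 * np.cos(theta) * t
    y = v0 * np.sin(theta) * t - 0.5 * G * t ** 2
    return x, y

xs, ys = predict_trajectory(v0=6.5, theta_deg=40)   # e.g. 6.5 m/s at 40 degrees
```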

2.4.2 Recursive Least Squares Filters

The Recursive Least Squares (RLS) filter is an adaptive filter whose inputs are normally deterministic; it can be used to make estimations and predictions through cost function minimization [27]. The RLS uses linear cost functions instead of the polynomial cost functions used by similar statistical estimation methods. In order to estimate the projectile trajectories, we need the previous trajectory gradients each time we make a prediction. The RLS filter comes in handy in this respect, as it performed better than the other prediction algorithms studied by Sahoo in [27]. Figure 8 shows how it works:

Figure 8 RLS Filter adapted from Sahoo[26]

The algorithm selects filter coefficients and updates the filter as new gradients are calculated. The cost function to be minimized depends on the error signal, calculated as:

e(n) = d(n) - \hat{d}(n)    (17)

The cost function to be minimized is therefore:

C(n) = \sum_{i=1}^{n} \beta(n, i) \, |e(i)|^{2}    (18)

The weighting factor β(n, i) lies between 0 and 1 for all iterations. It is directly related to the forgetting factor, which exponentially reduces the effect of old error samples. The forgetting factor relates to the weighting factor through the relation:

\beta(n, i) = \gamma^{\,n-i}    (19)

where γ is the forgetting factor.

The RLS converges quickly even though its complexity is higher. Sahoo in [27] compares the RLS with other methods on a set of metrics, and the RLS was found to outperform the rest of the algorithms despite its computational complexity.

2.4.3 Least Mean Squares Filter

The Least Mean Squares (LMS) filter is another useful adaptive filter used for cost function minimization. It adjusts the filter coefficients to minimize the cost function [28], and its computational complexity is lower than that of the RLS filter since no matrix operations are involved. To update the filter coefficients, the filter first calculates the expected output y(n) and then the error as in equation (17). The important step in this filter is the coefficient update, which is carried out by:

\vec{w}(n+1) = \vec{w}(n) + \mu \, e(n) \, \vec{u}(n)    (20)

where u(n) is the input vector, μ the step size and w(n) the coefficient vector. There are other modified versions of the simple LMS filter, including the Normalized LMS and the Leaky LMS, which minimize different versions of the cost function [28].
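A plain LMS predictor along the lines of equation (20) might be sketched as below; the step size and tap count are illustrative assumptions (the tap count merely echoes the values explored later in the convergence experiments):

```python
# Sketch: LMS adaptive prediction of the next sample of a signal from its previous `taps` samples.
import numpy as np

def lms_predict(signal, taps=60, mu=0.01):
    """Run LMS over `signal`, returning predictions and the squared-error learning curve."""
    signal = np.asarray(signal, dtype=float)
    w = np.zeros(taps)                        # filter coefficients
    predictions, errors = [], []
    for n in range(taps, len(signal)):
        u = signal[n - taps:n][::-1]          # most recent samples first
        y = w @ u                             # filter output (predicted sample)
        e = signal[n] - y                     # prediction error
        w = w + mu * e * u                    # coefficient update, equation (20)
        predictions.append(y)
        errors.append(e ** 2)
    return np.array(predictions), np.array(errors)
```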

2.5 Performance Evaluation of VTEDs

It is important to evaluate the performance of tracking algorithms in order to improve them for future applications. Bashir and Porikli in [29] defined both frame-based and object-based evaluation methods; we consider here those relevant to this study. Following [29], we first make a few definitions:

True positive (TP): number of frames where the system results and the ground truth agree on the presence of an object.
Total ground truth (TG): total number of frames for the objects in the ground truth.
False positive (FP): number of frames where the ground truth does not contain the object of interest but the system falsely reports it.

Bashir and Porikli [29] define the tracker detection rate (TRDR) and positive prediction as follows:

TRDR = \frac{TP}{TG}    (21)

Positive\ Prediction = \frac{TP}{TP + FP}    (22)

Having defined these, they can easily be used to evaluate system performance during testing. Bashir and Porikli [29] also define an object tracking error (OTE). Let N_rg be the number of overlapping frames between the system results and the ground truth, let the centroid of the object in the i-th ground-truth frame be (x_i^g, y_i^g), and let the centroid of the object in the corresponding system frame be (x_i^r, y_i^r). The object tracking error is then:

OTE = \frac{1}{N_{rg}} \sum_{i \in g(t_i) \wedge r(t_i)} \sqrt{(x_i^{g} - x_i^{r})^{2} + (y_i^{g} - y_i^{r})^{2}}    (23)

According to [29], this method is good enough for quick and simple evaluations.
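A small sketch of how these three metrics might be computed from per-frame results; the input arrays are hypothetical:

```python
# Sketch: TRDR, positive prediction and OTE from per-frame tracking results.
import numpy as np

def evaluate(detected, ground_truth, sys_centroids, gt_centroids):
    """detected/ground_truth: boolean arrays per frame; *_centroids: (N, 2) arrays."""
    tp = np.sum(detected & ground_truth)          # both agree the object is present
    fp = np.sum(detected & ~ground_truth)         # system reports an absent object
    tg = np.sum(ground_truth)
    trdr = tp / tg                                # equation (21)
    positive_prediction = tp / (tp + fp)          # equation (22)
    overlap = detected & ground_truth             # frames used for the OTE sum
    ote = np.mean(np.linalg.norm(sys_centroids[overlap] - gt_centroids[overlap], axis=1))
    return trdr, positive_prediction, ote
```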

2.6 Software Tools And Hardware

In order to take full advantage of the state-of-the-art algorithms and methods, some software tools are needed; the software will be integrated with the hardware to achieve the goals of this project. In this study the tools relevant to the development of VTEDs are explored.

2.6.1 OpenCV

OpenCV is an open source computer vision library first developed by Intel in 1998; it is released under a BSD license and is hence free for both academic and commercial use [11]. It is also a machine learning library armed with many state-of-the-art algorithms, which can be used in object detection, object tracking, biometrics and other image processing and computer vision applications [30]. OpenCV has C, C++, Python, Java and MATLAB interfaces and is compatible with Linux, Windows, Android and Mac OS. OpenCV is compatible with CUDA and OpenCL and can therefore be used in fast, real-time applications as well as applications that feature intense graphics [30].

OpenCV has a GPU module with a large number of functions for manipulating images. It includes a performance benchmark that runs GPU functions with different parameters, and these can be tested on different datasets [30]. This allows parallelization of computer vision and image processing algorithms for faster implementations. OpenCV also supports stereo vision with multiple cameras and can therefore be used for building scenes from multiple views [11]. OpenCV can also be used on mobile devices, including smartphones, which have both their GPU and CPU built on one chip [30].

2.6.2 Numpy and Scipy

Numpy is a numerical library for scientific computing in python [31]. It includes useful linear algebra tools, Fourier transform and other numerical tools. It is famously known for its good traits in dealing with arrays and vectors. Numpy also has tools for intergrating c/c++ tools in the python code in order to improve performance and it is also licensed under the BSD license [31]. Scipy is a scientific computing package for python. It is often refered to as the scipy stack and it includes many other open source packages for scientific computing [32]. Packages included in the scipy stack are: pandas- which provides high performance data structures, Ipython-which provides a rich interactive interface, Matplotlib-which provides plotting functions for both 2D and 3D data, sympy which is for symbolic mathematics and algebra. On top of this, the stack also provides many other functionalities including statistics and the signal processing functionalities [32]. This libraries and packages are essential for manipulating image and video data, matrix and array manipulation are essential for image processing hence why numpy will be a relevant package. For general signal processing, scipy already has pre-defined functions which are armed with filters and many other signal processing tools. 2.6.3

2.6.3 CUDA

CUDA is a parallel computing platform and programming model invented by NVIDIA [33]. It is used in many high-performance applications as it harnesses the power of GPUs to increase performance. CUDA only works on NVIDIA GPUs, even though its performance is notable [34]. CUDA can be used to speed up computer vision algorithms, especially for fast real-time applications such as sports scenes [30].

2.6.4 OpenCL

OpenCL is the first open standard for cross-platform parallel programming of CPUs and GPUs [34]. OpenCL works on different kinds of processors from various vendors and is also used to speed up engineering algorithms, including computer vision for real-time applications; OpenCL can be used with the OpenCV GPU module to achieve this [30]. OpenCL supports Python, C/C++, Java and other programming languages, and it can also run on FPGAs and DSPs.

2.6.5 PlayStation Eye

The PlayStation Eye is a camera from Sony designed for the PlayStation consoles; it is normally used for motion gesture recognition. Table 3 below lists the PlayStation Eye specifications [35].

Table 3 PlayStation Eye specifications

Even though there are no freely available drivers for the PlayStation Eye, this camera is well suited to computer vision applications and it is fairly cheap compared to other cameras. This project will attempt to produce freely available drivers for Windows platforms, since the latest Linux distributions can already communicate with the PlayStation Eye.

2.7 Section Remarks

In this section we have discussed the basics that make up the VTED and considered relevant literature with similar methods. We have discussed image processing methods relevant to the development of the VTED, including filtering through smoothing as well as colour conversion methods. We have also discussed the basics of visual object tracking with relevance to spherical object tracking, and methods for tracking trajectory paths. We have established that visual object tracking methods should be robust, adaptive and able to handle real-time events. Visual tracking methods work on feature descriptors; these descriptors allow the object of interest to be defined, after which tracking algorithms are applied. We have discussed several feature descriptors, including colour-based, gradient-based, spatiotemporal and texture-based descriptors. Some of these object detection methods will be used in this project; for robustness, the multi-feature fusion proposed in [14] will be followed.

Geometrical feature descriptors that include the application of the Hough transform will also be used, since they are widely used in the literature. The tracking algorithms discussed are the particle filter, the Kalman filter and online tracking methods. These methods have their pros and cons, which is why methods for evaluating them have also been discussed. The evaluation methods covered are both object-based and frame-based, and they will be used in the methodology to ensure the trackers perform optimally. Software tools needed to solve the problem at hand have also been discussed: OpenCV will be the primary tool, since it provides most state-of-the-art image processing and computer vision algorithms. NumPy and SciPy will be very useful for scientific computing and for handling matrix and vector manipulation. We have also discussed high-performance computing tools, CUDA and OpenCL, which will be necessary since real-time processing requires fast computation. Finally, we have presented the features of the PlayStation Eye camera that motivate its choice for computer vision applications. The next section discusses the methodology used, including hardware tools as well as the experimental setup and design.


3. System Design Methodology

Below the system design lifecycle is described. This section describes how the VTED system was designed; a system-of-systems approach was used in the design of the VTED.

3.1 Preliminary Analysis

In this stage the problem is defined and its feasibility assessed; the preliminary analysis considers the project's shortcomings, and if there are barriers, it is at this stage that they must be addressed. With the VTED no barriers were found: the social and ethical issues of developing the system were assessed, and no barriers to deploying the system were identified, since experimentation will be done under controlled laboratory conditions.

3.2 Requirements definition

The requirements of the system have already been described in the introduction; the system scope is restricted so as to efficiently address the goals of the system. The system focuses more on the algorithmic design than on software issues that have nothing to do with the algorithmic development. For the algorithms, we focus on speed and efficiency. The system is expected to perform comparably to similar systems that have already been designed.

3.3 System Design

The system design section outlines the whole system; the methods used in the development of the VTED are described in detail. During the system design it is also vital to return to the requirements to make sure the design meets the objectives of the project. The system design covers hardware interfaces as well as software integration.

3.4 Development

The development of the VTED will be done using Python and the Open Computer Vision library (OpenCV); the scientific Python stack will also be used to develop the algorithms. The code will be written following the object-oriented model as much as possible. The code will be commented and made user friendly so that anyone using it can understand each step as well as modify it for their own purposes.


3.5 Integration and Testing

The different classes will be brought together in a testing environment where a Linux machine runs the software and the camera is connected to it. The experimental rig will be set up in order to test the system, and the system will be tested in a laboratory under controlled conditions. Testing will be done with objects of different shapes and colours to ensure robustness and versatility.

3.6 Deployment

The VTED is aimed at deployment in sporting activities; it will be useful for analysing trajectories of projectiles and can assist judging in sports such as tennis, table tennis and volleyball. Its algorithms can be used to make predictions in matches. It can also be extended to industrial applications such as the analysis of falling bodies in plants.

3.7 Section remarks

In this section we have established a simple design methodology for the software lifecycle; the details of the software design follow in the next section. A system-of-systems approach will be used together with an object-oriented model: different classes will depend on each other and will all be used in a main function to implement the entire system.


4. VTED design

This section of the report outlines the methodology used in the development of the VTED. We give an overview of how the trajectory estimation and prediction algorithms are implemented, followed by details of each block. Figure 9 is a flow chart of the whole software design.

Figure 9 System Flow Chart

An overview of the whole system can be seen in Figure 10:


Figure 10 System overview

4.1 Video capture

Before we process the video information, a capture device is needed; in our case the capture device is the PlayStation Eye. The PlayStation Eye works well on the latest Linux kernels through the Video4Linux (V4L) drivers. However, the system is also meant to run on Windows. Although the rest of the tests and experimentation will be done on Linux, we present here the driver development for Windows for completeness. The source code for the latest V4L drivers was downloaded and modified to work on Windows using the libusb library for Windows, which establishes communication between USB devices and the computer. Communication between the computer and the camera was verified using a USB sniffer; the sniffer was run on both Linux and Windows to make sure that the results were similar. The addresses and the values assigned to these addresses were also verified from Linux, the code was compiled on Windows and a binary file was produced according to Microsoft's driver development standards. The results obtained for the driver development are shown in the Presentation of Results section of this document. A basic test script was written for the OpenCV capture function; using this test script, capture was possible after opencv_ffmpeg.dll was added to the relevant path. These drivers were not robust enough, so more work would be needed to improve their functionality; for the purpose of testing the rest of the VTED's functionality, the implementation was done on Linux.
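For reference, a minimal OpenCV capture loop of the kind used for the basic test script might look as follows. This is only a sketch: the device index 0 and the frame-rate request are assumptions, since the exact script is not reproduced here, and driver support for the FPS property varies.

import cv2

cap = cv2.VideoCapture(0)              # assume the PlayStation Eye enumerates as device 0
cap.set(cv2.cv.CV_CAP_PROP_FPS, 30)    # request a frame rate; support depends on the driver

while True:
    ret, frame = cap.read()            # grab one frame from the camera
    if not ret:
        break                          # capture failed or stream ended
    cv2.imshow("capture", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break                          # quit on 'q'

cap.release()
cv2.destroyAllWindows()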


4.2 Pre-processing

Before doing the actual computations, the frames involved must be pre-processed to eliminate noise as well as to reduce the number of computations the processor has to handle during real-time processing. These processing methods have been discussed under image processing in the literature review. After capture, each frame is blurred using a Gaussian filter with the kernel size calculated from 𝜎 = 3; this blurs the image and removes noise. The blurred frame is converted to HSV using the formulae given by [7]. The HSV values are then calculated for the ROI and stored in a NumPy array. Figure 11 shows a diagram of the whole pre-processing pipeline:

Figure 11 Pre-Processing Pipeline
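A minimal OpenCV sketch of this pipeline is shown below. The 19x19 kernel size is an assumption derived from the common rule of thumb that the kernel should span roughly three standard deviations on each side of the centre for 𝜎 = 3; the exact kernel size used by the VTED is not restated here.

import cv2
import numpy as np

def preprocess(frame, sigma=3):
    # Gaussian blur: kernel size chosen to cover roughly +/- 3 sigma (assumption).
    ksize = int(6 * sigma + 1)                 # 19x19 for sigma = 3 (must be odd)
    blurred = cv2.GaussianBlur(frame, (ksize, ksize), sigma)
    # Convert the blurred BGR frame to the HSV colour space.
    return cv2.cvtColor(blurred, cv2.COLOR_BGR2HSV)

def roi_hsv_stats(hsv, x, y, w, h):
    # Mean H, S and V of a rectangular region of interest, stored in a NumPy array.
    roi = hsv[y:y + h, x:x + w]
    return np.array([roi[..., 0].mean(), roi[..., 1].mean(), roi[..., 2].mean()])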

The pre-processing functions speed up the subsequent algorithms, since there is less clutter and the computations that follow are only applied to relevant pixels. Another pre-processing technique that increases the speed of later computations is background subtraction, but this only proves vital after the relevant object detection methods have been implemented: once the relevant object is identified, it can be processed as a silhouette to speed up computations. Figure 12 is an example of how the experimental space looks after applying both the Gaussian blur and the HSV conversion.

Figure 12 Gaussian Blur and HSV


We will utilise OpenCV's background subtraction on the experimental rig; background subtraction will be important for object detection by blob extraction. The class used from OpenCV is BackgroundSubtractorMOG, a Gaussian mixture-based background/foreground segmentation algorithm as seen in [10] and already discussed in the literature review. We summarise the pre-processing sub-processes in Figure 13; the sub-process blocks are those responsible for pre-processing.

Figure 13 Pre-Processing summary


4.3 Object detection

The main aim of this investigation concerns spherical objects, hence the object detection is designed for spherical object detection. In this section we use the feature descriptors listed in Table 1 of the literature review. Since object detection may take too much computational power to find the ROI, we use the OpenCV implementations of the relevant algorithms and also make use of fast NumPy array manipulations to speed up these computations.

4.3.1 Detection by colour features

From the pre-processing stage we return HSV values which are relevant for object detection. We select the relevant region of interest according to the experimental setup, which is described in the next chapter. From the calculated HSV values we find the threshold as well as the upper and lower bounds; this thresholding allows us to compute the average colour, so the ball can be identified by colour and its colour features tracked. This alone, however, is not robust enough, and it might be necessary to identify the object of interest from its geometrical features.
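A minimal sketch of colour-based detection with OpenCV is given below. It assumes the lower and upper HSV bounds have already been found as described above; the contour-based centroid extraction is one common way of locating the ball and is not necessarily identical to the VTED code.

import cv2
import numpy as np

def detect_by_colour(hsv, lower, upper):
    # Keep only pixels whose HSV values fall inside the colour bounds.
    mask = cv2.inRange(hsv, np.array(lower), np.array(upper))
    # OpenCV 2.x: findContours returns (contours, hierarchy).
    contours, _ = cv2.findContours(mask.copy(), cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)   # assume the ball is the largest region
    m = cv2.moments(largest)
    if m["m00"] == 0:
        return None
    # Centroid of the detected region in pixel coordinates.
    return (int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"]))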

4.3.2 Detection by Geometric features

We also want to track the ball by its geometric features, which are not prone to illumination variance. This is done through the Hough transform, described in the literature review. The Hough transform is applied to the area of interest: we pass the threshold values calculated earlier before finding circles in the image using the Hough transform. We then identify these features and continue to tracking them. It should also be noted that the centroid of the ROI is passed on to the next functionality, which involves drawing the trajectory paths.
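A hedged sketch of this step is shown below. The numeric parameters are placeholders (the values actually used are reported in the results section), and cv2.cv.CV_HOUGH_GRADIENT is the OpenCV 2 constant for the gradient-based circle Hough transform.

import cv2
import numpy as np

def detect_circles(frame_bin):
    # frame_bin: single-channel image (e.g. thresholded or greyscale).
    circles = cv2.HoughCircles(frame_bin, cv2.cv.CV_HOUGH_GRADIENT, 2, 10,
                               param1=100, param2=40,
                               minRadius=10, maxRadius=200)   # placeholder radius bounds
    if circles is None:
        return np.empty((0, 3), dtype=int)
    # Each row is (x_centre, y_centre, radius) in pixels.
    return np.round(circles[0, :]).astype(int)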

4.3.3 Detection by background subtraction and Blob extraction

For the case where the object of interest is not spherical, detection will be done by blob extraction, and the centroid of the blob will be tracked. For this to happen we need a motion detection algorithm, which will be implemented using background subtraction. The basic background subtraction method, where the first and second frames determine the initial background model, will be used; a moving object in the scene will then be identified and tracked. OpenCV already has the BackgroundSubtractorMOG class, which will be used together with blob extraction on the resulting foreground mask.
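A minimal sketch of this path is shown below, assuming OpenCV 2.4's cv2.BackgroundSubtractorMOG is available. The blob extraction step is approximated here with a contour search on the foreground mask, since the exact blob class used by the VTED is not reproduced.

import cv2

subtractor = cv2.BackgroundSubtractorMOG()   # Gaussian mixture background model (OpenCV 2.4)

def detect_moving_blob(frame):
    # Update the background model and obtain the foreground mask for this frame.
    fg_mask = subtractor.apply(frame)
    # Approximate blob extraction: take the largest foreground contour.
    contours, _ = cv2.findContours(fg_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    blob = max(contours, key=cv2.contourArea)
    m = cv2.moments(blob)
    if m["m00"] == 0:
        return None
    # Centroid of the moving blob, passed on to the trajectory plotting stage.
    return (int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"]))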


The method for object detection is summarised in Figure 14, which is an expansion of the object detection block in Figure 9 and a build-up from Figure 13.

Figure 14 A build up from the Pre-processing flow to object detection

The next section describes the class that plots the trajectory paths.


4.4 Plotting the trajectory paths

From the object detection methods the coordinates of the centroids are returned; these coordinates can then be plotted using the Matplotlib library [32] that comes with the scientific Python stack. Matplotlib can produce all forms of plots, hence its use in this section. Since we need to determine the trajectories of projectiles, there is a need for a ground truth against which the calculated trajectory can be compared; we define the ground truth according to the ideal projectile motion equations defined in [26] in the literature review and then produce ideal plots using Matplotlib. The drawpath class is initialised by defining the ground-truth path and the initial points of the actual path to be determined. We also define a function called addlist, which adds points to the current list and updates the trajectory. The third trajectory to be drawn is the predicted trajectory, whose points are generated immediately after the launch. Hence there are three trajectories in the final diagram: the ground truth, the real-time trajectory from the experimental data points, and the predicted trajectory based on the prediction algorithm to be implemented. The trajectory calculations also invoke vector calculations at each instant, which are stored in a file and compared to the expected ground-truth parameters, so that results can be obtained and compared.
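A minimal sketch of the three-trajectory plot is given below. The function names are illustrative only (they are not the drawpath/addlist bodies from the VTED code), and the ground truth follows the ideal constant-gravity equations from [26].

import numpy as np
import matplotlib.pyplot as plt

def ideal_trajectory(v0, theta_deg, g=9.81, n=100):
    # Ground truth: x = v0*cos(theta)*t, y = v0*sin(theta)*t - g*t^2/2.
    theta = np.radians(theta_deg)
    t = np.linspace(0.0, 2.0 * v0 * np.sin(theta) / g, n)
    return v0 * np.cos(theta) * t, v0 * np.sin(theta) * t - 0.5 * g * t ** 2

def plot_trajectories(measured, predicted, v0, theta_deg):
    # measured and predicted are assumed to be lists of (x, y) centroid points.
    gx, gy = ideal_trajectory(v0, theta_deg)
    plt.plot(gx, gy, label="ground truth")
    plt.plot(*zip(*measured), marker="o", linestyle="", label="real-time points")
    plt.plot(*zip(*predicted), linestyle="--", label="predicted")
    plt.xlabel("x")
    plt.ylabel("y")
    plt.legend()
    plt.show()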


4.5 Trajectory Parameters Estimation and Prediction stage

There are many parameters that need to be calculated in order to make estimations and predictions. Since we are going to use the centroid points returned from the object detection and tracking algorithms to make the calculations, we define here some of those parameters and how we will use them (a minimal sketch of these definitions follows the list):

1. A class that defines a point, returning a named tuple for the point's x-y coordinates in the video frame.
2. The vector between two consecutive points, defined in the prediction class.
3. Delta increments in the x and y directions of the trajectory, also defined in the prediction class.
4. The instantaneous velocity vector, also defined in the prediction class.
5. The acceleration due to gravity, defined in the predictor class and used in calculating the ground-truth path.
6. The projected path, also defined in the prediction class.
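The following sketch gives one possible shape for these definitions; the names and bodies are indicative only and are not taken verbatim from the VTED prediction class.

from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])   # item 1: a point in the video frame

def displacement(p1, p2):
    # item 2: vector between two consecutive centroid points
    return Point(p2.x - p1.x, p2.y - p1.y)

def deltas(points):
    # item 3: delta increments in x and y along the trajectory
    return [displacement(a, b) for a, b in zip(points, points[1:])]

def velocity(p1, p2, dt):
    # item 4: instantaneous velocity between two frames captured dt seconds apart
    return Point((p2.x - p1.x) / dt, (p2.y - p1.y) / dt)

G = 9.81   # item 5: acceleration due to gravity, used for the ground-truth path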

To calculate the projectile motion we will use both methods described in [26] and [27]. With the first method in [26] we define the vector in equation 24, which we manipulate to calculate all the parameters of the projectile. We will also minimise the cost function C(n) = \sum_{i=1}^{n} \beta(n,i)\,|e(i)|^2 using methods already described in [27] and [28].

Both of these methods will be used for trajectory calculations; performance metrics will be tested for both and the results analysed and compared. We will also develop a fusion of the two methods with modified parameters for improved performance, and this will be the final implementation for the VTED.

The fusion of the adaptive LMS filter with equation 25 has not, to the knowledge of the author, been implemented anywhere. The Visual Trajectory Estimation Software (VTES) developed by Retief [36] used a polynomial regression model, and its author identified robustness problems with that approach, hence the pursuit of this approach for the VTED. This approach was also chosen because it can adapt to other environments where scene parameters change.

In this new approach we identify a set of initial input points and use those points to update the weights of the adaptive LMS filter, after which predictions are made based on those weights. Equation 26 is used as the ground-truth calculation, and the initial output points used to update the filter are computed from it. The step size 𝜇 is chosen based on a learning-curve analysis of initial experiments and their convergence attributes.
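To make the approach concrete, a minimal LMS update of the kind described is sketched below. The tap count, the step size 𝜇 and the way the initial points are formed into input/desired pairs are assumptions for illustration; the actual VTED implementation tunes these from the learning-curve analysis.

import numpy as np

def lms_train(x, d, n_taps=60, mu=0.01):
    # x: input samples (e.g. the initial trajectory points); d: desired samples
    # (e.g. the corresponding ground-truth values from equation 26).
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    w = np.zeros(n_taps)
    for n in range(n_taps, len(x)):
        u = x[n - n_taps:n][::-1]        # most recent n_taps inputs
        e = d[n] - np.dot(w, u)          # prediction error
        w = w + mu * e * u               # LMS weight update
    return w

def lms_predict(w, history):
    # One-step prediction from the latest samples using the trained weights.
    u = np.asarray(history[-len(w):], dtype=float)[::-1]
    return np.dot(w, u)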

4.6 Performance Testing

The performance testing will be done based on the methods described in section 2.5 of the literature review; tests will be carried out and the results analysed. Different parameters will be changed to determine their effect on performance.

4.7 Section remarks

In this section of the report we have outlined the underlying design of the VTED. We have presented the design from the video capture stage onwards and how we plan to proceed with the VTED development. We have identified which image pre-processing techniques to use and how they relate to the object detection algorithms that follow. We have also presented how these algorithms help in trajectory prediction and estimation; this VTED design framework will be used and refined if necessary, and any refinements made will be noted in the relevant sections. Finally, the VTED defines a new method for projectile trajectory estimation, which will be justified with the results obtained.


5. Experimental design

In this section of the report the experiments are defined. The purpose of the experiments is to test the algorithms developed for the VTED as well as to test the whole system; after experimentation the system will be refined to meet the performance criteria. We will use various metrics to test performance and adjust parameters to meet the ideal system requirements.

5.1 Objectives of the experiment

The objectives of this experiment are to:

- Establish communication with the PlayStation Eye on both Windows and Linux platforms
- Test the VTED algorithms against real-time trajectories
- Compare VTED results with the theoretical expectations
- Improve algorithms where necessary and redesign

5.2 Scope of the experiment

The following limitations are inherent in the experimentation:

- The experiment is performed under controlled laboratory conditions
- The objects to be tracked are coloured balls, and the balls are fairly light
- Other objects will also be investigated to ensure the robustness and versatility of the system
- The hardware used is limited to a low-performance PC and the PlayStation Eye


5.3 Experimental setup and apparatus

The experimental setup consists of the test launcher as seen in Figure 15:

Figure 15 Experimental Setup Launcher

1. The lever support
2. The ball to be launched, which is supported by a tilt support below it
3. The test floor, where the lever is set up
4. The lever that is going to launch the ball
5. The known mass that will be applied to the lever to launch the ball

The camera is set up so that the launcher is in its field of view, allowing ball trajectories to be observed. Figure 16 is a diagram of how the camera is set up on the test rig:


Figure 16 Experimental Setup Camera view

1. Computer screen to display readings
2. Computer processing unit to capture and process real-time data
3. PlayStation Eye camera to capture trajectories
4. The test launcher


The specifications of the computing hardware are given in Figure 17 below:

Figure 17 Computing Hardware specifications

Other tools used in carrying out this experimentation are OpenCV, the scientific Python stack, C++ tools and the Windows driver development tools. The Python version used was Python 2.7 with the latest SciPy stack at the time, and the latest OpenCV version 2 release was used. OpenCV 2 was chosen because of its OpenCL implementations for speeding up computations; this version already has most of the algorithms used in this project implemented for OpenCL.


5.4 Experimental methodology

- Connect all components according to the apparatus setup outlined in the section above
- Test the camera on Linux, install the developed drivers on Windows and test capture on both platforms
- Run the VTED script and make sure it is correctly set up; the script will ensure the FOV is set correctly
- Run various tests using the launcher and the VTED, ensuring the same parameters are kept while running similar tests
- Establish the ground truth throughout all tests
- Take the VTED plots and compare the results
- Evaluate the real-time plots against the theoretical plots
- Run velocity, final landing and other prediction tests and evaluate them against true values
- Refine the algorithm based on the tests using an adaptive approach
- Test the adaptive algorithm against a new data set of launches
- Analyse the results based on those tests and draw conclusions
- Test the completed VTED software on Windows to make sure it is cross-platform
- Draw conclusions from the results and suggest future improvements

It might be necessary to change the order of the above steps based on the results gathered; however, all steps must be completed to ensure robustness of the developed VTED. The main goal is to make the system adaptive, hence the inclusion of the last four steps.

5.5 Section remarks

In this section we have established the experimental methodology through which the VTED algorithms will be tested, and the apparatus setup within which the experiments will be carried out and analysed. The results of this experimentation are presented in the next section.


6. Presentation of Results

In this section the findings from the experimentation are gathered and presented. The section begins with the video capture module and the communication established between the PC and the PlayStation Eye. The next part presents the results from the object detection module, which allows us to delve into tracking using features. The trajectory path estimation and prediction results then follow, after which an analysis of the gathered data is made.

6.1 Camera communication and video capture

Figure 18 shows what happens when the PlayStation Eye is plugged into the PC without the drivers; since no drivers are installed, the message shown is displayed.

Figure 18 No drivers Caption


We then go to Device Manager > Other devices and see the USB camera, as shown in Figure 19.

Figure 19 Device manager before installation

We right-click to update the drivers and browse to the driver folder that we developed, after which the result is as shown in Figure 20.

Figure 20 Device Manager after installation

The PS3 Eye camera is now shown under imaging devices, confirming that it is working. However, developing the VTED under Windows is not the aim of this project, even though the code can easily be ported; the purpose of the above section is to ease development with the PlayStation Eye for future computer vision applications on the Windows platform. On Linux there is already a package called Video4Linux, incorporated in the latest releases; most programs, such as the Cheese photo booth, use this API to implement video streams and captures, so only a few commands were needed to update the already existing API. The working captures and streams were first tested with Cheese before deploying OpenCV. The screenshot in Figure 21 is a capture from Cheese confirming an established camera communication; this confirms that we can proceed to OpenCV and implement the methods we wish to deploy.


Figure 21 Cheese Photobooth capture

The next section presents the object detection results. In order to track object motion, certain features need to be tracked; we track motion based on object geometry and colour.

6.2 Pre-processing and Object Detection

In order to do the computations, some image pre-processing needs to be done. The frame is captured and blurred using a Gaussian blur, after which it is converted to the HSV colour space. The lower and upper bounds of the colour of interest must also be found. These ranges for different colours were found using a function defined in the VTED software; Table 4 lists the colour boundaries found, which allow the VTED to track specific objects by colour.

Colour            Lower bound        Upper bound
Pink ball         [2, 89, 179]       [2, 133, 255]
Orange ball       [14, 156, 179]     [18, 234, 255]
Light blue ball   [85, 129, 106]     [103, 193, 198]
Yellow ball       [26, 77, 179]      [32, 155, 255]

Table 4 Table of HSV boundaries

An example of a ball detected using the above values is shown in Figure 22; in this case the pink ball was detected.


6.2.1 Detection By colour

Figure 22 Detecting Pink ball

We then use OpenCV's rectangle drawing function to draw the red box around our area of interest.

The method can also detect objects of other geometries given their colour identity; the non-spherical object in Figure 23 was detected and its centroid calculated. While applying the VTED methods, it is this centroid point that is tracked.


Figure 23 Detection of a Red Pen

6.2.2 Detection By geometrical Features

The Hough transform was also used to detect spherical objects of interest by their geometry. The OpenCV function cv2.HoughCircles() takes parameters that relate to the spherical object to be detected in a frame; the method uses the gradient of edges to determine the possible circles in an image, after which the circle centre and radius are found. In our implementation, the following call was used:

circles = cv2.HoughCircles(frame_bin, cv2.cv.CV_HOUGH_GRADIENT, 2, 10, param1=100, param2=40, minRadius=50, maxRadius=200)

These values were chosen based on the illumination of the scene and the camera FOV.

Figure 24 shows the results of the Hough transform detection method as attempted by the VTED.


Figure 24 Detection of Pink ball from geometrical features

The Hough transform method clearly returns many points, and it is quite difficult to identify which one is the centroid. Figure 25 shows the points as plotted in Matplotlib for a stationary ball held in the above position.

Figure 25 Detected Centre points of pink ball


The object detection must also be able to detect the ball as it moves in projectile motion. For very fast projectiles this is hard to do; Figure 26 shows the real-time ball identification during the launch of a single ball.

Figure 26 Detection of Blue Ball as it enters Camera FOV


6.2.3 Detection by Background subtraction methods and Blob Extraction

Figure 27 Object Detection By background subtraction and Blob extraction


Figure 28 Background subtraction applied to a static scene with a stationary ball

Figure 27 shows the result of object detection when applying the background subtraction methods and extracting the blob as the ball enters the scene, while Figure 28 is a sample of the first frames after applying background subtraction to the static scene with the stationary ball. In the next section we establish the coordinate system for the VTED; the results presented there show how we move from the pixel coordinate system to real-world x-y plots.


6.2.4 Object detection performance

Method                                          Time taken (seconds)
Colour detection                                0.002
Detection by geometric features                 0.0009
Detection by blob and background subtraction    2.53

Table 5 Time taken to detect object of interest

Table 5 shows the time taken to detect the objects of interest using the different methods developed for the VTED.


6.3 Establishing the Coordinate system

Before real-time plots and predictions can be made, the coordinate system needs to be established. The relationship between the camera pixel coordinate system and the real-world coordinate system is shown in Figure 29 [37].

Figure 29 Coordinate system

From the PlayStation Eye specifications in Table 3 we have established that the FOV is 75 degrees; this value is used in the coordinate transformation calculation. Figure 30 shows how the linear test was performed. The linear motion test was performed so that the coordinate system in the camera view could be adjusted; after this adjustment, the transformation from camera to real coordinates can be made. Figure 31 shows the resulting centre coordinates plotted in camera pixel coordinates. Clearly, 350 should be subtracted from the camera pixel vertical coordinate in order to set the zero for the vertical axis.


Figure 30 Testing the ball motion for Coordinate transform along x-axis

Figure 31 Results from moving the ball along the camera x-axis on the test rig


Figure 32 Moving ball along ruler on the x axis along test rig

Figure 32 shows the corrected coordinate system, where the offset has been zeroed out in the code. The real-world coordinates will be calculated using the pinhole camera model, which utilises the camera FOV value already mentioned in this section.
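A hedged sketch of the pixel-to-real-world conversion is given below. It assumes a simple pinhole model where the focal length in pixels is derived from the 75-degree horizontal FOV, a 640x480 frame (an assumed PlayStation Eye resolution), and an approximately constant distance from the camera to the plane of motion; the exact conversion used by the VTED may differ.

import math

FOV_DEG = 75.0        # horizontal field of view from Table 3
FRAME_WIDTH = 640     # assumed horizontal resolution of the PlayStation Eye
FRAME_HEIGHT = 480    # assumed vertical resolution

# Pinhole focal length in pixels: f = (W/2) / tan(FOV/2).
F_PIX = (FRAME_WIDTH / 2.0) / math.tan(math.radians(FOV_DEG / 2.0))

def pixel_to_world(u, v, depth, v_zero=350):
    # u, v: pixel coordinates; depth: camera-to-motion-plane distance in the desired
    # output units; v_zero: the vertical pixel offset noted above.
    x = (u - FRAME_WIDTH / 2.0) * depth / F_PIX
    y = -((v - v_zero) * depth / F_PIX)   # image v grows downwards, world y grows upwards
    return x, y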


6.4 Predicting Spherical objects Trajectories

Presented in this section are the results of tracking trajectories for spherical objects. We show how varying the projectile speed as well as the launch angle affects the projectile trajectory predictions, implemented using the adaptive-filter fusion described in section 4.5. Figure 33 is a sample of frames when the test ball was a blue stress ball.

Figure 33 Sample of Frames during launch

6.4.1 Effects of varying speeds on projectile predictions

Projectiles were launched at different speeds and predictions were made; different speeds have an effect on the predicted values. Processing each frame on the test rig takes between 0.05 and 0.09 seconds, so the frame processing speed already places a restriction on projectile prediction accuracy. The following plots display the predicted trajectory, the ground truth (the expected ideal trajectory) and the real-time trajectory obtained from the experimental data points.


The trajectories shown in Figure 34, Figure 35 and Figure 36 were all launched at 45 degrees with different initial launch speeds.

Figure 34 Trajectories for high speed launch


Figure 35 Projectile Trajectories for reduced Speeds launch


Figure 36 Projectile Trajectories for low speed launch

6.4.2 Effects of varying launch angles on projectiles

This section presents the effects of varying the launch angle while keeping the launch speed the same; we present the graphs and observe the differences between the predicted trajectories, the experimental trajectories and the ground-truth trajectories.


Figure 37 Trajectories for Projectile Launch angle 20 degrees


Figure 38 Trajectories for Projectile Launch angle 40 degrees


Figure 39 Trajectories for Projectile Launch angle 60 degrees

6.5 Predicting Non spherical objects trajectories

The VTED must also be able to handle non-spherical objects; an example using a simple pen is presented in this section.


Figure 40 Pencil Launch sample frame sequences


Figure 41 Trajectory paths for a red pen

6.6 Convergence of the adaptive filter

The adaptive filter of choice was the LMS filter, for reasons already mentioned in the literature review. Parameters were chosen for the LMS filter, similar parameter sets were compared and the learning curves analysed. The learning curves are presented in this section: Figure 42 to Figure 45 show the learning curves with 60 to 200 tap estimates.


Figure 42 Learning Curve for 5000 iterations with 60 tap estimates


Figure 43 learning curve with 5000 iterations and 120 tap estimates


Figure 44 Learning curve for 5000 iterations with 180 tap estimates


Figure 45 Learning curve with 5000 iterations and 200 tap estimates


7. Discussions and Analysis

This section of the report discusses the results obtained in the previous sections. We have seen how the camera operates and have obtained the results for the VTED; the whole pipeline has been analysed, but the details, relevance and significance of the results presented in section 6 are discussed here.

7.1 Camera communication and video capture

Camera communication was established on both Linux and Windows platforms. The performance on Linux was better than on Windows, which is why the Linux platform was chosen for the experimentation. The PlayStation Eye also lacks adequate documentation, and the camera had to be reverse-engineered to work on Windows.

The video capture results on Linux were usable. Using OpenCV it was possible to adjust the camera FPS, which is very important for computer vision: reducing the number of frames to analyse reduces computational complexity and keeps the focus on significant results. Looping through the capture took 0.09 seconds per frame, which is too fast for tracking slow motion, so the ability to reduce the FPS value is of paramount importance.

Camera communication is effective and fast enough on Linux, and we have also established the high reliability offered by the Video4Linux layer: the camera communicates properly and the connection never breaks. There is, however, an issue with setting up OpenCV video capture on Windows; the capture is unstable due to the unreliability of third-party dynamic link libraries.

7.2 Pre-processing and Object Detection

7.2.1 Pre-Processing

Various pre-processing techniques were employed and results were presented in section 6.2. The HSV values were obtained for colour-based detection; the scenes are prone to illumination variance and these values may change at any time, resulting in ineffective object detection. Converting the images to greyscale was efficient, since it reduced the cost of the computations that follow. The background subtraction method also proved more efficient in a static scene, but this might not hold in other scenes where there is constant motion, such as wind blowing through trees; there might be a lot of noise in such scenes, although the results obtained are satisfactory for indoor scenes. Background subtraction also requires that the object of interest not be present when the background model is initialised, otherwise the object might be absorbed into the background and this may yield wrong results. Figure 28 shows the results obtained when applying background subtraction with the object of interest already in the scene; it is apparent from the figure that the object is not detected.

7.2.2 Object detection

Various colours were detected using the colour detection methods; Table 4 shows the HSV threshold levels applied for objects of various colours. The colour detection algorithm was the most robust, since the colours hardly change in the scene. As can be seen in Figure 23, it is also easy to detect non-spherical objects by colour, which is one more reason why this method was preferred for the VTED.

Using geometric features for object detection was found not to be robust enough. Although the computation time was lower than for the other methods, Figure 24 and Figure 25 reveal that the Hough transform returned more points than needed; this is probably caused by inadequate pre-processing, since the Gaussian blur kernel was defined with 𝜎 = 3 and this might not be the most effective value to use. This method needs to be used where lighting is controlled, and more experimentation would be needed to make it work properly.

Object detection by background subtraction was not really efficient: the object of interest must not be in the camera FOV when the method is invoked, which makes it one of the most unreliable methods implemented on the VTED. It can also be seen from Table 5 that it is the slowest of the methods implemented here.

The overall performance of the object detection methods is clear from the results: the colour detection method performs better than the other two implemented methods and is therefore the preferred method for the VTED, and the experimentation for trajectory predictions was done using it. Colour detection was found to be robust enough, and defining the colour to track was easy, since a function allows the HSV boundary values to be obtained from a region of interest.


7.3 Establishment of the coordinate system

Since a single-view camera was used for the VTED, the coordinate system had to be defined accurately. The pixel axes had an offset on the test rig, so it was necessary to redefine the zero crossings of the coordinate system; this feature will be important when the VTED is used in other scenes. Full conversion into real-world coordinates was not pursued in the experimentation, so that the VTED is not restricted and can be used in versatile scenes; the VTED is more concerned with the algorithm design approach in order to improve performance. Further details on conversion to real-world coordinates can be found in [36].

7.4 Predicting Spherical objects Trajectories

In section 2.5 of the literature review we established the performance metrics for the VTED; we use these metrics in this section and the next to analyse the trajectories and establish performance. However, instead of using actual frames we use the trajectory plots from section 6.4 and analyse the deviation from the ground truth. Figure 34 to Figure 36 show the effect of varying speeds from high to low on the predictions made. Figure 34 shows a predicted trajectory whose path deviates minimally from the ground truth; from the three figures it can be seen that lowering the launch speed degrades prediction performance. The VTED is meant to be fast enough to run at least 300 iterations of video capture within a very short period of time, because of the convergence behaviour of the adaptive filter used, whose optimal performance curve can be seen in Figure 42. The slow motion of the launched object means the VTED returns many similar points over many iterations, so the adaptive filter returns similar weights and this results in a much greater error. The effect of changing the launch angle was also investigated; Figure 37 to Figure 39 show the results. There is no significant change in the error since the speed is kept the same; the adaptive filter works well in this regard, since the computations return different values each time and the calculated weights allow the filter to converge in a similar way.


7.5 Predicting Non spherical objects trajectories

Figure 41 shows the results of predicting the projectile trajectory of a red pen, as seen in the frame sequence of Figure 40. The VTED constantly performs centroid calculations for the pen as it spins in the air in projectile motion; the graph in Figure 41 shows how the experimental values deviate from the ground truth and from the path predicted from the initial points. It is worth noting that the deviation from the ground truth is greater than for spherical objects at similar speeds.

The error is caused by the fact that the tracked central point keeps changing in a nonlinear fashion. The number of tap estimates is therefore not enough to handle this kind of computation, even though it would be enough for the centroid estimates of spherical objects. The trajectory estimate is, however, not very far off for basic applications.

7.6 Adaptive filter learning convergence

The convergence of the adaptive filter is very important in the prediction stage: fast convergence means the cost function is minimised as soon as possible and the error is reduced within a few iterations, hence good performance. Before implementing the VTED core algorithms, a convergence test was done so that the parameters could be adjusted for optimal performance; the correct number of taps and step size must be chosen. Figure 42 shows the learning curves for different step sizes with 60 tap estimates: the step size of 0.01 converges fastest and the error is minimised within a few iterations. Figure 44 shows what happens when the number of tap estimates increases: the convergence time increases. The worst case is seen with 200 taps in Figure 45, where the step size of 0.01 diverges while the other step sizes make learning too slow for applications of the VTED. It was therefore decided to use a step size of 0.01 with 60 tap estimates or fewer in order to achieve better performance.

The convergence and number of iterations were further optimised using NumPy array optimisation techniques; these were implemented and the performance was satisfactory.


7.7 General performance of the VTED

In section 2.3 of the literature review we established that visual object trackers must be adaptive, robust and able to handle real-time processing efficiently; the VTED must also be weighed against these performance evaluation criteria.

In terms of robustness, the VTED was found to be robust enough: it was able to track objects of different colours and of different shapes with minimal errors, and the adaptive filter helps in this regard. The VTED is also adaptive, since it uses an adaptive filter which updates its weights to make predictions; it can in principle be used for other defined trajectories, which makes it better than similar systems.

The time complexity of the VTED algorithms is greatly reduced by the NumPy implementations. The algorithms used are optimised with OpenCL through OpenCV, so the speeds are satisfactory for real-time processing. The accuracy of the VTED is fair enough for basic applications, as can be seen in all the trajectory plots in this report.


8. Conclusions

We have established the Visual Trajectory Estimation Device's basic framework. Object detection methods have been discussed, preceded by pre-processing techniques for robust object detection. We have also discussed what happens after the object detection methods are applied, namely object tracking, after which trajectory paths can be determined from the set of points associated with the object of interest.

The development methods of this project address the shortcomings of a similar project called the VTES [36]. The author of the VTES had suggested better algorithms for object detection as well as for making trajectory predictions as future work. Such methods have been implemented, and we observed how the object detection methods pursued in this project performed better than those of the VTES, since we also implemented one of the VTES object detection methods based on background subtraction and blob extraction.

A final remark in this section is that it is very important to apply the best pre-processing techniques before applying any computer vision algorithms. The colour detection and colour-based tracking methods outperformed the other object detection methods in this project because better pre-processing was done. Tracking by colour is therefore the most robust and quite reliable given well-defined backgrounds, and it can be chosen as an improvement to the already established VTES, since the VTES was designed in such a way that it can be updated and made to perform better.

The VTES also needed an improvement to its prediction algorithm, hence the focus in this project on algorithmic development rather than software structure; it can be concluded that the LMS filter used in the pursuit of the VTED could act as an improvement to the VTES, as was needed in Retief's work [36].


9. Recommendations

Despite the good performance of the developed VTED, we have established a few shortcomings which will need to be addressed in the future. These shortcomings are a result of a few things the system did not address.

A single-view camera is not enough for this sort of application, since using a single-view camera requires extensive calibration, which was not really within the scope of the VTED development. In future, the use of a single-view camera would require calibration methods that are robust and adaptive; such methods would allow trajectories to be tracked with less error.

Multiple cameras should be used next time in order to obtain a better analysis of the trajectories. Multiple cameras would make it easy to map 3D trajectories, since multiple views would be available and photogrammetry would help in the reconstruction of 3D trajectories; in this way the predictions would be more accurate and robust.

Version 2 of OpenCV for Python does not yet have CUDA implementations of most of the algorithms needed for object tracking; implementing such algorithms on the GPU could increase performance greatly, especially in complex scenes where a lot of pre-processing computation is necessary. The OpenCV OpenCL optimisation utilised in this project may not perform well enough in other scenes.


10. References

[1] J. Ren et al., "A general framework for 3D soccer ball estimation and tracking," in Proc. International Conference on Image Processing (ICIP'04), 2004.
[2] H. Gandhi et al., "Real-time tracking of game assets in American football for automated camera selection and motion capture," Procedia Engineering, vol. 2, no. 2, pp. 2667-2673, 2010.
[3] H.-T. Chen et al., "A trajectory-based ball tracking framework with enrichment for broadcast baseball videos," in International Computer Symposium, Taiwan, 2006.
[4] F. Yan, W. Christmas and J. Kittler, "Layered data association using graph-theoretic formulation with application to tennis ball tracking in monocular sequences," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 10, pp. 1814-1830, 2008.
[5] SciPy, "Scipy lectures: image processing," [Online]. Available: https://scipy-lectures.github.io/advanced/image_processing/. [Accessed 21 07 2015].
[6] M. Chung, "Wisconsin university statistics website," [Online]. Available: http://www.stat.wisc.edu/~mchung/teaching/MIA/reading/diffusion.gaussian.kernel.pdf.pdf. [Accessed 2 09 2015].
[7] Microsoft, "DirectX Gaming and Graphics," [Online]. Available: https://msdn.microsoft.com/en-us/library/windows/desktop/hh706313(v=vs.85).aspx. [Accessed 21 07 2015].
[8] J. D. Cook, "Converting colors," [Online]. Available: http://www.johndcook.com/blog/2009/08/24/algorithms-convert-color-grayscale/. [Accessed 02 08 2015].
[9] GIMP, "GIMP colour tools," [Online]. Available: http://docs.gimp.org/2.6/en/gimp-tool-desaturate.html. [Accessed 02 08 2015].
[10] P. KaewTraKulPong and R. Bowden, "An improved adaptive background mixture model for real-time tracking with shadow detection," in Proc. 2nd European Workshop on Advanced Video Based Surveillance Systems, 2001.
[11] OpenCV, "OpenCV documentation," [Online]. Available: http://opencv-python-tutroals.readthedocs.org/en/latest/py_tutorials/py_imgproc/py_canny/py_canny.html. [Accessed 8 6 2015].
[12] N. D. Binh, "A robust framework for visual object tracking," in Computing and Communication Technologies (RIVF '09), Da Nang, 2009.
[13] H. Yang et al., "Recent advances and trends in visual tracking: a review," Neurocomputing, vol. 74, pp. 3823-3831, 2011.
[14] R. M. Kumar and S. K., "A survey on image feature descriptors," International Journal of Computer Science and Information Technologies, vol. 5, no. 6, pp. 7668-7673, 2014.
[15] P. V. C. Hough, "Method and means for recognizing complex patterns," United States Patent US3069654 A, December 1962.
[16] O. Djekoune, "A new modified Hough transform method for circle detection," in 5th International Joint Conference on Computational Intelligence, Vilamoura, Algarve, Portugal, 2013.
[17] X. Chen et al., "Concentric circle detection based on normalized distance variance and the straight line Hough transform," in 9th International Conference on Computer Science & Education, Vancouver, Canada, 2014.
[18] D. Scaramuzza et al., "Ball detection and predictive ball following based on a stereoscopic vision system," in Proc. 2005 IEEE International Conference on Robotics and Automation, Barcelona, Spain, 2005.
[19] R. Faragher, "Understanding the basis of the Kalman filter," IEEE Signal Processing Magazine, 2012.
[20] B. D. and D. M. Thounaojam, "A survey on moving object tracking in video," International Journal on Information Theory (IJIT), vol. 3, no. 3, pp. 31-46, 2014.
[21] Y. Wang and M. Papageorgiou, "Real-time freeway traffic state estimation based on extended Kalman filter: a general approach," Transportation Research Part B, vol. 39, pp. 141-167, 2005.
[22] F. Gustafsson, "Particle filter theory and practice with positioning applications," IEEE Aerospace and Electronic Systems Magazine, vol. 25, no. 7, 2010.
[23] A. Svensson, "YouTube," 8 10 2013. [Online]. Available: https://www.youtube.com/watch?v=aUkBa1zMKv4. [Accessed 08 07 2015].
[24] Y. Wu, J. Lim and M.-H. Yang, "Online object tracking: a benchmark," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013.
[25] Y. Wei, J. Sun, X. Tang and H.-Y. Shum, "Interactive offline tracking for color objects," in IEEE 11th International Conference on Computer Vision (ICCV), Rio de Janeiro, 2007.
[26] T.-S. Weng, M.-H. H. and D.-C. Y., "Dynamic teaching of kinematics of particles and Python," International Journal of e-Education, e-Business, e-Management and e-Learning, vol. 3, no. 4, pp. 318-321, 2013.
[27] P. Sahoo, "slideshare.net," [Online]. Available: www.slideshare.net/bratisundarnanda/project-34302050?from_action=save. [Accessed 3 8 2015].
[28] National Instruments, "LabVIEW Adaptive Filter Toolkit," [Online]. Available: http://zone.ni.com/reference/en-XX/help/372357A01/lvaftconcepts/aft_lms_algorithms/. [Accessed 01 10 2015].
[29] F. Bashir and F. Porikli, "Performance evaluation of object detection and tracking systems," Mitsubishi Electric Research Laboratories, Cambridge, 2006.
[30] K. Pulli, A. Baksheev, K. Kornyakov and V. Eruhimov, "Real-time computer vision with OpenCV," Communications of the ACM, pp. 61-69, June 2012.
[31] NumPy, [Online]. Available: http://www.numpy.org/. [Accessed 4 5 2015].
[32] SciPy, [Online]. Available: http://www.scipy.org/. [Accessed 4 5 2015].
[33] NVIDIA, "CUDA," [Online]. Available: http://www.nvidia.com/object/cuda_home_new.html. [Accessed 4 5 2015].
[34] K. Karimi, N. G. Dickson and F. Hamze, "A performance comparison of CUDA and OpenCL," arXiv.org, 2010.
[35] Engadget, "PlayStation Eye specifications," [Online]. Available: http://www.engadget.com/products/sony/playstation/eye/specs/. [Accessed 02 08 2015].
[36] F. Retief, "Visual Trajectory Estimation Software," University of Cape Town, Cape Town, 2013.
[37] IITD, [Online]. Available: http://www.cse.iitd.ac.in/~parag/projects/pctrace/camera.shtml. [Accessed 5 10 2015].


11. Appendices

11.1 Appendix A - Deriving the Projectile Equations

Assuming the only force acting on the projectile is due to gravity, then from Newton's second law of motion

\sum F_i = m a = m g, \quad \text{which implies that} \quad a = g

Then

\frac{d\vec{v}}{dt} = \vec{a} = g\,\hat{j}

We also have that

\vec{v} = v_0 \cos\theta\,\hat{i} + v_0 \sin\theta\,\hat{j}

Integrating once,

\int d\vec{v} = \int g\,\hat{j}\,dt

gives the velocity vector

\begin{bmatrix} V_x \\ V_y \end{bmatrix} = V_0 \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix} - \begin{bmatrix} 0 \\ g t \end{bmatrix}

This vector can be integrated further to obtain the position of the object of interest, which is the form used in this project.

