A New Vehicle Detection Method Based on the Gaussian Mixture Model along with Moment Velocity Estimation Using Optical Flow

Mohammad Ali Alavianmehr (1), Ali Zahmatkesh (2), Amir Sodagaran (3)

(1) [email protected], Deputy of Traffic & Transportation, Shiraz Municipality, Shiraz Traffic Control Center
(2) Vice Chancellor for Transportation of Shiraz Municipality
(3) [email protected], Deputy of Traffic & Transportation, Shiraz Municipality, Shiraz Traffic Control Center
Abstract: This paper presents a complete system for analyzing vehicle traffic behavior in the context of real-time traffic video surveillance applications. In the first phase, receiving images from a video surveillance camera, we apply a Gaussian mixture model to each frame to obtain a precise background image. This process is repeated until an accurate background image is obtained; this phase is called the training phase. An initial training step is also performed to estimate the geometrical structure of the road. In the second phase, the received images are analyzed against the trained background images to extract the vehicles (moving objects). In the third phase, a green block surrounds each vehicle so that the system can count them. Either inaccurate training of the background images or the shadows of moving vehicles may cause problems in detecting vehicles in motion in the second phase. To solve these problems, we merge blocks that overlap other blocks so as to compute the traffic volume and density accurately. In the fourth phase, optical flow is used to compute the moment velocity of each vehicle based on improved Lucas-Kanade and Horn-Schunck methods. Finally, a traffic report is produced by post-processing. Our approach is demonstrated through experimental results to be more adaptive, accurate, and robust than some existing similar pixel-modeling approaches. Results show that the proposed method obtains better results for moving-object detection than previous counterpart methods, can be easily integrated into current automated video surveillance systems, and also reduces the running time.

Keywords: Image Processing, Intelligent Transportation Systems, Gaussian Mixture Model, Optical Flow.
1. INTRODUCTION

Traffic monitoring is an important tool in the development of Intelligent Transport Systems (ITS), involving the detection and categorization of road vehicles. The application of image processing and computer vision techniques to the analysis of video sequences of traffic flow offers noticeable improvements over present methods of traffic data collection and road traffic monitoring. Other methods, such as inductive loops and sonar or microwave detectors, suffer from important drawbacks: they are costly to install and maintain, and they are unable to identify slow or stationary vehicles. Video sensors offer a relatively low installation cost with little traffic disruption during maintenance. Moreover, they offer wide-area monitoring, allowing analysis of traffic flows and turning movements (important to junction design), speed measurement, multiple-point vehicle counts, vehicle classification, and highway state assessment (e.g., congestion or incident detection) [1]. Image processing also provides extensive applications in the related field of autonomous vehicle guidance, mainly for recognizing the vehicle's relative location in the lane and for obstacle detection. Autonomous vehicle guidance involves solving several problems at different abstraction levels. The vision system can assist in the precise localization of the vehicle with regard to its environment, which is made up of the appropriate lane and obstacles or other moving vehicles. Both lane and obstacle detection are based on estimation procedures for detecting the borders of the lane and determining the path of the vehicle. The estimation is often done by matching
the observations (images) to a presumed road and/or vehicle model. Video systems for either traffic monitoring or autonomous vehicle guidance normally involve two important perception tasks: (a) estimation of road geometry and (b) vehicle and obstacle detection. Road traffic monitoring aims to recognize and analyze traffic figures, including the presence and number of vehicles, speed distribution data, turning traffic flows at intersections, queue lengths, space and time occupancy rates, etc. Therefore, for traffic monitoring it is vital to identify the lane of the road and then sense and determine the presence and/or motion parameters of a vehicle. Also, in autonomous vehicle guidance, knowledge about road geometry allows a vehicle to follow its route, and the detection of road obstacles becomes an essential task for avoiding other vehicles present on the road. In road traffic monitoring, the video acquisition cameras are stationary. They are located on posts above the ground to get an optimal view of the road and the passing vehicles. In automatic vehicle guidance, the cameras move with the vehicle. In these applications it is necessary to analyze the dynamic change of the environment and its contents, as well as the dynamic change of the camera itself. Accordingly, object detection from a stationary camera is easier because it involves fewer estimation procedures. Initial approaches in this field include spatial, temporal, and spatio-temporal analysis of video sequences. More advanced and effective approaches consider object modeling and tracking using state-space estimation procedures for matching the model to the observations and for estimating the next state of the object. The most common techniques, i.e., analysis of the optical flow field and processing of stereo images, involve processing two or more images. With optical-flow-field analysis, multiple images recorded at different times are analyzed [2]; optical-flow-based techniques detect obstacles indirectly by analyzing the velocity field. Stereo image techniques determine the correspondences between pixels in the different images. Object detection approaches can be classified according to the method used to isolate the object from the background on a single frame or a sequence of frames.
1.1 Thresholding

This is one of the easiest, but least effective, techniques, and it operates on still images. It is based on the fact that vehicles are compact objects whose intensity differs from that of their background. Therefore, by thresholding intensities in small regions we can separate a vehicle from the background. This approach depends heavily on the threshold applied, which must be chosen appropriately for a specific vehicle and its background. Adaptive thresholding can be utilized to account for lighting changes, but it cannot avoid the false detection of shadows or the missed detection of parts of the vehicle with intensities similar to its environment [3]. To assist the thresholding process, binary mathematical morphology can be employed to aggregate close pixels into a unified object [4]. Moreover, gray-scale morphological operators have been proposed for object detection and identification that are insensitive to lighting variation [5].
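As a rough illustration of this idea (a minimal sketch, not part of the original system; the file name and all parameter values are assumptions), the following Python/OpenCV fragment applies adaptive thresholding followed by binary morphology to aggregate nearby pixels into candidate vehicle blobs:

import cv2

# Load a single traffic frame in grayscale (path is hypothetical).
frame = cv2.imread("traffic_frame.png", cv2.IMREAD_GRAYSCALE)

# Adaptive thresholding: the threshold is computed per 25x25 neighborhood,
# which partially compensates for lighting changes across the road surface.
mask = cv2.adaptiveThreshold(frame, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                             cv2.THRESH_BINARY_INV, 25, 10)

# Binary morphology (closing) aggregates close pixels into a unified object,
# as suggested in [4]; the kernel size is an assumed tuning parameter.
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
blobs = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

Note that, as stated above, such a sketch still misdetects shadows and low-contrast vehicle parts; it only illustrates the mechanics of the approach.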
1.2 Edge-Based Detection (Spatial Differentiation)

Approaches in this class are based on the edge features of objects. They can be applied to single images to detect the edge structure of even still vehicles. Morphological edge-detection schemes have been extensively applied, since they exhibit superior performance. In traffic scenes, the results of an edge detector generally highlight vehicles as complex groups of edges, while road areas yield relatively low edge content. Hence, the existence of vehicles may be detected by the edge complexity within the road area, which can be quantified through analysis of the histogram. Alternatively, the edges can be grouped together to form the vehicle's boundary. Towards this direction, the algorithm must determine relevant features (often line segments) and define a grouping strategy that allows the identification of feature sets, each of which may correspond to an object of interest (e.g., a potential vehicle or road obstacle). Vertical edges are more likely to form dominant line segments corresponding to the vertical boundaries of the profile of a road obstacle. Moreover, a dominant line segment of a vehicle must have other line segments in its neighborhood that are detected in nearly perpendicular directions. Consequently, the detection of vehicles and/or obstacles can simply consist of finding the rectangles that enclose the dominant line segments and their
neighbors in the image plane. To improve the shape of object regions, Refs. [6, 7] use the Hough transform to extract consistent contour lines and morphological operations to restore small breaks in the detected contours. Symmetry offers an additional useful feature for relating these line segments, since vehicle rears are generally contour- and region-symmetric about a vertical central line. Edge-based vehicle detection is generally more effective than other background-removal or thresholding approaches, since the edge information remains significant even under variations of ambient lighting.
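The edge-complexity cue described above can be sketched as follows (an assumption-laden example, not this paper's implementation; file name, region coordinates, and thresholds are hypothetical): a Canny edge map is computed and the edge content inside a road region is quantified, a high edge density suggesting the presence of a vehicle.

import cv2
import numpy as np

frame = cv2.imread("traffic_frame.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input

# Canny edge detection; the two hysteresis thresholds are assumed tuning values.
edges = cv2.Canny(frame, 80, 160)

# Edge density within a (hypothetical) rectangular road region of interest.
roi = edges[200:400, 100:500]
density = np.count_nonzero(roi) / roi.size

# Road asphalt yields low edge content, so a density above an empirically
# chosen threshold indicates a probable vehicle inside the region.
vehicle_present = density > 0.05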
1.3 Space Signature

In this detection method, the objects to be detected (vehicles) are described by their features (form, dimensions, luminosity), which allow recognition within their environment. One space-signature approach utilizes logistic regression on characteristics extracted from the vehicle signature so as to detect the vehicle against its background. Alternatively, the space signatures are defined by means of the vehicle outlines projected from a certain number of positions (poses) on the image plane from a certain geometrical vehicle model. A camera model is used to project the 3D object model onto the camera coordinates at each expected position. Then, the linear edge segments in each observed image are matched to the model by evaluating the presence of outline attributes for each of the pre-established object positions (poses). Owing to the inflexible nature of template matching, a specific template must be created for each type of vehicle to be identified. This poses a problem, since there are many geometrical shapes for vehicles contained in the same vehicle class. Besides, the template mask assumes that there is little change in the intensity signature of vehicles. In practice, however, changes in ambient lighting, shadows, occlusion, and severe light reflection on the vehicle body panels generate serious variation in the spatial signatures of same-type vehicles. To cope with such problems, neural networks have been used to recall space signatures, employing their ability to interpolate among different known shapes. Despite its drawbacks, vehicle detection based on sign patterns does not require high computational effort. Moreover, it enables the system to deal with the tracking process and keep the vehicle in track by continuously sensing its sign pattern in real time [8].
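As a toy illustration of matching a stored spatial signature against the scene (hedged: the systems cited above use pose-projected 3D models and neural networks, not this simple 2D correlation; file names and the acceptance threshold are assumptions), normalized cross-correlation template matching can be sketched as:

import cv2

frame = cv2.imread("traffic_frame.png", cv2.IMREAD_GRAYSCALE)         # hypothetical scene
template = cv2.imread("car_rear_template.png", cv2.IMREAD_GRAYSCALE)  # one pose of one vehicle class

# Normalized cross-correlation is somewhat tolerant to global lighting shifts.
scores = cv2.matchTemplate(frame, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(scores)

# A separate template would be needed for every vehicle shape and pose,
# which is exactly the rigidity problem discussed above.
if max_val > 0.7:  # assumed acceptance threshold
    print("vehicle signature found at", max_loc)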
1.4 Inter-Frame Differencing

This is the most direct method for making motionless objects vanish while preserving only the traces of objects moving between two successive frames. The immediate consequence is that still or slow-moving objects are not detected. The inter-frame difference succeeds in detecting motion when temporal changes are obvious. Nevertheless, it fails when the objects in motion are not sufficiently textured and present uniform regions similar to the background. To cope with this problem, the inter-frame difference can be described within a statistical framework, often utilizing spatial Markov random fields [9]. The inter-frame difference is modeled through a two-component mixture density, whose zero-mean components correspond to the static (background) and changing (moving-object) parts of the image. Inter-frame differencing provides a crude but simple tool for estimating moving regions. This process can be complemented with background-frame differencing in order to improve the estimation accuracy [10]. With color segmentation or accurate motion estimation we can further refine the resulting mask of moving regions, by means of optical flow estimation and optimization of the displaced frame difference, to refine the segmentation of moving objects.
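A minimal sketch of inter-frame differencing (the video file name and the binarization threshold are assumptions) is:

import cv2

cap = cv2.VideoCapture("traffic.avi")  # hypothetical video source
ok, prev = cap.read()
prev = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Absolute difference between consecutive frames: stationary objects
    # vanish, while moving (and sufficiently textured) objects leave a trace.
    diff = cv2.absdiff(gray, prev)
    _, motion_mask = cv2.threshold(diff, 20, 255, cv2.THRESH_BINARY)

    prev = gray

As noted above, untextured object interiors produce holes in motion_mask, which is why the statistical refinements of [9, 10] are needed in practice.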
1.5 Time Signature

This method encodes the intensity profile of a moving vehicle as a function of time. The profile is calculated at several positions on the road as the average intensity of the pixels within a small window located at each measurement point. The analysis of the time signature recorded at these points is employed to determine the presence or absence of vehicles. The analysis of the time signal of light intensity at each point is performed by means of a model with pre-
recorded and periodically updated characteristics. Spatial correlation of time signatures makes further reinforcement of detection possible. As a matter of fact, the joint consideration of spatial and time signatures provides valuable information for both object detection and tracking. Through this consideration, the one task can benefit from the results of the other in terms of reducing the overall computational complexity and increasing the robustness of analysis [11].
1.6 Feature Aggregation and Object Tracking

These techniques operate in the feature space to either recognize an object or track characteristic points of the object. They are often used in object detection to improve the robustness and reliability of detection and to decrease false detection rates. The features are aggregated with regard to the vehicle's geometrical features; thus, this operation can be interpreted as a pattern recognition task. For feature aggregation, motion-based and model-based approaches have been used. Motion-based approaches group together visual motion consistencies over time [12]. Motion estimation is only performed at distinguishable points, such as corners, along contours of segmented objects, or within segmented regions of similar texture [13]. Model-based approaches match the representations of objects within the image sequence to 3D models or their 2D projections from different directions (poses).
1.7 Optical Flow Field

Approaches in this class exploit the fact that the appearance of a rigid object changes little during motion, while drastic changes take place in regions where the object moves into and/or out of the background. The optical flow field is estimated by mapping the gray value recorded at one time instant at an image point onto the gray value recorded at the displaced location x at time t. The optical flow field encodes the temporal displacement of observable gray-scale structures within an image sequence. It mainly conveys information about the relative displacement of pixels, but also about the spatial structure of the scene. Various approaches have been suggested for the efficient estimation of the optical flow field. In general, they can be characterized as (i) gradient-based, (ii) correlation-based, (iii) feature-based, and (iv) multigrid methods. Gradient-based techniques concentrate on pixel-by-pixel matching through the temporal gradient of the image sequence. In most cases, the intensity variations alone do not provide sufficient information to completely identify both components (magnitude and direction) of the optical flow field. Smoothness constraints make the estimation of optical flow fields easier even for areas with constant or linearly distributed intensities. Gradient-based techniques produce poor results for poorly textured images and in the presence of shocks and vibrations [14]. Under such conditions, correlation-based techniques usually derive more precise results. Correlation-based techniques search, around each pixel, for the shift that maximizes the correlation of gray-level patterns between two consecutive frames. Such procedures are rather expensive in terms of computational complexity. Endeavors to accelerate the computation at the cost of resolution often imply subsampling of the image and computation of the motion field at fewer image points. Feature-based approaches consider the organization (clustering) of pixels into crude object structures in each frame and subsequently calculate motion vectors by matching these structures in the sequence of frames. In [15], a vehicle speed measurement (VSM) method is proposed based on an improved three-frame difference and a gray-constraint optical flow algorithm; it has the features of low cost, low computation, and accurate speed measurement. In that method, the contour of moving vehicles is detected accurately by the improved frame-difference algorithm, and the vehicle contour's optical flow value, which is the speed (pixels/s) of the vehicle in the image, is computed by the proposed gray-constraint optical flow algorithm. In the image, a region of interest is drawn and only the speed of the moving target's contour in that region is estimated, which decreases the computation further. Using the corresponding ratio between the image pixels and the width of the road, the speed (km/h) of the moving target is calculated [15].
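To illustrate the sparse gradient-based case and the pixel-to-km/h conversion described in [15] (a minimal sketch only; the video source, frame rate, and metres-per-pixel calibration are assumptions, not values from [15]), pyramidal Lucas-Kanade tracking of corner features can be written as:

import cv2
import numpy as np

cap = cv2.VideoCapture("traffic.avi")  # hypothetical source
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

# Track strong corners (distinguishable points, as discussed above).
pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                              qualityLevel=0.01, minDistance=7)

ok, frame = cap.read()
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Pyramidal Lucas-Kanade estimates the displacement of each corner.
new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)

# Displacement in pixels per frame; with an assumed frame rate and an
# assumed metres-per-pixel calibration (e.g., road width / image width),
# the pixel speed converts to km/h in the spirit of [15].
fps, metres_per_pixel = 25.0, 0.05  # assumed calibration values
flow = (new_pts - pts)[status.flatten() == 1]
pix_per_s = np.linalg.norm(flow, axis=2).mean() * fps
speed_kmh = pix_per_s * metres_per_pixel * 3.6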
In [16], a new illumination-robust foreground prediction algorithm is proposed by integrating a novel color-recovering algorithm with optical flow estimation and an opacity propagation algorithm. The color-recovering algorithm, which is designed around the observation that illumination changes are usually locally smooth, is used to recover the pixel colors distorted by the illumination changes. The authors demonstrate that the color-recovering algorithm can eliminate the negative effect of illumination changes on the optical flow estimation and opacity propagation, making them robust to such changes. In [17], an abnormal behavior detection algorithm for surveillance is proposed to correctly recognize targets as being in normal or chaotic movement, and a model is developed for this purpose. The uniqueness of this algorithm is the use of a foreground detection with Gaussian mixture (FGMM) model before passing the video frames to an optical flow model using the Lucas-Kanade approach. Information on the horizontal and vertical displacements and directions associated with each pixel of the object of interest is extracted. These features are then fed to a feed-forward neural network for classification and simulation.
1.8 Gaussian Mixture Background Model

The standard approach to object detection is background subtraction (BS), which attempts to build a representation of the background and detect moving objects by comparing each new frame with this representation. A number of different BS techniques have been proposed in the literature, and popular methods include the mixture-of-Gaussians model. Basic BS techniques detect foreground objects as the difference between two consecutive video frames, operate at the pixel level, and are applicable to still backgrounds. Although the generic BS method is easy to understand and implement, the drawbacks of frame-difference BS are that it does not provide a mechanism for selecting parameters, such as the detection threshold, and it is unable to deal with multimodal distributions. One of the significant techniques able to cope with multimodal background distributions and to update the detection threshold employs Gaussian mixture models (GMMs). A Gaussian mixture model is a parametric probability density function represented as a weighted sum of Gaussian component densities. GMMs are commonly used as a parametric model of the probability distribution of continuous measurements, or as features in a biometric system, such as color-based tracking of an object in video. In many computer vision applications it is vital to detect moving objects in a sequence of video frames; to this end, background subtraction is applied, which mainly recognizes moving objects from each portion of the video frames. Background subtraction or segmentation techniques are widely used in video surveillance, target recognition, and banking applications. Using the Gaussian mixture background model, background pixels are removed from the video frames to obtain the desired results. The application of background subtraction involves several requirements: the algorithm must detect the required object robustly and must also react to various changes, such as illumination and the starting and stopping of moving objects. In [17], a new mixture model for image segmentation is presented, proposing a new way to incorporate spatial information between neighboring pixels into the Gaussian mixture model based on a Markov random field (MRF). In mixture models based on MRFs, the M-step of the expectation-maximization (EM) algorithm cannot be directly applied to the prior distribution for maximization of the log-likelihood with respect to the corresponding parameters. The adaptive Gaussian mixture background model is an excellent background model because of its good analytic form and high operational efficiency. It is more appropriate than single Gaussian models for detecting fast-moving objects in outdoor environments where the image background and illumination intensity change slowly. However, low convergence speed is its main disadvantage, especially when the illumination intensity changes suddenly: it cannot adapt to such rapid changes and often takes changing background pixels as moving objects, which confuses the foreground information and loses the moving objects. Recently, although many modified Gaussian models (Chen & He, 2007; Ma & Zhu, 2007; Chen & Zhou, 2007) [18, 19, 20] have been developed, this problem has not been solved effectively.
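OpenCV ships a GMM-based background subtractor in this spirit; the sketch below illustrates the generic adaptive GMM background model discussed above, not the edge-based model proposed later in this paper, and the source name and parameter values are assumptions:

import cv2

cap = cv2.VideoCapture("traffic.avi")  # hypothetical source

# Each pixel is modeled by a mixture of Gaussians whose weights, means, and
# variances are updated online; history and varThreshold are assumed values.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500,
                                                varThreshold=16,
                                                detectShadows=True)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # In the returned mask: 255 = foreground, 127 = detected shadow, 0 = background.
    fg_mask = subtractor.apply(frame)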
In the modified Gaussian models above [18, 19, 20], the moving-object shadows are occasionally detected in HSI (hue, saturation, intensity) or HSV (hue, saturation, value) space, which forces object detection and shadow removal to be processed in different spaces and joined together by a complicated space transform. Yinghong Li in [21] proposed a new method of detecting moving vehicles based on an edge Gaussian mixture model.
First, a Gaussian mixture model is built on the image edges, exploiting the fact that edge information is not sensitive to sudden changes in illumination intensity. Parameters such as the mean vectors and variance vectors of the pixels are obtained from the model for moving-vehicle shadow detection. Moreover, through a statistical approach, moving-vehicle shadows are eliminated based on the differences in brightness distortion between vehicle shadows and moving vehicles relative to the background. In [22], an enhanced GMM method is presented for the task of background subtraction. To that purpose, the authors use an adaptive variance estimator for the initialization of new modes. The value computed by the variance estimator is further employed to trigger a splitting rule, which is applied to over-dominating modes in order to avoid under-fitting the underlying background distribution. Dibyendu [23] proposed a Gaussian mixture model with an advanced distance measure based on support weights and histograms of gradients for background suppression. The method also employs a variable number of clusters for generalization. The main advantages of the method are the implicit use of pixel relationships through the distance measure, with least modification to the conventional GMM, and effective background noise removal through the use of a background-layer concept with no post-processing involved. Tom S.F. Haines [24] proposed a new method based on Dirichlet-process GMMs, which are used to compute per-pixel background distributions. Using a non-parametric Bayesian method allows per-pixel mode counts to be automatically inferred, avoiding over-/under-fitting. The rest of this paper is organized as follows. The details of the proposed method are presented in Section II. Some experimental results are provided in Section III. Finally, the paper is concluded in Section IV.
2. PROPOSED METHOD

In this section, we explain the proposed method in detail. Our algorithm consists of four phases, which together estimate the traffic status and compute the velocity of moving objects (Figure 1).
2.1 Overview of the Proposed Method

Receiving the images through a video surveillance camera in the first phase, we apply the GMM to each frame to achieve a precise background image. This process is repeated until an accurate background image is obtained; this phase is called the training phase. In the second phase, the received images are analyzed along with the trained images to extract the vehicles (moving objects). As mentioned above, the more precisely the background images are trained, the more accurately we can extract the vehicles. In the third phase, a green block surrounds each vehicle so that the system can count them. Either inaccurate training of the background images or the shadows of moving vehicles might cause problems in detecting vehicles in motion in the second phase. To solve these problems, we merge blocks that overlap other blocks so as to compute the traffic volume and density accurately. In the fourth phase, optical flow is used to compute the moment velocity of each vehicle based on improved Lucas-Kanade and Horn-Schunck methods. Finally, a traffic report is produced by post-processing.
2.2 Background Subtraction with the GMM Method

Background subtraction is based on four important steps, which are stated below (Figure 3):

Preprocessing
Temporal or spatial smoothing is employed in the early preprocessing stage to remove device noise, which can be a factor under different light intensities. Smoothing also includes removing environmental elements such as rain and snow. In real-time systems, the frame size and frame rate are commonly reduced to decrease the data processing rate. Another important factor in preprocessing is the data format used by the background subtraction model. Most algorithms handle luminance intensity, which is one scalar value per pixel.
Background Modeling
This step uses the new video frame to compute and update the background model. The main goal of background modeling is to be robust against environmental changes in the background, yet sensitive enough to detect all moving objects of interest.
Figure 1. Graphical representation of the proposed algorithm. Phase 1: live images received from the video surveillance system are used to compute the Gaussian mixture model (GMM), yielding a trained GMM background. Phase 2: vehicles (foreground) are extracted from the background images of the live video. Phase 3: vehicles are enclosed in blocks, and overlapping detected blocks are merged. Phase 4: the moment velocity of vehicles is detected via optical flow (Horn-Schunck and Lucas-Kanade methods); post-processing then estimates the moment and average velocity of vehicles, the traffic flow (TF), traffic volume, and traffic density, and reports the traffic via VMS.

Figure 2. Graphical representation of the Gaussian mixture method. Parameters are initialized; edges are detected in the current frame and an edge Gaussian mixture model is built for every pixel. If a pixel value matches one of the background Gaussian models, the pixel is classified as a background pixel; otherwise it is classified as a foreground pixel. The background models are then updated and the next image is input.
Figure 3. a) Original frame; b) background frame; c, d) detected edges of the background frame; e, f) foreground images.

Foreground Detection
In this step, candidate foreground pixels are identified. Foreground detection compares the video frame with the background model and identifies candidate foreground pixels in the frame. A widely used approach for foreground detection is to check whether a pixel is significantly different from the corresponding background estimate.

Data Validation
Finally, this step removes any pixels that are not relevant to the image. It involves improving the foreground mask based on information obtained from outside the background model. Most background models lack three main points: 1. they ignore any correlation between neighboring pixels; 2. the rate of adaptation may not match the moving speed of the foreground objects; 3. non-stationary pixels, from moving leaves or shadows cast by objects in motion, are at times mistaken for true foreground objects (Figure 2).
2.3 Algorithm of the Gaussian Mixture Model

To give a better understanding of the algorithm used for background subtraction, the following steps are adopted to achieve the desired results: 1. First, we compare each input pixel to the mean 'μ' of the associated components. If the value of a pixel is close enough to a chosen component's mean, that component is counted as the matched component: to be a matched component, the difference between the pixel and the mean must be less than a fixed multiple of the component's standard deviation. 2. Second, we update the Gaussian weight, mean, and standard deviation (variance) to reflect the newly obtained pixel value. For non-matched components the weights 'w' decrease, whereas the mean and standard deviation stay the same; how fast they change depends on the learning rate 'p'. 3. Third, we identify which components are part of the background model by applying a threshold to the component weights 'w'. 4. Fourth, in the final step we determine the
foreground pixels: the pixels that do not correspond to any component identified as background are classified as foreground. A Gaussian mixture model can be formulated in general as

P(x_t) = Σ_{i=1}^{K} w_{i,t} · η(x_t; μ_{i,t}, Σ_{i,t})   (1)

where, obviously,

Σ_{i=1}^{K} w_{i,t} = 1.   (2)

The mean of such a mixture equals

E[x_t] = Σ_{i=1}^{K} w_{i,t} · μ_{i,t},   (3)

that is, the weighted sum of the means of the component densities. Here x_t is the variable representing the current pixel in the frame, K is the number of distributions, t represents time (i.e., the frame index), w_{i,t} is an estimate of the weight of the ith Gaussian in the mixture at time t, μ_{i,t} is the mean of the ith Gaussian at time t, and Σ_{i,t} is the covariance matrix of the ith Gaussian at time t.

The conventional GMM has a number of problems. It does not take into account the spatial relationship between neighboring pixels. Pixels belonging to one object often share similar characteristics such as color, intensity, and edge orientation, which can help reduce the misclassification problem. The distance formulation in GMM is based only on the Euclidean distance between the color vectors. This distance often fails because different value pairs in separate channels may yield the same distance; thus, a foreground pixel may be confused with a background pixel having a similar color distance, even if they have different colors. The conventional GMM also uses a fixed number of clusters, while different video sequences contain single or multiple background motions and varying amounts of foreground; even a single sequence may undergo changes that call for more or fewer clusters. The proposed method handles these problems using the merging step in the second phase.

In the Radon transform, light is shone on the object and, based on the produced shadow, it is determined whether an edge is present (Figure 4). If the light angle changes, the result is projected onto the x axis as the vertical distance; when the light angle is 90 degrees, the projection is longest.
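To make steps 1-4 and Eqs. (1)-(3) above concrete, the following NumPy sketch performs one update of a K-component mixture for a single gray-level pixel. It is an illustration in the spirit of the standard Stauffer-Grimson update, not this paper's exact implementation; the learning rate, the 2.5-standard-deviation match rule, and the weight threshold for the background test are all assumed values.

import numpy as np

K, alpha = 3, 0.01                   # number of components and learning rate (assumed)
w  = np.array([0.6, 0.3, 0.1])       # component weights, summing to 1 (Eq. 2)
mu = np.array([120.0, 60.0, 200.0])  # per-component means
sd = np.array([10.0, 10.0, 10.0])    # per-component standard deviations

def update(x):
    """One GMM update for scalar pixel value x; returns True if x is foreground."""
    # Step 1: a component matches if |x - mu| < 2.5 standard deviations (assumed multiple).
    match = np.abs(x - mu) < 2.5 * sd
    # Step 2: all weights decay, the matched component is reinforced, then renormalize.
    w[:] = (1 - alpha) * w + alpha * match
    w[:] /= w.sum()                  # restore Eq. (2)
    if match.any():
        i = int(np.argmax(match))    # index of a matched component
        rho = alpha                  # simplified second learning rate
        mu[i] = (1 - rho) * mu[i] + rho * x
        sd[i] = np.sqrt((1 - rho) * sd[i]**2 + rho * (x - mu[i])**2)
    # Steps 3-4: high-weight components form the background model;
    # x is foreground if it matched no background component.
    background = w > 0.25            # assumed weight threshold
    return not (match & background).any()

is_foreground = update(135.0)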
Figure 4. Radon Transform.

Figure 5. The Merging Method.

Figure 6. Categories of the Merging Step (Categories 1-4).
2.4 Merging Step

In the merging step, we use the following model to find the distance between two blocks and to determine whether the two blocks overlap. If the coordinates of A and B are defined at the centers of the blocks, then based on the Euclidean distance we have (Figure 5)

d_x = |x_A − x_B|,  d_y = |y_A − y_B|,  d = √(d_x² + d_y²)   (4)

where d_x and d_y are the longitudinal and latitudinal distances and d is the Euclidean distance between the two rectangles. If d_x < (w_A + w_B)/2 or d_y < (h_A + h_B)/2, an overlap may have occurred; otherwise, the blocks are separate. In other words, if the distance along an axis is smaller than half of the total width (or height) of the two rectangles, an overlap might happen, and the blocks certainly overlap only when both conditions hold. Whenever there is overlap between blocks, we merge them into a modified block. The overlapping blocks take one of four forms; considering the minimum coordinates of each block and the mentioned algorithm, the modified block is formed. Figure 6 indicates the four probable forms in which merging might occur. In the first and fourth forms, if a vehicle-determining block lies within a bigger block (Figure 6), the smaller one located within the bigger block must not be taken into account. If merging takes place in the second and third forms, the dotted blocks are elected as the vehicle-determining blocks. Hence, with the merging algorithm, calculation of the traffic density and volume is much more precise than with the common GMM method. Pseudo-code for the merging stage of this method for video sequences is given in Table I; note that in this pseudo-code we process just one block of a vehicle in one frame. We assume a horizontal line on the image received from the video surveillance camera. This line counts the vehicles even on two-way roads; two lines can be used on two-way roads to count vehicles traveling in both directions. Once a vehicle passes the line, the vehicle count is incremented. For long vehicles such as trucks, the initial bounding box may cross the vehicle-count line in one frame, while in several subsequent frames the box might not yet have cleared the count line, which leads to counting the same vehicle several times. To solve this problem, two threshold lines can be defined: cars that cross the first threshold line and have not crossed the second are counted, and it should be noted that vehicles between these two lines are not counted again.
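A compact sketch of the overlap test of Eq. (4) and the merge itself follows (boxes given as (x, y, w, h) tuples with (x, y) the top-left corner; this mirrors the idea of Table I rather than reproducing it, and the sample boxes are hypothetical):

def overlap(a, b):
    """Strict geometric overlap: the centers are closer than half the summed
    extents on BOTH axes. Table I below uses a looser disjunctive ('or')
    proximity test along the interconnection line."""
    ax, ay = a[0] + a[2] / 2, a[1] + a[3] / 2  # center of box a
    bx, by = b[0] + b[2] / 2, b[1] + b[3] / 2  # center of box b
    dx, dy = abs(ax - bx), abs(ay - by)        # longitudinal/latitudinal distances
    return dx < (a[2] + b[2]) / 2 and dy < (a[3] + b[3]) / 2

def merge(a, b):
    """Replace two overlapping boxes by their bounding union (categories 2 and 3
    of Figure 6); if one box lies inside the other (categories 1 and 4), the
    union is simply the bigger box, so the inner box is discarded implicitly."""
    x0, y0 = min(a[0], b[0]), min(a[1], b[1])
    x1 = max(a[0] + a[2], b[0] + b[2])
    y1 = max(a[1] + a[3], b[1] + b[3])
    return (x0, y0, x1 - x0, y1 - y0)

# Repeatedly merge until no pair of boxes overlaps (boxes are hypothetical).
boxes = [(10, 10, 40, 30), (30, 20, 40, 30), (200, 50, 50, 40)]
merged = True
while merged:
    merged = False
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if overlap(boxes[i], boxes[j]):
                boxes[i] = merge(boxes[i], boxes[j])
                del boxes[j]
                merged = True
                break
        if merged:
            break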
2.5 Optical Flow Step

The optical flow block estimates the direction and speed of object motion from one image to another or from one video frame to another using either the Horn-Schunck or the Lucas-Kanade method.
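Since OpenCV does not ship a Horn-Schunck solver, the following NumPy sketch shows the classical iteration (an illustrative minimal solver, not the improved variants used in this paper; the smoothness weight, iteration count, and file names are assumptions). A Lucas-Kanade counterpart was sketched in Section 1.7.

import cv2
import numpy as np

def horn_schunck(im1, im2, alpha=1.0, n_iter=100):
    """Minimal dense Horn-Schunck solver; returns per-pixel flow (u, v)
    in pixels per frame."""
    im1 = im1.astype(np.float32) / 255.0
    im2 = im2.astype(np.float32) / 255.0
    # Spatial derivatives of the first frame and the temporal derivative.
    Ix = cv2.Sobel(im1, cv2.CV_32F, 1, 0, ksize=3) / 8.0
    Iy = cv2.Sobel(im1, cv2.CV_32F, 0, 1, ksize=3) / 8.0
    It = im2 - im1
    u = np.zeros_like(im1)
    v = np.zeros_like(im1)
    avg_kernel = np.array([[0, 0.25, 0], [0.25, 0, 0.25], [0, 0.25, 0]], np.float32)
    for _ in range(n_iter):
        # Local flow averages enforce the global smoothness constraint.
        u_avg = cv2.filter2D(u, -1, avg_kernel)
        v_avg = cv2.filter2D(v, -1, avg_kernel)
        num = Ix * u_avg + Iy * v_avg + It
        den = alpha**2 + Ix**2 + Iy**2
        u = u_avg - Ix * num / den
        v = v_avg - Iy * num / den
    return u, v

# Usage on two consecutive grayscale frames (hypothetical file names):
f1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
f2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)
u, v = horn_schunck(f1, f2)
moment_speed = np.sqrt(u**2 + v**2)  # per-pixel speed in pixels/frame

The per-pixel speed in pixels/frame converts to km/h with the same frame-rate and metres-per-pixel calibration described in Section 1.7, which is how the moment velocity of each detected vehicle block can be reported.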
Figure 7. Algorithm of merging: a) before the merging step; b) after the merging step.

TABLE I. ALGORITHM MERGING: MERGING OVERLAPPED CAR-DISCRIMINATING BOXES

bbox_d   Bounding boxes initially found by the GMM for each car, each with the following properties: (i, j), (h, w), and SIndex
(i, j)   Coordinates of the bounding box starting point
(h, w)   Height and width of the bounding box
SIndex   Searching index, used to avoid repetition
num      Number of initial bounding boxes
num_d    Number of final bounding boxes
bb1      Euclidean distance between box centers
θ        Angle of the interconnection line with the zero-tangent line
bbox_n   Finally merged bounding boxes

for i = 1 to num-1 do
  for j = i+1 to num do
    if bbox_d(i,end) == 0 or bbox_d(j,end) == 0 then
      1. Stack the contents of bbox_d for the ith bounding box with those of the jth bounding box: dd = (bbox_d(i,1:end-1); bbox_d(j,1:end-1)).
      2. Find the index of dd that matches min(dd) as ss1, and the index that matches max(dd) as ss2.
      3. Find the Euclidean distance bb1 between the centers of the ith and jth bounding boxes, and compute the tangent mm of the line connecting those centers.
      if bb1 × sin(atan(mm)) < (bbox_d(i,3) + bbox_d(j,3))/2 or bb1 × cos(atan(mm)) < (bbox_d(i,4) + bbox_d(j,4))/2 then merge the ith and jth boxes into bbox_n.