International Conference on Convergence of Technology - 2014
Tracking Manifold Objects in Motion Using Gaussian Mixture Model and Blob Analysis
D. Harihara Santosh, Dr. P.G. Krishna Mohan
improperly segmented. Basic causes of erroneous segmentation include long shadows and full or partial occlusion of objects with each other. Dealing with shadows and occlusions, both at the segmentation level and at the tracking level, is essential for a robust algorithm. Video tracking methods are classified according to the needs of the application and the techniques used for its solution. There are two broad ways of classifying object tracking: one is based on correspondence matching, and the other is based on motion estimation and position prediction. Methodologies that track object parts adopt model-based schemes to locate and track body parts; the cardboard model, stick figure, 2D contour and 3D volumetric models are examples of such models. Using the cardboard model, human parts such as hands, head, feet and torso can be tracked, and appearance templates of objects are maintained even in split and merge cases. The algorithm presented here uses blob analysis together with the Gaussian Mixture Model, a background-modeling method, to track objects in motion and to predict their trajectories. A video tracking algorithm inspects the video frames and locates the moving target in each frame. When the target moves at a relatively high speed compared with the frame rate, identifying its location in a given frame becomes difficult. Object tracking systems therefore usually adopt a motion model that explains how the shape of the target may change under its various possible movements.
Abstract—Object tracking is a primary step in image processing applications such as object recognition, navigation systems and surveillance systems. The conventional approach differentiates the current image from a background image; image subtraction based algorithms are mainly used to extract the features of moving objects from the information in successive frames. The algorithm presented here has three stages: color extraction, foreground detection using a Gaussian Mixture Model, and object tracking using blob analysis. The first step extracts the required color from a given picture frame, after which the Gaussian Mixture Model and blob analysis are applied to detect the moving objects present in the foreground. Together these stages track the moving object across the frames of a video sequence so that its motion can be observed.

Index Terms—gaussian mixture model, multiple object tracking, blob analysis, background subtraction, foreground detection
I. INTRODUCTION

Tracking a moving object in video pictures is of great importance in computer vision, and object tracking is a primary step in surveillance systems, navigation systems and object recognition. Object tracking in real-time environments has many applications [1]: security and surveillance, to recognize people and provide a better sense of security using visual information; medical therapy, to improve the quality of life of physical therapy patients and disabled people; retail space instrumentation, to analyze the shopping behavior of customers and enhance building and environment design; video abstraction, to obtain automatic annotation of videos and generate object-based summaries; traffic management, to analyze flow and detect accidents; and video editing, to eliminate cumbersome human operator interaction and design futuristic video effects.
II. COLOR EXTRACTION

A color image is composed of the three basic colors Red, Green and Blue, so every pixel of a color image can be split into Red, Green and Blue values. Consequently, three matrices are obtained from the whole image, each representing one color component; they are arranged sequentially, one after the other, forming three matrices of dimension m by n.
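The channel split described above can be sketched in a few lines of numpy; the image contents here are illustrative, not taken from the paper's video:

```python
import numpy as np

# Hypothetical 4x4 RGB frame; in practice this would be a video frame.
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[:, :, 0] = 200  # red channel
image[:, :, 1] = 180  # green channel
image[:, :, 2] = 30   # blue channel

# Split the m-by-n-by-3 array into three m-by-n matrices,
# one per color component.
R = image[:, :, 0]
G = image[:, :, 1]
B = image[:, :, 2]

print(R.shape, int(R[0, 0]), int(G[0, 0]), int(B[0, 0]))
```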
Computer vision researchers are interested in tracking because it is both a difficult and a significant problem. The main task in tracking is to obtain a correspondence between objects, or object parts, in consecutive frames of video [2]. It is a critical task in image processing applications because it provides cohesive temporal data about moving objects, which is used both to improve lower-level processing such as motion segmentation and to enable higher-level data extraction such as behavior recognition and activity analysis. Tracking becomes tedious to apply in sophisticated scenes because objects are
978-1-4799-3759-2/14/$31.00©2014 IEEE
The three basic colors mentioned above are used to extract other color components from the color image, as shown in the equation below; the extraction of the three basic color components using these equations is shown in the figure below. The main goal of this work is to track institutional buses, which are generally yellow in color. The question, then, is how this
yellow can be obtained, and the answer is that the color combination of red and green gives yellow. Once the basic components have been extracted from an image, yellow can be extracted from the three basic components, Red, Green and Blue, using the equation below.
R = image(:, :, 1);    B = image(:, :, 3);
is not a simple task. For video surveillance, researchers have developed various backend processing techniques, and various automated software packages are available for analyzing video footage; large body motions and objects are tracked using them.
G = image(:, :, 2);
A. Background Subtraction

Background subtraction relies on the four critical steps below.

1) Preprocessing: Temporal or spatial smoothing is applied at the early preprocessing stage to eradicate device noise, which can be an issue under varying light intensities. Smoothing also removes transient environmental elements such as rain and snow. In real-time systems, frame rate and frame size are commonly reduced to lower the data rate to be processed. Another important preprocessing factor is the data format used by the background subtraction model: most algorithms process luminance intensity, a single scalar value per pixel [3]. Of the two figures shown below, the left one shows snow and the right one shows the result of applying spatial and temporal smoothing, which removes the snow effectively and yields a clearer image.
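The smoothing step can be sketched as follows; the frame sizes, pixel values and the simple box filter are illustrative choices, not the paper's settings:

```python
import numpy as np

# Synthetic stack of 5 grayscale frames with one impulsive "snow" speck;
# the sizes and values here are illustrative only.
frames = np.full((5, 8, 8), 100.0)
frames[2, 4, 4] = 255.0  # a bright speck present in frame 2 only

# Temporal smoothing: averaging each pixel over the frame stack
# suppresses specks that appear in only one frame.
temporal = frames.mean(axis=0)  # (255 + 4 * 100) / 5 = 131 at (4, 4)

# Spatial smoothing: a 3x3 box filter built from circular shifts
# (adequate away from the image border).
def box3(img):
    return sum(np.roll(np.roll(img, di, 0), dj, 1)
               for di in (-1, 0, 1) for dj in (-1, 0, 1)) / 9.0

smoothed = box3(temporal)  # the speck is attenuated further
print(round(float(temporal[4, 4]), 2), round(float(smoothed[4, 4]), 2))
```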
Figure 1. RGB color components related to grayscale
Y = (R + G - B) / 2    (1)
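A minimal numpy sketch of the yellow extraction, assuming the yellow response is formed from the red and green channels with blue subtracted, as in equation (1); the cut-off of 150 is an illustrative assumption, not the paper's value:

```python
import numpy as np

# Two hypothetical pixels: one yellow (high R and G, low B), one blue-ish.
R = np.array([[250.0, 10.0]])
G = np.array([[240.0, 20.0]])
B = np.array([[30.0, 200.0]])

# Yellow response built from red and green, penalizing blue.
Y = (R + G - B) / 2

# Pixels with a strong yellow response are kept; 150 is an assumed cut-off.
yellow_mask = Y > 150
print(yellow_mask)
```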
III. OBJECT DETECTION USING GAUSSIAN MIXTURE MODEL

A Gaussian Mixture Model (GMM) is a parametric model of a probability density function for object detection, represented as a weighted sum of Gaussian component densities. GMMs are commonly used to model the probability distribution of continuous measurements, or of features in biometric systems, such as tracking an object based on its color in a video.
Figure 2. The image on the left shows snow; the image on the right shows the result of the smoothing effect.
In numerous computer vision applications, there is a need to recognize moving objects in sequential video frames. This can be achieved by applying background subtraction, which helps recognize the moving objects in every part of the video frames.
2) Background Modeling: This step computes a background model from the video frames. The main goal of the background model is to withstand alterations of the background environment, which otherwise make it complicated to recognize the moving objects.
Background subtraction is also called segmentation. It is a technique used mostly in applications such as target recognition, video surveillance and bank security.
3) Foreground Detection: This step recognizes the foreground pixels of the frame by differentiating the video frame from the background model. The general approach is to verify whether each pixel is distinguishable from the background.

4) Data Validation: The final step eradicates the pixels that are not used in the image. It involves improving the foreground mask based on data actually extracted from the background model. There are key points where the approach lags:
1. It disregards any correlation among adjacent pixels.
2. The rate of adaptation may not be compatible with the motion velocity of the foreground object [5].
3. Non-stationary pixels, from cast shadows or leaves moved by moving objects, deviate from
The Gaussian Mixture Model can be used as the background model. To get the anticipated outcome, background pixels are removed from the frames of the video [4]. Background subtraction involves several issues that complicate building an algorithm capable of recognizing the object; moreover, the algorithm should be able to respond to changes such as moving and halting objects and varying illumination. Surveillance is the observation of actions, behavior or other changing information, often of people and in a stealthy way. Video surveillance is generally applied to recognizing events and recognizing humans. However, recognizing events and tracking them
the true background model.
decision rule R: The outcomes of background subtraction are generally passed to higher-level modules; for instance, the identified objects are then tracked. By tracking an object one can obtain information about it, and that information can in turn be applied to improve the background subtraction. In general, nothing is known about the foreground objects that may appear, so we set p(FG) = p(BG) and adopt a uniform distribution for the appearance of a foreground object, p(x(t) | FG) = c_FG. We conclude that a pixel belongs to the background only if it satisfies:
B. Algorithm of the Gaussian Mixture Model: To understand the algorithm applied in background subtraction, consider the following steps. First, every input pixel is compared against the mean 'µ' of each component. If the pixel value is close to the mean of a component, that component is considered a matching component: the difference between the pixel and the mean must be less than the component's standard deviation scaled by a factor D [6]. Second, the Gaussian weight, mean and standard deviation (variance) of the matched component are updated to reflect the new pixel value, while the non-matched components have their weights 'w' decreased and their means and standard deviations left unchanged; a learning rate 'p' governs how quickly these adaptations follow instant alterations. Third, the components that form part of the background model are identified by applying a threshold to the component weights 'w'. Fourth, the foreground pixels are determined: a pixel is recognized as foreground if it does not match any of the components assigned to the background.
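The matching and update steps above can be sketched for a single grayscale pixel as follows; the number of components, the learning rate and the deviation factor D are illustrative values, and the update is a simplified version of the scheme described, not the paper's exact implementation:

```python
import numpy as np

# One pixel modeled by K Gaussian components (grayscale, for brevity).
K, D, alpha = 3, 2.5, 0.05            # illustrative defaults
w     = np.array([0.6, 0.3, 0.1])     # component weights
mu    = np.array([100.0, 180.0, 30.0])
sigma = np.array([10.0, 15.0, 20.0])

def update(x):
    """Match pixel value x against the K components and update them."""
    global w, mu, sigma
    match = np.abs(x - mu) < D * sigma      # within D standard deviations
    if match.any():
        i = int(np.argmax(match))           # first matched component
        mu[i] += alpha * (x - mu[i])
        sigma[i] = np.sqrt((1 - alpha) * sigma[i] ** 2
                           + alpha * (x - mu[i]) ** 2)
        w = (1 - alpha) * w                 # decay all weights...
        w[i] += alpha                       # ...and boost the matched one
    else:
        # No match: replace the weakest component with one centered on x.
        i = int(np.argmin(w))
        mu[i], sigma[i] = x, 30.0
        w = (1 - alpha) * w
        w[i] += alpha
    w /= w.sum()                            # keep weights summing to 1
    return bool(match.any())

matched = update(102.0)   # close to the first mean -> matched as background
print(matched, round(float(mu[0]), 1))
```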
p(BG | x(t)) / p(FG | x(t)) = [p(x(t) | BG) p(BG)] / [p(x(t) | FG) p(FG)] > R    (5)

p(x(t) | BG) > c_thr ( = R · c_FG )    (6)
where c_thr is the threshold value and p(x(t) | BG) is the background model. The background model is estimated from a training set denoted as X.
C. General formula of Gaussian Mixture Model: A Gaussian mixture model can be formulated in general as follows:
P(X_t) = Σ_{i=1}^{K} ω_{i,t} · η(X_t; μ_{i,t}, Σ_{i,t})    (2)

E. Methodology
The GMM [7] is defined as a combination of K Gaussian distributions that analyze the state changes of corresponding pixels across frames. The algorithm applies Gaussian mixtures to each frame and converts the images from color to binary [8]. A pixel whose state does not change is represented by the value 0, i.e. black, and a pixel that changes state is represented by the value 1, i.e. white. In this way, the positions of the moving objects in the video are obtained.
where, obviously,

Σ_{i=1}^{K} ω_{i,t} = 1    (3)
The mean of such a mixture equals

μ_t = Σ_{i=1}^{K} ω_{i,t} μ_{i,t}    (4)
Here X_t is the variable signifying the current value of the pixel in the frame, K is the number of distributions, t represents time (i.e., the frame index), ω_{i,t} is the estimated weight of the ith Gaussian in the mixture at time t, μ_{i,t} is the mean of the ith Gaussian in the mixture at time t, and Σ_{i,t} is the covariance matrix of the ith Gaussian in the mixture at time t.
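Equations (3) and (4) can be checked numerically; the weights and means below are illustrative:

```python
import numpy as np

# Illustrative three-component mixture at one pixel and one time step t.
w  = np.array([0.5, 0.3, 0.2])          # omega_{i,t}, must sum to 1 (eq. 3)
mu = np.array([90.0, 150.0, 200.0])     # mu_{i,t}

# Equation (4): the mixture mean is the weight-averaged component mean.
mu_t = float(np.sum(w * mu))
print(mu_t)
```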
The GMM is thus a mixture of K Gaussian distributions that model the state of each pixel over time.
Figure 3.a
D. Necessity of GMM:
Figure 3 below shows that the pixels corresponding to the road in the background image do not change state; consequently they take the value 0 and appear black, as revealed in figure 3.b. The pixels corresponding to the cars experience a radical state change, hence they take the value 1 and the cars appear white
The pixel at time t in the basic RGB colors is represented by x(t). Pixel-based background subtraction involves deciding whether the pixel belongs to the background (BG) or to some foreground object (FG). This is formulated as a Bayesian
Figure 3.b
in color, as shown in figure 3.b.

Number of pixels – Particle area, without any holes, in units of pixels.
Particle area – Particle area expressed in real units based on the spatial calibration of the image. If 1 pixel signifies 1 square unit, this value equals Number of pixels.
IV. BLOB ANALYSIS

In image processing, a blob is defined as a region of connected pixels. Blob analysis is the identification and study of such regions in an image. The algorithms distinguish pixels by their value and place each into one of two categories: the foreground, where pixels have a non-zero value, and the background, where pixels have the value zero.
Scanned area – The area of the entire image in real units. This value equals the product (Resolution X × X Step) × (Resolution Y × Y Step).
Ratio – The ratio of the particle area to the entire image area, i.e. the percentage of the image occupied by all particles.
In classic applications of blob analysis, the calculated blob features typically include area and perimeter, blob shape, Feret diameter and location. The flexibility of blob analysis tools makes them appropriate for a wide variety of applications, such as pharmaceutical inspection, pick-and-place, or inspection of food for foreign material.
Ratio = (Particle area / Scanned area) × 100    (7)
Number of holes – The number of holes inside a particle. The software detects holes inside a particle as small as 1 pixel.
Holes' area – The total area of the holes within a particle.
Total area – The area of a particle plus the area of its holes. This value equals Particle area + Holes' area.
Since a blob is a region of connected foreground pixels, blob analysis tools classically treat touching foreground pixels as parts of the same blob. Consequently, objects that the human eye easily recognizes as separate may, if their blobs touch, be interpreted by the software as a single blob. Likewise, a portion of a blob may be mistaken for background pixels when reflections or lighting interfere during analysis.
Note: A particle situated inside the hole of a bigger particle is recognized as a separate particle. The area of a hole that contains a particle includes the entire area covered by that particle.
Blob analysis is applied to recognize blobs whose spatial characteristics satisfy the required criteria. In many applications where computation is time-consuming, blob analysis based on spatial characteristics is used to discard the blobs that are of no use and keep only the required blobs for further analysis. Furthermore, statistical information can be derived, such as the size and number of blobs, and the presence and location of blob regions.
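A minimal sketch of blob identification by connected-component labeling, with blob area as the spatial characteristic; the mask contents and the 4-connectivity choice are illustrative assumptions:

```python
import numpy as np
from collections import deque

# Binary foreground mask with two separate blobs (illustrative).
mask = np.array([
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 1],
    [0, 0, 0, 0, 1],
], dtype=np.uint8)

def label_blobs(mask):
    """4-connected component labeling; returns label image and blob areas."""
    labels = np.zeros_like(mask, dtype=int)
    areas, current = {}, 0
    for si, sj in zip(*np.nonzero(mask)):
        if labels[si, sj]:
            continue                      # already part of a labeled blob
        current += 1
        labels[si, sj] = current
        areas[current] = 0
        q = deque([(si, sj)])
        while q:                          # breadth-first flood fill
            i, j = q.popleft()
            areas[current] += 1
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if (0 <= ni < mask.shape[0] and 0 <= nj < mask.shape[1]
                        and mask[ni, nj] and not labels[ni, nj]):
                    labels[ni, nj] = current
                    q.append((ni, nj))
    return labels, areas

labels, areas = label_blobs(mask)
print(len(areas), sorted(areas.values()))
```

Small blobs can then be discarded by thresholding the returned areas before any further analysis.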
A. Blob Analysis Concepts

A classic blob analysis scans the whole image, recognizes all the particles, or blobs, and builds a full report on each particle. The report typically covers around 50 pieces of data about the blob, including its position in the image, shape, size, longest segment, orientation relative to other blobs, and moment of inertia.
Fig 4: Different types of area indications

V. RESULTS

The image components R, G and B are individually extracted. From these, the yellow objects are extracted and the remaining objects are removed from the image. Finally the Gaussian Mixture Model is applied to extract the foreground image, which contains the yellow-colored objects.
The parameters used to recognize and organize these blobs are area, angle, perimeter and center of mass. Using such parameters can be faster and more effective than pattern matching in various applications.
After extracting the foreground from the image, buses with yellow color are separated from other yellow-colored objects based on blob analysis, using the blob area. In addition to the blob area, the bounding box area is calculated, and the area of the bounding box and blob area
B. Blob analysis to calculate area: Some of the area parameters are described below:
ratio is found. Based on this ratio, we classify whether the blob is a bus or not: if the ratio is greater than the set threshold, we classify it as a bus. Rectangles are drawn around the detected buses, and the count of buses tracked is displayed in the top left corner of the output video file.
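The bus classification described above can be sketched as follows; the blob measurements and the 0.7 ratio threshold are illustrative, since the paper does not state its threshold value:

```python
# Hypothetical blob measurements, in pixels.
def classify_bus(blob_area, bbox_w, bbox_h, ratio_threshold=0.7):
    """A roughly rectangular blob such as a bus fills most of its
    bounding box, so a high blob-to-bounding-box area ratio is taken
    as evidence of a bus."""
    ratio = blob_area / (bbox_w * bbox_h)
    return ratio > ratio_threshold

print(classify_bus(1800, 50, 40))  # ratio 0.9: classified as a bus
print(classify_bus(600, 50, 40))   # ratio 0.3: not a bus
```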
is displayed in a black box at the top of the output frame, as observed in figure 6.1.d. Although there are two yellow buses in the input frame, a bounding box is drawn around only the bus present in the ROI (Region Of Interest); the ROI is the region below the white line in the output frame, as shown in figure 6.1.d.
Figure 5.a displays the 750th frame of the video sequence, which contains two yellow buses and two other non-yellow vehicles. The R, G, B components are extracted individually from this frame, and the yellow-colored objects are detected using these components while all non-yellow objects are neglected; the yellow objects appear brighter than the remaining objects. The Gaussian Mixture Model is then applied to extract the foreground containing only the yellow objects. After extraction of the foreground, blob analysis is used to separate the yellow buses from the remaining yellow objects: the areas of the blob and of its bounding box are calculated, along with their ratio. For the yellow bus, the calculated ratio exceeds the set threshold, so a bounding box is drawn around the bus and the number of buses tracked is displayed in a black box at the top of the output frame, as observed in figure 5.b. Although there are two yellow buses in the input frame, a bounding box is drawn around only the bus present in the ROI (Region Of Interest); the ROI is the region below the white line in the output frame, as shown in figure 5.b.
Figure 6.2.a shows frame 3335 of the traffic video sequence, which contains one auto and two other non-yellow vehicles. The R, G, B components are extracted individually from this frame, and the yellow objects are detected using these components while all non-yellow objects are neglected; the yellow objects appear brighter than the remaining objects, as clearly observed in figure 6.2.b. The Gaussian Mixture Model is then applied to figure 6.2.b to extract the foreground containing only the yellow objects, shown in figure 6.2.c. After extracting the foreground, the areas of the blob and of its bounding box are calculated using blob analysis, along with their ratio. Although the auto is yellow, no bounding box is drawn around it, as shown in figure 6.2.d, because the ratio of the areas does not exceed the set threshold.
Figure 5.a
Figure 6.1.a: Frame 1422 of the traffic video sequence
Figure 6.1.b: Yellow color extraction from frame 1422
Figure 6.1.c: Foreground detection from frame 1422
Figure 6.1.d: Yellow bus in frame 1422 is tracked
Figure 5.b
Figure 6.1.a shows frame 1422 of the traffic video sequence, which contains two yellow buses and two other non-yellow vehicles. The R, G, B components are extracted individually from this frame, and the yellow objects are detected using these components while all non-yellow objects are neglected; the yellow objects appear brighter than the remaining objects, as clearly observed in figure 6.1.b. The Gaussian Mixture Model is then applied to figure 6.1.b to extract the foreground containing only the yellow objects, shown in figure 6.1.c. After extracting the foreground, blob analysis is used to separate the yellow buses from the remaining yellow objects: the areas of the blob and of its bounding box are calculated, along with their ratio. For the yellow bus, the calculated ratio exceeds the set threshold, so a bounding box is drawn around the bus and the number of buses that have been tracked
From the figure above it is observed that the second bus is not detected: the first bus occludes it, and as a result the second vehicle is not recognized as a bus.
successfully. Thus the overall efficiency of the proposed algorithm is 91.8%.

VI. CONCLUSIONS
Figure 6.2.a: Frame 3335 of the traffic video sequence
Figure 6.2.c: Foreground detection from frame 3335
We have proposed an object tracking algorithm for video pictures based on GMM and blob analysis applied to frames in a simple feature space. Simulation results for frame sequences of moving buses (especially yellow-colored ones) verify the suitability of the algorithm for reliable moving-object tracking. Initially the R, G, B components of the images are extracted individually; using these components, yellow objects are detected while all remaining objects are excluded, and the Gaussian Mixture Model is then applied to extract the foreground containing the yellow objects. One concern is that linear motion estimation may fail for objects moving in a complicated nonlinear way. However, if the movement is not extremely fast, the deviation from the estimated positions between successive frames is small, and correct tracking is reliably achieved. Furthermore, if mistracking occurs at some frame because of occlusion or newly appearing or disappearing objects, the proposed algorithm can recover correct tracking after a couple of frames. The GMM makes the algorithm more robust, such that the tracker recovers tracking when occlusion takes place.
Figure 6.2.b: Yellow color extraction from frame 3335
Figure 6.2.d: Auto in frame 3335 is not tracked
Thus, this method is capable of discriminating the yellow buses from remaining yellow colored objects.
REFERENCES
A. Case study:

TABLE 2: TRACKING EFFICIENCY OF THE PROPOSED ALGORITHM
Time slot          | Buses passed | Buses detected | Efficiency in %
3:30 pm-3:45 pm    | 11           | 10             | 90.9
3:45 pm-4:00 pm    | 13           | 12             | 92.3
4:00 pm-4:15 pm    | 4            | 4              | 100
4:15 pm-4:30 pm    | 6            | 6              | 100
4:30 pm-4:45 pm    | 17           | 15             | 88.2
4:45 pm-5:00 pm    | 10           | 9              | 90
Overall Efficiency |              |                | 91.8
In order to estimate the density of institutional buses and to compute the efficiency of the proposed algorithm, six time slots between 3:30 pm and 5:00 pm, each of 15 minutes, have been considered. In Table 2 it can be observed that the efficiency of the proposed algorithm varies between 88.2 and 100 percent. The observed causes of this variation are: 1. the average speed of the vehicles falling below the threshold; 2. occlusions caused while overtaking other vehicles.
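The per-slot and overall efficiency figures in Table 2 follow directly from the counts; a quick check:

```python
# Per-slot counts from Table 2: (buses passed, buses detected).
slots = [(11, 10), (13, 12), (4, 4), (6, 6), (17, 15), (10, 9)]

# Efficiency per slot: detected / passed, as a percentage.
per_slot = [round(100.0 * d / p, 1) for p, d in slots]

# Overall efficiency: total detected / total passed.
total_passed = sum(p for p, _ in slots)
total_detected = sum(d for _, d in slots)
overall = round(100.0 * total_detected / total_passed, 1)
print(per_slot, total_passed, total_detected, overall)
```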
Mr. D. Harihara Santosh obtained his B.Tech. and M.Tech. degrees from JNT University, Hyderabad, in the years 2005 and 2010. He is presently pursuing his Ph.D. in Video Processing at JNTU, Hyderabad, under the guidance of Dr. P.G. Krishna Mohan. He has 9 publications in International and National Journals and has presented 20 papers at various
From the table it is observed that a total of 61 buses passed through, out of which 56 buses have been detected
[1] T. Morimoto, O. Kiriyama, Y. Harada: Object tracking in video images based on image segmentation and pattern matching. IEEE Conference Proceedings, vol. 05, pp. 3215-3218 (2005)
[2] Zehang, S., Bebis, G., Miller, R.: On-road vehicle detection: a review. IEEE Trans. Pattern Anal. Mach. Intell. 28(5), 694-711 (2006)
[3] Bin, D., Yajun, F., Tao, W.: A vehicle detection method via symmetry in multi-scale windows. In: 2nd IEEE Conference on Industrial Electronics and Applications (ICIEA 2007), pp. 1827-1831 (2007)
[4] Nedevschi, S., Vatavu, A., Oniga, F., Meinecke, M.M.: Forward collision detection using a stereo vision system. In: 4th International Conference on Intelligent Computer Communication and Processing (ICCP 2008), pp. 115-122 (2008)
[5] Reynolds, D.A., Rose, R.C.: Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE Transactions on Speech and Audio Processing 3(1), 72-83 (1995)
[6] Boeing, A., Braunl, T.: ImprovCV: Open component based automotive vision. In: IEEE Intelligent Vehicles Symposium 2008, pp. 297-302 (2008)
[7] Wei, L., XueZhi, W., Bobo, D., Huai, Y., Nan, W.: Rear vehicle detection and tracking for lane change assist. In: IEEE Intelligent Vehicles Symposium 2007, pp. 252-257 (2007)
[8] J. Wang, Y. Yagi: Integrating color and shape-texture features for adaptive real-time object tracking. IEEE Transactions on Image Processing 17(2), 235-240 (2008)
International and National Conferences. His areas of interest are Image and Video Processing.

Dr. P. G. Krishna Mohan is presently working as Professor at the Institute of Aeronautical Engineering, Hyderabad. He has worked as Head of the ECE Department, member of the BOS for the ECE faculty at the university level, Chairman of the BOS of the EIE group at the university level, Chairman of the BOS of the ECE faculty for JNTUCEH, member of selection committees for Kakatiya and Nagarjuna Universities and DRDL, and convener for university committees. He has more than 35 papers in various International and National Journals and Conferences. His areas of interest are Signal Processing and Communications.