2014 Fourth International Conference on Communication Systems and Network Technologies
Video Shot Segmentation Using Spatio-Temporal Fuzzy Hostility Index and Automatic Threshold

Hrishikesh Bhaumik, Siddhartha Bhattacharyya
Department of Information Technology, RCC Institute of Information Technology, Kolkata, India
{hbhaumik, dr.siddhartha.bhattacharyya}@gmail.com

Susanta Chakraborty
Department of Computer Science and Technology, Bengal Engineering and Science University, Shibpur, Howrah, India
[email protected]
Abstract—Shot segmentation is an important preprocessing step towards content based video analysis. In this paper, we propose a Spatio-Temporal Fuzzy Hostility Index (STFHI) for determining the edges of objects present in the frames composing the video. The edges present in the frames are treated as features of the frame. The correlation between the features is computed for successive frames of the video. An automatic threshold is set using the three-sigma rule on the gradient of the correlation values thus computed, to detect hard cuts (abrupt transitions) in the video. The proposed method is able to accurately detect the hard cuts in a video. In an experimental evaluation on a heterogeneous test set, consisting of videos from sports, movie songs, music albums and documentaries, the proposed method achieves substantial improvement over the state of the art methods.
Keywords—Shot boundary detection; hard cut detection; spatio-temporal fuzzy hostility index; image correlation; three-sigma rule

I. INTRODUCTION

Research in the field of content based video analysis, indexing and retrieval has proliferated over the past decade with the increasing number of digital videos available on the internet. Analysis of video content includes extraction of both high level and low level features [1, 2]. While the high level features describe the scene, events, location and objects in videos, low level features deal with video properties such as frame rates, resolution, color model and shot boundary detection [3]. Shot boundary detection comprises segmenting the video into its constituent shots and is the first step towards semantic analysis of a video [4, 5, 6, 7, 8, 9, 10, 11]. During the video editing process these constituent shots are combined by cut transitions or gradual transitions to form a video sequence. Cut transitions are abrupt scene changes, also referred to as hard cuts, while gradual transitions include fades, dissolves and wipes [11]. For finding hard cuts in a video V = {f_i : i = 1, 2, ..., n}, where f_i refers to the i-th image frame, the problem is to find successive frames f_i and f_{i+1} that are dissimilar in content. The frame f_i is referred to as the pre-cut frame while f_{i+1} is referred to as the post-cut frame.

Several metrics for computing similarity [12, 13, 14] between image frames have been proposed over time. The major ones include histogram comparisons [11, 15], independent component analysis [16], statistical differences [6, 17, 18], standard deviation of pixel intensities [19], edge change ratio [20] and motion vectors [21], resulting in several techniques for shot boundary detection [5, 7, 21, 22, 23, 24]. The frame-to-frame pixel intensity difference measure used by Zhang et al. [25] is sensitive to object and camera movement. This limitation is overcome by comparing the gray level histograms of successive frames [15, 25, 26]. Statistical techniques like mutual information and joint entropy used by Cernekova et al. [18] are found to perform better in detecting gradual transitions than histogram comparison approaches. Chávez et al. [27] and Ling et al. [28] have proposed supervised learning approaches using SVMs for detecting hard cuts and gradual transitions.

The aim of any shot boundary detection algorithm is to reduce the number of false hits and misses. Precision and recall have served as reliable metrics [29] for testing the robustness of information retrieval algorithms. A reduction in the number of false hits leads to an increase in precision, while reducing the misses corresponds to improved recall. If the threshold (T) for similarity is set to a high value, it may lead to higher precision but low recall. On the contrary, if the value of T is low, there may be more false alarms, causing lower precision but higher recall. Statistical analyses have shown that the hard cuts present in a video by far outnumber the gradual transitions in it. Hence our current work focuses on reliable detection of the hard cuts present in a video, thereby improving the precision and recall over other state of the art methods. The proposed method is immune to camera movement, change in luminance, background changes and rapid object movement.

978-1-4799-3070-8/14 $31.00 © 2014 IEEE    DOI 10.1109/CSNT.2014.106

II. BASIC CONCEPTS AND DEFINITIONS

A. Application of fuzzy set theory to images

The pixel values of a gray scale image are integers in the range 0 to 255. This 2-D matrix of values can be scaled to the range [0, 1] by dividing each element of the matrix by 255. Consider two fuzzy sets WHITE and BLACK with membership functions μ_W and μ_B respectively. Each set has n elements, where n denotes the number of pixels in the gray scale image. If a value 0 represents a completely black pixel and 1 a completely white one, then such scaling depicts the degree of
membership μ_W(p_i) of the i-th pixel p_i in the fuzzy set WHITE. The degree of membership μ_B(p_i) of the pixel p_i in the set BLACK can be represented as μ_B(p_i) = 1 − μ_W(p_i).
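As a minimal illustration of this scaling (the function names are ours, not the paper's), the WHITE and BLACK membership maps can be computed with NumPy:

```python
import numpy as np

def white_membership(gray_image):
    """Scale 8-bit gray levels to [0, 1]: each value becomes the pixel's
    degree of membership in the fuzzy set WHITE."""
    return gray_image.astype(np.float64) / 255.0

def black_membership(gray_image):
    # Membership in BLACK is the fuzzy complement of membership in WHITE.
    return 1.0 - white_membership(gray_image)
```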
B. Spatio-Temporal Fuzzy Hostility Index (STFHI)

The degree of homogeneity or heterogeneity of the neighborhood of a candidate pixel can be determined by computing its fuzzy hostility index [30]. The fuzzy hostility index thus indicates the amount of variation in the pixel neighborhood with respect to the pixel itself. The pixel hostility has a high value if the surrounding pixels differ strongly from the candidate pixel, i.e. if the heterogeneity in its neighborhood is high. According to Bhattacharyya et al. [30], in a second-order neighborhood the hostility index (ζ) of a pixel is defined as:

    ζ = (3/8) Σ_{i=1}^{8} |p − q_i| / (|p + 1| + |q_i + 1|)        (1)

where p is the membership value of the candidate pixel and q_i, i = 1, 2, ..., 8, are the membership values of its fuzzy neighbors in the second-order neighborhood fuzzy subset. The value of the fuzzy hostility index lies in [0, 1], with the maximum value 1 signifying heterogeneity and 0 indicating total homogeneity in the neighborhood.

Equation (1) can be effectively applied to a single image frame. However, a video consists of temporally separated frames and equation (1) needs to be modified. The STFHI (ζ̃) of a pixel in the i-th image frame f_i of a video is a function of the fuzzy hostility index of the candidate pixel in f_i and of the corresponding pixels in the previous frame f_{i−1} and the post frame f_{i+1}, expressed as follows:

    ζ̃_{f_i} = g(f_{i−1}, f_i, f_{i+1})        (2)

where ζ̃ in f_i is computed as the average of the hostility indices in f_{i−1}, f_i and f_{i+1}, except for the first and last frames of the video, where f_{i−1} and f_{i+1} respectively are not present. The 2-D matrix formed by computing ζ̃ for each pixel represents an image with pronounced edges of all objects present in the original image. Pixel intensity scaling is performed to make the edges more prominent compared to other portions of the image. The intensity scaling function (ψ) used in our work is:

    ψ(ζ̃_ij) = (ζ̃_ij)^2 if ζ̃_ij < 0.5,  (ζ̃_ij)^{1/2} if ζ̃_ij ≥ 0.5

where ζ̃_ij is the STFHI of the pixel at the i-th row and j-th column.

C. Correlation between Image Frames

Considering two 2-D matrices X and Y of the same dimensions, Pearson's correlation coefficient ρ(X, Y), also known as the product-moment correlation coefficient, between the matrices may be represented as:

    ρ_{X,Y} = (n Σ x_ij y_ij − Σ x_ij Σ y_ij) / √{(n Σ (x_ij)² − (Σ x_ij)²)(n Σ (y_ij)² − (Σ y_ij)²)}        (3)

which is equivalent to Σ (x_ij − x̄)(y_ij − ȳ) / √{(Σ (x_ij − x̄)²)(Σ (y_ij − ȳ)²)}. Here x_ij and y_ij are the elements in the i-th row and j-th column of matrices X and Y respectively, x̄ is the mean value of the elements of X, ȳ is the mean value of the elements of Y, and n is the total number of elements in each matrix. The correlation is defined only if both standard deviations are finite and nonzero. The correlation is 1 in the case of an increasing linear relationship, −1 in the case of a decreasing linear relationship, and some value in between in all other cases, indicating the degree of linear dependence between the matrices.

D. Three-Sigma Rule

The standard deviation (σ) of a dataset or probability distribution denotes the variation or deviation from the arithmetic mean (M) or expected value. The three-sigma rule in statistics signifies the range in which the values of a normal distribution lie. According to this rule (refer Fig. 1), 68.3% of the values in a normal distribution lie in the range [M − σ, M + σ], 95.4% in [M − 2σ, M + 2σ] and 99.7% in [M − 3σ, M + 3σ]. Hence, this empirical rule may be reliably used to compute a threshold for detecting values which represent abrupt changes. In our proposed method, the three-sigma rule has been used to detect the hard cuts at shot boundaries.

Fig. 1. Normal distribution with three standard deviations from the mean
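The per-pixel constructions of Section II.B (Eq. 1, the temporal averaging of Eq. 2, and the intensity scaling ψ) can be sketched in NumPy as follows. This is a sketch under our reading of the reconstructed equations; the vectorization, border handling and function names are ours:

```python
import numpy as np

def fuzzy_hostility(frame):
    """Fuzzy hostility index (Eq. 1) over a second-order (8-pixel)
    neighborhood. `frame` holds membership values in [0, 1]; border
    pixels are left at zero for simplicity."""
    h = np.zeros_like(frame, dtype=np.float64)
    p = frame[1:-1, 1:-1]
    acc = np.zeros_like(p)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue  # skip the candidate pixel itself
            q = frame[1 + dy:frame.shape[0] - 1 + dy,
                      1 + dx:frame.shape[1] - 1 + dx]
            acc += np.abs(p - q) / (np.abs(p + 1) + np.abs(q + 1))
    h[1:-1, 1:-1] = (3.0 / 8.0) * acc
    return h

def stfhi(prev_h, cur_h, next_h):
    # Eq. (2): average the hostility maps of the previous, current and
    # next frames; the first/last frames would use the two available maps.
    return (prev_h + cur_h + next_h) / 3.0

def intensity_scale(zeta):
    # Scaling function psi: attenuate weak responses, boost strong ones.
    return np.where(zeta >= 0.5, np.sqrt(zeta), zeta ** 2)
```

A lone bright pixel in a dark neighborhood attains the maximum hostility of 1, while a perfectly uniform region scores 0, consistent with the [0, 1] range stated above.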
E. Correlation Gradient

The correlation between consecutive image frames is computed and stored as a row matrix (CM). The greater the similarity between two image frames, the higher the correlation value. The correlation between the pre-cut frame and the post-cut frame is very low and shows an abrupt change. In order to detect the shot boundary, the gradient of the correlation values is computed. The correlation gradient is a row vector obtained by taking the difference between consecutive correlation values in CM. If there are N image frames in a video, the number of correlation values computed is N − 1 and the number of elements in the correlation gradient matrix is N − 2. The correlation gradient plot, as depicted in Fig. 2, consists of steep spikes at the points of hard cut.

Fig. 2. Sample plot of the correlation gradient

III. PROPOSED METHOD FOR SHOT BOUNDARY DETECTION

The proposed method for detecting hard cuts in a video using the spatio-temporal fuzzy hostility index and an automatic threshold is performed in six steps: (1) extraction of time sequenced image frames from the video; (2) generation of the fuzzy hostility map from the image frames; (3) edge dilation; (4) computation of correlation between consecutive image frames; (5) setting an automatic threshold using the three-sigma rule and detection of hard cuts; (6) refining the detection of hard cuts over the confusion ranges [M − 3σ, M − 2σ] and [M + 2σ, M + 3σ]. The flow chart of the entire process is shown in Fig. 3. The different steps are described in the following subsections.

Fig. 3. Flow diagram of the detection process

A. Extraction of time sequenced image frames from a video

The video under consideration is decomposed into its constituent image frames in a time sequenced manner by a standard codec corresponding to the file type, i.e. AVI, MPEG, MP4 etc. The extracted images are stored as bitmaps for further processing.

B. Generating the fuzzy hostility map from the image frames

The fuzzy hostility map computed for each gray scale image (refer Fig. 4(b)) corresponds to a 2-D matrix in which each element is the spatio-temporal fuzzy hostility index of the corresponding pixel of the original bitmap, as explained in section II B. The fuzzy hostility map is indicative of the amount of coherence or incoherence in the movement of the objects in a motion scene of a video. The fuzzy hostility map is used to generate the edges of the objects of the gray scale image. Thereafter the intensity scaling function is used to make the edges more prominent.

Fig. 4. (a) Original image frame (b) Fuzzy hostility map of the original image

C. Edge Dilation

Edge dilation is a technique used to enlarge the boundaries of the objects in a grayscale image. This may be used to compensate for camera and object movement. In our work, a 3×3 square structuring element is used to dilate the edges of the grayscale image generated from the fuzzy hostility map (Fig. 5). It is experimentally observed that edge dilation increases the correlation between similar images even where there is rapid camera or object movement.

Fig. 5. Edge dilation performed on the fuzzy hostility map

D. Computation of correlation between consecutive images

The correlation between the fuzzy hostility maps of consecutive image frames is computed and stored in a row matrix as explained in section II C. The correlation gradient is then computed from the correlation matrix as explained in section II E.

E. Setting an automatic threshold using the three-sigma rule

To detect the hard cuts present in a video we apply the three-sigma rule explained in section II D. This empirical rule is applied to compute an automatic threshold for detection of the abrupt changes in the correlation gradient found at the points of hard cuts. If the value of the correlation gradient is greater than or equal to this threshold, we detect a hard cut.

F. Refining the detection over the confusion ranges [M − 3σ, M − 2σ] and [M + 2σ, M + 3σ]

Since the three-sigma rule takes into consideration the global mean and standard deviation over all the image frames in the video, certain hard cuts may not be detected using these global values. It has been observed that hard cuts between shots having nearly the same background may not be reliably detected, because the correlation between such image frames is usually quite high. This results in hard cuts remaining undetected due to a low value of the correlation gradient. Hence, for reliable detection of such hard cuts, we take into consideration a confusion range comprising the bands [M − 3σ, M − 2σ] and [M + 2σ, M + 3σ]. The values lying in this range are tested again, taking into consideration a span bounded by two hard cuts detected in the previous step before and after the point under consideration. The new mean (M_new) and standard deviation (σ_new) of the values lying in this span are recomputed, and the point under consideration is declared a hard cut if it lies outside the range [M_new − 3σ_new, M_new + 3σ_new]. If the point under consideration is towards the start or end of the video, we may not get two detected hard cuts before or after it; in such cases we take an appropriate number of detected hard cuts (four in total) before or after the point under consideration. It has been verified experimentally that when such a span is considered, the hard cuts falling in it can be reliably detected. The total number of hard cuts in the video is the sum of those obtained in subsections E and F of this section.

IV. EXPERIMENTAL RESULTS AND ANALYSIS

The proposed method for shot boundary detection was applied on a video data set consisting of ten videos with varied features (Table I and Table II). All the videos considered here have a resolution of 640×360 at 25 fps in MP4 format. The performance of the proposed method as compared to the existing methods is evaluated using two parameters, recall (R) and precision (P), defined as follows:

    Recall = (B_d − B_f) / B_t
    Precision = (B_d − B_f) / B_d

where B_d is the number of shot boundaries detected by the algorithm, B_f the number of false shot boundaries detected, and B_t the actual number of shot boundaries present in the video.

A. The Video Dataset

To establish the effectiveness of the proposed method, the test data was divided into two sets. The videos in the first set (Table I) are all of short length, while the videos in the second set (Table II) are of longer duration. The first video (labeled V1 in Table I), "Wimbledon Semifinal Highlights 2013", is taken from the semifinal match between Djokovic and Del Potro. It was specially chosen because of rapid object movement and the small duration of its shots. In contrast, the second video (labeled V2 in Table I) is a movie song, "Dagabaaz", from the Hindi film "Dabangg 2", which consists of shots taken mostly outdoors in daylight. The third video (labeled V3 in Table I) is another movie song, "Chammak Challo", from the Hindi film "Ra.One". It consists of small duration shots taken indoors as well as some digitally generated frames interlaced with real life shots; its average number of frames per shot is the least among all videos of the dataset. The fourth video (labeled V4 in Table I) is based on a violin track by Lindsey Stirling, characterized by simultaneous movement of the performer and the camera; it has rapid zoom-in and zoom-out sequences taken outdoors against a backdrop of mountains and trees. The fifth video (labeled V5 in Table I) is "Waka Waka (This Time for Africa)", the official song of the 2010 FIFA World Cup. It consists of match sequences from FIFA World Cup history intermixed with the song performance against varied backgrounds and illumination, which was the major motivation for including it in our dataset. The videos V6 to V9 (listed in Table II) are four documentaries from different TV channels, while video V10 is a cricket highlights video taken from the Cricket World Cup 2011 final match between India and Sri Lanka.

Fig. 6. (a) and (b): adjacent frames at a shot boundary. (c) Abrupt change in the correlation gradient at the points of hard cut. Frames are from "Waka Waka (This Time for Africa)", 2010.
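The frame-level pipeline (the correlation of Section II.C, the correlation gradient of Section II.E, and the automatic three-sigma threshold of Section III.E) can be sketched as follows. This is a simplified sketch: np.corrcoef stands in for Eq. (3), we flag the steep negative gradient spike at the correlation drop (one plausible realization of the paper's thresholding), and the function names are ours:

```python
import numpy as np

def correlation_series(maps):
    """Pearson correlation (Eq. 3) between consecutive hostility maps.
    `maps` is a list of equal-sized 2-D arrays; returns N-1 values."""
    return np.array([np.corrcoef(a.ravel(), b.ravel())[0, 1]
                     for a, b in zip(maps[:-1], maps[1:])])

def detect_hard_cuts(corr):
    """Phase I: a hard cut produces an abrupt drop in correlation, i.e. a
    gradient value more than three standard deviations below the mean.
    Returns the indices of the anomalous correlation values."""
    grad = np.diff(corr)               # N-2 gradient values
    m, s = grad.mean(), grad.std()
    return np.where(grad < m - 3 * s)[0] + 1
```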
B. Experimental Results

The proposed hard cut detection algorithm works in two phases. In the first phase, hard cuts are detected using a global threshold calculated with the three-sigma rule as explained in section III E. In some cases a few hard cuts remain undetected because their correlation gradient lies below this global threshold. Experiments on several videos, including those in the test video dataset, show that the undetected hard cuts lie in the bands [M − 3σ, M − 2σ] and [M + 2σ, M + 3σ], referred to as the confusion band. In the second phase, a local threshold is determined over a span of detected hard cuts in the vicinity of each undetected hard cut, following the methodology explained in section III F. The usefulness of this two-phase detection is visible from the experiments performed on the test video set, summarized in Table III.
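A minimal sketch of the Phase II confusion-band refinement, under our reading of Section III.F; the span bookkeeping and the exclusion of the bounding cut spikes from the local statistics are our simplifications, not details stated in the paper:

```python
import numpy as np

def refine_in_confusion_band(grad, cuts):
    """Phase II: re-test gradient values lying in the confusion bands
    [M-3s, M-2s] and [M+2s, M+3s] against local statistics computed over
    a span bounded by cuts already found in Phase I."""
    grad = np.asarray(grad, dtype=float)
    m, s = grad.mean(), grad.std()
    cuts = sorted(cuts)
    extra = []
    for i in range(len(grad)):
        if not (2 * s <= abs(grad[i] - m) < 3 * s):
            continue                              # outside the confusion band
        prev = [c for c in cuts if c < i][-2:]    # up to two cuts before
        nxt = [c for c in cuts if c > i][:2]      # up to two cuts after
        lo = prev[0] if prev else 0
        hi = nxt[-1] if nxt else len(grad) - 1
        local = grad[lo + 1:hi]                   # span between detected cuts
        lm, ls = local.mean(), local.std()
        # Declare a hard cut if the point lies outside [lm-3ls, lm+3ls].
        if abs(grad[i] - lm) > 3 * ls:
            extra.append(i)
    return extra
```

A moderate spike that global statistics miss (because strong cuts inflate the global standard deviation) stands out against the flatter local span and is recovered here.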
C. Comparison with other existing methods

The proposed method for shot boundary detection was compared with several existing methods: Mutual Information (MI) [10], Color Histogram Differences (CHD) [15] and Edge Change Ratio (ECR) [20]. The problem of automatic computation of a threshold has been addressed in the literature [3, 4]; a strength of the proposed method is that the threshold is computed automatically, without manual intervention, unlike the other existing methods. The comparative results are shown in Table IV. It is to be noted that the other methods may take several parameter values; for the comparison we used the parameter settings reported in the literature as yielding the best results for each method.

TABLE I. TEST VIDEO DATASET-I

Video                             | V1     | V2    | V3    | V4    | V5
Duration (mm:ss)                  | 02:58  | 02:42 | 04:10 | 03:27 | 03:31
No. of frames                     | 4468   | 4057  | 6265  | 4965  | 5053
No. of hard cuts                  | 43     | 70    | 172   | 77    | 138
Average no. of frames per shot    | 101.54 | 57.14 | 36.21 | 63.65 | 36.35

TABLE II. TEST VIDEO DATASET-II

Video                             | V6     | V7     | V8     | V9    | V10
Duration (mm:ss)                  | 51:20  | 28:40  | 58:06  | 59:29 | 111:19
No. of frames                     | 74020  | 43018  | 87150  | 89225 | 166990
No. of hard cuts                  | 941    | 406    | 807    | 1271  | 2807
Average no. of frames per shot    | 78.57  | 105.69 | 107.85 | 70.14 | 59.46

V. CONCLUSIONS AND REMARKS

The proposed method for hard cut detection in videos was tested on a diverse set of videos. It is seen to outperform the existing methods in terms of both the number of hits and precision. The problem of setting an automatic threshold without any human interference has been addressed, and the results are very encouraging. The number of false hits is almost negligible compared to the other existing methods.

Dissolve detection has not been addressed in this work and remains an underexplored area of research, the performance of existing dissolve detectors being less than satisfactory. A complete shot boundary detection algorithm would encompass detection of all three important types of edits, i.e. hard cuts, fades and dissolves.

ACKNOWLEDGMENT

We acknowledge the contributions of Surajit Dutta, M.Tech student at RCC Institute of Information Technology, Kolkata, during the experimentation phase of our proposed method.

REFERENCES

[1] A. Mittal, "An Overview of Multimedia Content-Based Retrieval Strategies," Informatica, vol. 30, pp. 347-356, 2006.
[2] W. Zheng, J. Li, Z. Si, F. Lin, and B. Zhang, "Using High-Level Semantic Features in Video Retrieval," in Image and Video Retrieval, 5th International Conference (CIVR 2006), vol. 4071/2006, pp. 370-379, June 2006.
[3] L. Ranathunga, R. Zainuddin, and N. A. Abdullah, "Conventional Video Shot Segmentation to Semantic Shot Segmentation," 6th IEEE International Conference on Industrial and Information Systems (ICIIS), pp. 186-191, August 2011.
[4] A. Hanjalic, "Shot-Boundary Detection: Unraveled and Resolved," IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, pp. 90-105, February 2002.
[5] A. M. Hattarge, P. A. Bandgar, and V. M. Patil, "A Survey on Shot Boundary Detection Algorithms and Techniques," International Journal of Emerging Technology and Advanced Engineering, vol. 3, no. 2, February 2013.
[6] B. Chakraborty, S. Bhattacharyya, and S. Chakraborty, "A Comparative Study of Unsupervised Video Shot Boundary Detection Techniques Using Probabilistic Fuzzy Entropy Measures," in Handbook of Research on Computational Intelligence for Engineering, Science, and Business, 2013. DOI: 10.4018/978-1-4666-2518-1.ch009.
[7] J. S. Boreczky and L. A. Rowe, "Comparison of video shot boundary detection techniques," Journal of Electronic Imaging, pp. 122-128, March 1996.
[8] S. D. Bendale and B. J. Talati, "Analysis of Popular Video Shot Boundary Detection Techniques in Uncompressed Domain," International Journal of Computer Applications (0975-8887), vol. 60, no. 3, December 2012.
[9] U. Gargi, R. Kasturi, and S. H. Strayer, "Performance Characterization of Video-Shot-Change Detection Methods," IEEE Transactions on Circuits and Systems for Video Technology, vol. 10, no. 1, February 2000.
[10] A. Kumthekar and J. K. Patil, "Comparative Analysis of Video Summarization Methods," International Journal of Engineering Sciences & Research Technology, ISSN 2277-9655, 2(1), pp. 15-18, January 2013.
[11] R. Lienhart, S. Pfeiffer, and W. Effelsberg, "Scene determination based on video and audio features," in Proceedings of the IEEE International Conference on Multimedia Computing and Systems, vol. 1, pp. 685-690, 1999.
[12] M. Cooper and J. Foote, "Scene Boundary Detection Via Video Self-Similarity Analysis," International Conference on Image Processing, vol. 3, pp. 378-381, October 2001.
[13] M. Cooper, J. Foote, J. Adcock, and S. Casi, "Shot boundary detection via similarity analysis," in Proceedings of the TRECVID 2003 Workshop, Gaithersburg, Maryland, USA, pp. 79-84, 2003.
[14] A. A. Goshtasby, "Image Registration," Advances in Computer Vision and Pattern Recognition, Springer-Verlag London Limited, 2012. DOI: 10.1007/978-1-4471-2458-0_2.
[15] V. H. Thakore, "Video Shot Cut Boundary Detection using Histogram," International Journal of Engineering Sciences & Research Technology, ISSN 2277-9655, 2(4), pp. 872-875, April 2013.
[16] J. Zhou and X.-P. Zhang, "Video Shot Boundary Detection Using Independent Component Analysis," IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), vol. 2, pp. 541-544, March 2005.
[17] J. Baber, N. Afzulpurkar, M. N. Dailey, and M. Bakhtyar, "Shot Boundary Detection From Video Using Entropy And Local Descriptor," 17th International Conference on Digital Signal Processing (DSP), pp. 1-6, July 2011.
[18] Z. Cernekova, C. Nikou, and I. Pitas, "Shot Detection In Video Sequences Using Entropy-Based Metrics," International Conference on Image Processing, vol. 3, pp. III-421-III-424, June 2002.
[19] A. Hampapur, R. C. Jain, and T. Weymouth, "Production Model Based Digital Video Segmentation," Multimedia Tools and Applications, vol. 1, no. 1, pp. 9-46, March 1995.
[20] R. Zabih, J. Miller, and K. Mai, "A Feature-Based Algorithm for Detecting and Classifying Scene Breaks," in Proceedings of ACM Multimedia 1995, San Francisco, CA, pp. 189-200, November 1995.
[21] Z. C. Zhao and A. N. Cai, "Shot boundary detection algorithm in compressed domain based on adaboost and fuzzy theory," in Proceedings of the International Conference on Natural Computation, pp. 617-626, 2006.
[22] X. Fu and J. Zeng, "An Effective Video Shot Boundary Detection Method Based on the Local Color Features of Interest Points," Second International Symposium on Electronic Commerce and Security (ISECS 2009), vol. 2, pp. 25-28, May 2009.
[23] S. Shen and J. Cao, "Abrupt shot boundary detection algorithm based on fuzzy clustering neural network," 3rd International Conference on Computer Research and Development (ICCRD), vol. 2, pp. 246-248, March 2011.
[24] L. Krulikovská, J. Polec, and T. Hirner, "Fast Algorithm of Shot Cut Detection," World Academy of Science, Engineering and Technology, 2012.
[25] H. Zhang, A. Kankanhalli, and S. W. Smoliar, "Automatic partitioning of full-motion video," Multimedia Systems, vol. 1, no. 1, pp. 10-28, 1993.
[26] Y. Tonomura, "Video handling based on structured information for hypermedia systems," in Proceedings of the ACM International Conference on Multimedia Information Systems, pp. 333-344, 1991.
[27] G. Cámara-Chávez, F. Precioso, M. Cord, S. Phillip-Foliguet, and A. de A. Araújo, "Shot Boundary Detection by a Hierarchical Supervised Approach," 14th International Workshop on Systems, Signals and Image Processing, and 6th EURASIP Conference focused on Speech and Image Processing, Multimedia Communications and Services, pp. 197-200, June 2007.
[28] X. Ling, Y. Ouyang, L. Huan, and X. Zhang, "A Method for Fast Shot Boundary Detection Based on SVM," Congress on Image and Signal Processing (CISP '08), vol. 2, pp. 445-449, May 2008.
[29] P. Browne, A. F. Smeaton, N. Murphy, N. O'Connor, S. Marlow, and C. Berrut, "Evaluating and Combining Digital Video Shot Boundary Detection Algorithms," in IMVIP 2000 - Conference on Irish Machine Vision and Image Processing, 1999.
[30] S. Bhattacharyya, U. Maulik, and P. Dutta, "High-speed target tracking by fuzzy hostility-induced segmentation of optical flow field," Applied Soft Computing, 2009.
TABLE III. EXPERIMENTAL RESULTS ON THE TEST VIDEO DATASET

Video | Hard cuts present | Detected (Phase I) | Detected (Phase II) | Falsely detected | Recall (%) | Precision (%)
V1    | 43    | 43    | 0   | 0  | 100    | 100
V2    | 70    | 68    | 2   | 0  | 100    | 100
V3    | 172   | 160   | 11  | 1  | 98.83  | 99.41
V4    | 77    | 75    | 1   | 0  | 98.70  | 100
V5    | 138   | 129   | 8   | 0  | 99.27  | 100
V6    | 941   | 940   | 1   | 0  | 100    | 100
V7    | 406   | 406   | 0   | 0  | 100    | 100
V8    | 807   | 806   | 1   | 0  | 100    | 100
V9    | 1271  | 1269  | 2   | 0  | 100    | 100
V10   | 2807  | 2790  | 18  | 1  | 100    | 99.96
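The recall and precision definitions of Section IV can be cross-checked against Table III; for instance, the V3 row (160 + 11 = 171 detections, one false alarm, 172 actual cuts):

```python
def recall_precision(detected, false_alarms, actual):
    """Recall and precision as defined in Section IV (in percent):
    R = (Bd - Bf) / Bt, P = (Bd - Bf) / Bd."""
    true_hits = detected - false_alarms
    return 100.0 * true_hits / actual, 100.0 * true_hits / detected

# V3 row of Table III: Bd = 171, Bf = 1, Bt = 172.
r, p = recall_precision(171, 1, 172)
```

The computed values agree with the 98.83 / 99.41 entries of Table III up to rounding.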
TABLE IV. COMPARISON WITH EXISTING METHODS (R = RECALL %, P = PRECISION %)

Video | Proposed R | Proposed P | MI R  | MI P  | CHD R | CHD P | ECR R | ECR P
V1    | 100    | 100    | 88.37 | 92.68 | 81.57 | 96.87 | 90.69 | 88.63
V2    | 100    | 100    | 91.42 | 94.11 | 75.71 | 92.98 | 95.71 | 89.33
V3    | 98.83  | 99.41  | 84.88 | 94.80 | 77.90 | 94.36 | 92.44 | 88.33
V4    | 98.70  | 100    | 81.81 | 92.64 | 74.02 | 95    | 93.50 | 90
V5    | 99.27  | 100    | 84.05 | 95.08 | 80.43 | 94.87 | 94.92 | 91.60
V6    | 100    | 100    | 85.97 | 92.98 | 75.98 | 95.96 | 90.96 | 92.98
V7    | 100    | 100    | 83    | 88.91 | 78.07 | 87.93 | 92.11 | 96.05
V8    | 100    | 100    | 91.94 | 93.55 | 83.02 | 88.97 | 89.96 | 85.99
V9    | 100    | 100    | 87.96 | 95.98 | 76    | 91.03 | 95.98 | 92.99
V10   | 100    | 99.96  | 85.99 | 93.97 | 81.97 | 91.98 | 87.99 | 94.01