10th International Conference on Sciences and Techniques of Automatic control & computer engineering, December 20-22, 2009, Hammamet, Tunisia
Hardware Acceleration of video cut detection Algorithm on FPGA
Lamjed Touil1, Abdessalem Ben Abdelali1,2, Mohamed Nidhal Krifa1, Abdellatif Mtibaa1, Elbey Bourennane3
1Laboratory EµE, Faculty of Sciences of Monastir, Monastir, Tunisia
[email protected], [email protected], [email protected]
2High Institute of Informatics and Mathematics of Monastir, Monastir, Tunisia
[email protected]
3Laboratory LE2I, University of Burgundy, France
[email protected]
Abstract. With the rapid rise of interest in audio-visual (AV) material analysis, there is an important need for tools that can efficiently describe AV content. Generally, the hierarchical structure of a video must be determined before any content-based manipulation. This process is based on shot boundary detection, also known as scene change detection. To enable online video indexing, this paper uses a method based on local histograms to implement a shot cut detector for real-time applications. A hardware implementation of the technique on the XUP Virtex-II Pro FPGA-based board is proposed. The cut detection algorithm was tested on standard video processing benchmarks and the significance of the results is presented.
Keywords. FPGA, Shot boundary detection, Cut, Video decoder
STA'2009-IRM-721, pages 1591-1601, Academic Publication Center of Tunis, Tunisia
IJ_STA, Vol. X, N° X, pages 1 à X
1. Introduction
Video data is now central to many application domains, such as interactive TV, video-on-demand and multimedia processing tools. Furthermore, the computer vision and image processing communities focus largely on the analysis and processing of digital multimedia content. Partitioning a video sequence into shots is considered the first step toward video-content analysis and content-based video browsing and retrieval. A video shot is defined as a series of interrelated consecutive frames taken contiguously by a single camera and representing a continuous action in time and space. Before any content-based manipulation of video information can be developed, the hierarchical structure of the video must be determined. To this end, a standard hierarchical video model has been defined, as shown in Fig. 1.
[Fig. 1 levels, from top to bottom: Video, Scenes, Shots, Frames]
Fig. 1. Standard hierarchical video model
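The hierarchy of Fig. 1 can be pictured as nested data types. The following Python sketch is only an illustration: the class names, the choice to store a shot as an inclusive frame-index range, and the `shot_boundaries` helper are assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Shot:
    """Consecutive frames from a single camera, stored as a frame-index range."""
    first_frame: int
    last_frame: int  # inclusive (assumed convention)

    def frame_count(self) -> int:
        return self.last_frame - self.first_frame + 1


@dataclass
class Scene:
    """A group of related shots."""
    shots: List[Shot] = field(default_factory=list)


@dataclass
class Video:
    """Top level of the hierarchy: a video is a list of scenes."""
    scenes: List[Scene] = field(default_factory=list)

    def shot_boundaries(self) -> List[int]:
        """Index of the first frame of every shot except the very first one."""
        starts = [shot.first_frame
                  for scene in self.scenes
                  for shot in scene.shots]
        return starts[1:]
```

With this layout, shot boundary detection amounts to recovering the list returned by `shot_boundaries` from the raw frame stream, after which shots can be grouped into scenes.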
In a video sequence, several types of transition effect can occur between shots (Figure 2). These transitions may be classified into the following categories [1]:
• Cut: A hard boundary, i.e. an abrupt change from one shot to the next over a span of two consecutive frames. It is mainly used in live transmissions.
• Fade: Two kinds of fade are used: the fade-in and the fade-out. A fade-out occurs when the image fades to a black screen or a dot. A fade-in occurs when the image emerges from a black image. Both effects last only a few frames.
• Dissolve: A simultaneous fade-out and fade-in. The images of the first shot get dimmer and those of the second shot get brighter until the second replaces the first.
• Wipe: A virtual line moves across the screen, clearing the old scene and revealing the new one. It also extends over several frames.
In this work, we are interested only in the detection of video cuts, since the cut is the most widely used video transition. Figure 2 presents an example of a cut transition.
[Fig. 2 panels: Shot t, Shot t+1]
Fig. 2. Video cut
Shot boundary detection is highly important for the indexing and retrieval process: shots are considered the primitives for higher-level content analysis, indexing and classification. In recent years, several detection methods have appeared in the literature [2, 3, 4, 5, 6, 7, 8]. Figure 3 shows an example of a video retrieval architecture [9], in which the shot boundary detection block is the first and fundamental step.
[Fig. 3 block labels: Videos; Scenes Bound Detection; Shots; Cut detection; Clustering; Scene cut detection; Labeling; Scene description; Scene Units; Images; Grouping Information; Shot representation; Synthetical representation frame (Mosaicing image); Representation Frame; Akquitool; Still image analysis; Camera description; Data Base; Query; Color; Texture; Contour; Concepts; Domain description; Object recognition; Object Description]
Fig. 3. System architecture [9]
In this work, we present a hardware implementation of a shot cut detector based on the local histogram technique. A prototype system based on the Xilinx Virtex-II Pro FPGA was developed. The proposed architecture integrates embedded processors, embedded memory, memory control modules, interface modules, Digital Clock Managers (DCM) and other resources. The rest of this paper is organized as follows: Section 2 describes the cut detection approach. Section 3 presents the software validation of the applied technique. Section 4 presents the proposed architecture and details the hardware implementation issues. Finally, Section 5 concludes and briefly discusses future research directions.
2. Block Histogram for cut detection
Cut detection is based on analysing the similarity of the frames surrounding the cut, which should exhibit significant changes in their visual content. It consists of several computation steps. First, feature extraction is performed, where the features depict various aspects of the visual content of the video frames. Then, a distance metric is used to quantify the feature variation between neighbouring frames. The discontinuity value is typically the magnitude of this variation and serves as input to the detector. Finally, cuts are detected by comparing these values to a threshold to identify when a significant visual discontinuity has occurred.
In this paper we use the local grey-scale histograms proposed by Nagasaka and Tanaka [10] to detect video cuts. This technique consists in dividing each frame into square blocks and evaluating local histograms before calculating a difference metric between consecutive frames. The principle of the applied technique for cut detection is shown in Figure 4.
[Fig. 4 labels: Frame t (grey-scale level); Frame t+1 (grey-scale level); Division of the image into 16 blocks; Histograms calculation]
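The steps described in this section (local histograms over 16 blocks, a discontinuity value between consecutive frames, and a threshold test) can be sketched in software as follows. This is a minimal Python illustration and not the hardware design of the paper: the function names, the 4 x 4 arrangement of the 16 blocks, the sum of absolute bin differences used as the difference metric, and the threshold value are all assumptions made for the example.

```python
from typing import List, Sequence

Frame = List[List[int]]  # rows of 8-bit grey levels (0..255)


def block_histograms(frame: Frame, grid: int = 4, bins: int = 256) -> List[List[int]]:
    """Split the frame into grid x grid blocks and build one histogram per block."""
    height, width = len(frame), len(frame[0])
    hists = [[0] * bins for _ in range(grid * grid)]
    for y, row in enumerate(frame):
        by = y * grid // height          # block row of this pixel
        for x, level in enumerate(row):
            bx = x * grid // width       # block column of this pixel
            hists[by * grid + bx][level * bins // 256] += 1
    return hists


def discontinuity(prev: Frame, curr: Frame, grid: int = 4, bins: int = 256) -> int:
    """Assumed metric: sum of absolute bin-wise differences over all blocks."""
    h_prev = block_histograms(prev, grid, bins)
    h_curr = block_histograms(curr, grid, bins)
    return sum(abs(a - b)
               for hp, hc in zip(h_prev, h_curr)
               for a, b in zip(hp, hc))


def detect_cuts(frames: Sequence[Frame], threshold: int) -> List[int]:
    """Return every index t where a cut is declared between frame t-1 and frame t."""
    return [t for t in range(1, len(frames))
            if discontinuity(frames[t - 1], frames[t]) > threshold]
```

For example, with two uniform 8 x 8 test frames `dark = [[10] * 8 for _ in range(8)]` and `bright = [[200] * 8 for _ in range(8)]`, the call `detect_cuts([dark, dark, bright, bright], threshold=64)` reports a single cut at index 2. In practice the threshold would have to be tuned to the frame size and to the metric actually used.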
i