International Journal of Pure and Applied Mathematics
Volume 118 No. 9 2018, 571-584
ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version)
url: http://www.ijpam.eu
Special Issue
Parallelization of Video Summarization over Multi-Core Processors

Monika Jain, Rahul Saxena
[email protected], [email protected]
Dept. of Information Technology, Manipal University Jaipur
Abstract

In the digital era of the 21st century, the majority of information on the web is available in the form of multimedia files consisting of audio and video data. Videos are a highly informative repository, but at the same time they consume a lot of storage space, and information extraction becomes a tricky and time-consuming task owing to large volumes and high-quality videos. Video summarization is a technique to extract the relevant frames from a bulky video in order to quickly obtain a summary of events and analyze the information. However, traditional video summarization algorithms tend to be slow due to complex algorithmic computations and large video sizes. Thus, taking advantage of recent computer architectures that allow high parallelism, many algorithms have been proposed for the summarization of video. In this paper, we introduce a parallel version of a video summarization algorithm which reduces the processing time by using the multi-core architecture. The algorithm takes into consideration the histogram, the entropy of a color, simple motion detection and cumulative motion detection to determine image similarity, implemented over the OpenMP computing platform for multi-core processors. The experimental results show a considerable improvement in the execution time of the parallelized version of the algorithm in comparison to its serial counterpart, supported by graphical results and analysis. Finally, the paper concludes with the view that the proposed method can be further investigated, and enhanced algorithmic efficiency can be attained by using OpenMP and CUDA, i.e., multi-core and many-core computing platforms, together.
1. Introduction

Due to the enormous amount of multimedia information such as audio, video and images, a huge amount of storage is consumed in keeping this information, which adds to the processing cost of extracting useful information from it. It is important to extract only the relevant information while removing the duplicate frames of videos. A video that takes hours to watch can be seen in minutes by summarizing it [1]. A video is a synchronous sequence of a number of frames, each frame being a 2-D image; the basic unit of a video is thus a frame. The video can also be thought of as a collection of many scenes, where a scene is a collection of shots that have the same context [2]. A video summary represents an abstract view of the original video sequence, which can be used in video browsing and
retrieval systems. It can be a highlight of the original sequence, i.e., the concatenation of a user-defined number of selected video segments, or it can be a collection of key frames. For instance, consider a video input of size 3.7 MB and length 2 minutes 56 seconds, as shown in Figure 1.1 (a). Using open-source software, the video is converted into 1764 frames; a small sample of frames is shown in Figure 1.1 (b). After splitting into frames, the duplicate frames in each row are detected and deleted by calculating an appropriate similarity score; Figure 1.1 (c) shows the remaining frames after deletion. Figure 1.1 (d) shows the summarized video made from the remaining frames; the summarized video is 37 seconds long.
Figure 1.1 (a) Video input of length 2 minutes 56 seconds. (b) Video converted into 1764 frames, containing some duplicate frames. (c) Duplicate frames detected and removed. (d) Summarized video, of length 37 seconds, made from the remaining unique frames.
The paper discusses the approach to generate a video summary, as illustrated in the above example, along with state-of-the-art approaches. The paper has five sections. The first section discusses video summarization, the need for it and its importance. The second section presents state-of-the-art techniques in video and image processing, along with the video summarization algorithms used to target the problem. Further, the proposed method for summarizing the video and its parallelized
code, along with a brief description of the parallel computing architectures, is discussed. In the third section, the proposed algorithm is discussed in detail along with its pseudocode. The fourth section analyses the results obtained from the proposed algorithm in the light of graphical results, which show a significant reduction in the execution time of the algorithm while producing an informative summary of a long video. The fifth section concludes the paper with an analysis of the results and explores ideas for future extensions of the proposed methodology.

1.1.1 Need for Video Summarization

In the current era, video-based data acquisition has emerged as the most prominent source of information. In fields like surveillance and criminal identification, information based on this data repository plays a crucial role. As per [13], a number of mysteries have been resolved with the help of camera recordings and footage. But scanning through long video clips of several hours containing meaningless static clips is a waste of time. Thus, summarizing the video by removing static clips and keeping the recording squeezed to the events and happenings at the spot is more purposeful and saves both storage space and time. There are many views on how a video should be cut down in order to represent the information in the most appropriate manner as per the needs and demands of the application. In this work, we focus on executing a simple redundant-frame-elimination algorithm over multiple processors by developing a parallel code using OpenMP and OpenCV library routines. The aim is to quickly summarize a huge video clip down to several minutes while retaining the maximum information for the user.

1.1.2 Related Work

In the early 1970s, text-based image retrieval was performed, in which an image is first annotated with text and then text-based retrieval is done using DBMS architectures [14, 15]. However, the technique could not work well for large numbers of images, and manual annotation was a tedious task. In the 1990s, content-based image retrieval techniques came into existence. This image retrieval and matching approach involves a number of aspects like feature extraction and texture recognition, and many video summarization techniques have been proposed using it. The authors of [16] used perceived motion energy patterns in the video, selecting as key frames the frames at turning points of motion acceleration and deceleration. In the visual feature descriptor approach [17], three visual features, namely the color histogram, wavelet statistics and edge direction histogram, are used for the selection of key frames. Similarity measures are computed for each descriptor and combined to form a frame difference measure; Fidelity, Shot Reconstruction Degree and Compression Ratio values are used to evaluate the video summary [17]. In the motion attention model [18], shots are recognized using color distribution and edge-overlap ratio, which increases the precision of shot detection. Key frames are extracted from each shot using the motion attention model: the first and last frames of each shot are taken as key frames, and the others are extracted by applying the motion attention model [18].
These key frames are then clustered, and a priority value is computed by estimating the motion energy and color variation of the shots. In the multiple visual descriptor features approach [19], key frames are chosen by constructing the cumulative graph of the frame difference values. Frames at sharp slopes indicate significant visual change; hence they are selected and incorporated into
the final summary. The motion-focusing strategy [20] concentrates on one constant-speed motion and aligns the video frames by resolving the focused motion into a static situation. A summary is produced containing all moving objects, embedded with spatial and motion data; background subtraction and min-cut are mostly used in motion focusing. In the camera motion and object motion approach [21], the video is segmented using camera-motion-based classes: pan, zoom in, zoom out and fixed. The final key-frame selections from each of these segments are extracted based on a confidence value computed for the zoom, pan and steady segments. In this paper, we have used a combined approach drawing on the above-mentioned techniques, as they are well established and have performed well in generating video summaries. These methods reach their working limits for large video clips, as the image data set becomes too huge to handle, for which we present a multi-core-architecture-based modification to the algorithms based on the above techniques.
2. Parallel Approach to Video Summarization

In the early 1970s, the first microprocessor was developed by Intel, and each subsequent generation of processors brought developments such as smaller size and higher speed [3]. A single core uses pipelining (the second instruction starts when the first finishes) to run code in a sequential manner. But the demand of the hour is to run code faster at the same clock speed, for which the multi-core processing power of the machine comes into play. Multi-core processors [4] follow the ideology of distributing the problem space over multiple cores, where each core has thread units to work on, further parallelizing the problem. A normal program is a combination of serial code (dependent instructions, which execute serially on one core) and parallel code (independent instructions, which execute in parallel among multiple cores) [5].
Figure 2.1 Simple program having dependent and independent instructions running on single and multiple cores.

2.1. Many-core Architectures

High performance is achieved using many-core architectures. A many-core device uses GPU-based processors having the processing power of some hundreds or thousands of
independent cores [6]. Many supercomputers delivering extremely high performance have been introduced to date; the top supercomputer, installed at the National Supercomputing Center in Wuxi, China, has 10,649,600 cores [7]. The Compute Unified Device Architecture (CUDA) is used for achieving high performance on many-core architectures. However, in the reported work we have developed a multi-core solution for the algorithm; modification of the algorithm for many-core architectures is being explored as a future aspect of the problem.
2.2. Multi-core Architecture

Another way of parallelizing code is through the processors present in the system, namely the central processing unit. A normal system available in the market is dual-core (two processors) or quad-core (four processors). Each core supports multithreading, i.e., it uses threads to complete the work of a program. Most programs available today generally use one core while the remaining cores do no work. The idea of parallel processing through multiple cores is to use these remaining cores as well, to complete the code in a faster and more efficient way [7]. Open Multiprocessing (OpenMP) and the Message Passing Interface (MPI) are used for parallelizing code among the processors present in a system.

2.2.1. OpenMP

Open Multiprocessing (OpenMP) is an application programming interface based on multithreading and shared-memory parallelism [8][9]. OpenMP rests on three main components: compiler directives, OpenMP runtime routines and environment variables, and it follows the fork-join model. A simple OpenMP program is given below:

#include <omp.h>      // OpenMP header file
#include <stdio.h>

int main(void) {
    #pragma omp parallel               // compiler directive: fork a team of threads
    {
        int id = omp_get_thread_num(); // OpenMP runtime routine
        printf("hello from thread %d\n", id);
    }
    return 0;
}

export OMP_NUM_THREADS=2   // environment variable, setting the number of threads outside the program
Figure 2.2.1 Basic OpenMP program
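For completeness, the program above can be compiled and run with GCC as follows (assuming it is saved as hello.c; the file and executable names are arbitrary):

gcc -fopenmp hello.c -o hello    # enable OpenMP support at compile time
export OMP_NUM_THREADS=2         # request two threads
./hello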
3. Process of Video Summarization
Video summarization is the extraction of valuable data from long video sequences. The process is divided into three main steps:

1. Split the video into frames.
2. Parse the frames to detect repetitions.
3. Remove the repetitions, leaving only the significant frames.

1. Splitting a Video into Frames

As a video cannot be parsed directly, we have to convert the video into frames so that it can be processed. We have used an open-source tool called 'ffmpeg' to accomplish this task [10]. ffmpeg is a very fast video and audio converter that can also grab from a live audio/video source. It can convert between arbitrary sample rates and resize video on the fly with a high-quality polyphase filter. Here, ffmpeg is used to convert video files to frames.
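For illustration, a typical ffmpeg invocation for this step might look as follows; the input file name input.mp4 and the output pattern frames/frame_%04d.png are our assumptions, not values prescribed by the method:

mkdir -p frames                            # directory for the extracted frames
ffmpeg -i input.mp4 frames/frame_%04d.png  # one numbered image per frame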
2. Parsing the Frames to Detect Repetitions

In the first step, the video is converted into frames using the ffmpeg open-source software. The next step is to parse the frames so that duplicate frames are detected. The steps to parse the frames are as follows (a sketch of this loop is given after the list):

● The video is divided into 'shots' (gaps of 10 frames are taken).
● The current frame is compared to the next, the one after, and so on with every image after it.
● Some pre-calculated tables and heuristic comparisons are used to detect the similarity of frames.
● If frames are similar, they are marked as such.
● The first frame of the next shot is chosen and the process repeats.
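A minimal sketch of this shot-level loop follows; the predicate similar() stands in for the combined heuristics of the next subsection, and all names are illustrative rather than the paper's actual code:

#include <vector>

const int SHOT_GAP = 10;  // frames per shot, as in the text above

// Placeholder for the combined similarity heuristics (histogram, entropy,
// motion detection) described in the next subsection -- illustrative only.
bool similar(int frameA, int frameB) { return false; }

// Marks frames judged similar to the first frame of their shot as excluded.
void markDuplicates(int numFrames, std::vector<bool>& included) {
    for (int shot = 0; shot < numFrames; shot += SHOT_GAP) {
        // compare the shot's first frame with every later frame in the shot
        for (int j = shot + 1; j < shot + SHOT_GAP && j < numFrames; ++j) {
            if (similar(shot, j))
                included[j] = false;   // duplicate: drop from the summary
        }
    }
}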
3. Similarity Detection

After completing the two steps above, the next step is to remove the duplicate frames. A number of heuristics are used to determine image similarity, namely:

a. A histogram
b. The entropy of a color
c. Simple motion detection
d. Cumulative motion detection
a. Histogram

● The implemented histogram counts the frequency of each color in the image.
● In this paper, the histogram is used to populate the entropy vector.
● It would also be useful for weighing in the decision on image similarity by comparing histograms directly. This feature was not implemented in the project due to a lack of parameter-tuning resources and thus the inability to find a good value for its weight.
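As an illustration, a minimal sketch of such a per-image grayscale histogram is given below, assuming an OpenCV cv::Mat image in the default 8-bit BGR layout; the function name grayscaleHistogram is ours:

#include <opencv2/opencv.hpp>
#include <array>

// Counts the frequency of each grayscale level, where the grayscale value of
// a pixel is taken as the average of its B, G and R channels (as in the paper).
std::array<long, 256> grayscaleHistogram(const cv::Mat& img) {
    std::array<long, 256> hist{};              // 256 bins, zero-initialized
    for (int i = 0; i < img.rows; ++i)
        for (int j = 0; j < img.cols; ++j) {
            cv::Vec3b px = img.at<cv::Vec3b>(i, j);
            ++hist[(px[0] + px[1] + px[2]) / 3];   // average of the channels
        }
    return hist;
}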
b. Entropy

● The entropy of a color is defined here as a function of its probability of occurrence upon choosing a random pixel [11].
● The probability is trivial to obtain from the histogram and the total pixel count.
● The function used was E(p) = p * log2(p), where p is the probability of the color.
Figure 3. Plot of the basic entropy function.

c. Simple Motion Detection

The implementation of this concept has proven effective for our test cases. The algorithm is as follows:

● We simply take the 'difference' of the images.
● For each pixel, we check whether the difference pixel is above a threshold value.
● If it is, we increment a counter maintaining the number of significantly different pixels.
● If the number of significantly different pixels exceeds a threshold, this weighs into the similarity criterion.
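A minimal sketch of this check, assuming two same-sized single-channel (grayscale) cv::Mat frames; the threshold values PIXEL_THRESH and COUNT_THRESH are illustrative placeholders, since the text does not fix them:

#include <opencv2/opencv.hpp>

const int PIXEL_THRESH = 30;    // per-pixel difference threshold (assumed value)
const int COUNT_THRESH = 1000;  // significant-pixel count threshold (assumed value)

// Returns true if the two frames differ significantly.
bool motionDetected(const cv::Mat& a, const cv::Mat& b) {
    cv::Mat diff;
    cv::absdiff(a, b, diff);                  // per-pixel absolute difference
    int significant = 0;
    for (int i = 0; i < diff.rows; ++i)
        for (int j = 0; j < diff.cols; ++j)
            if (diff.at<uchar>(i, j) > PIXEL_THRESH)
                ++significant;                // count significantly different pixels
    return significant > COUNT_THRESH;
}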
d. Cumulative Motion Detection

Cumulative motion detection is an experimental feature aimed at video summarization quality rather than parallel processing [12].

● When little motion has occurred between frames, i.e., multiple frames in a row have been detected as similar, the amount of motion detected is accumulated in a counter and added to each successive motion calculation.
● This allows very slow, gradual changes to feature in the summary rather than simply being marked too similar to some earlier frame.
● It does hinder the potential maximum speedup due to its serial nature.
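The accumulation idea can be sketched as follows; the variable and function names are ours, and the integer motion metric is assumed to be the significant-pixel count from the previous heuristic:

// Carries motion "left over" from frames judged similar, so that slow,
// gradual change eventually crosses the threshold. Note the serial
// dependence on the accumulator, which limits the achievable speedup.
int accumulated = 0;                          // persists across comparisons

bool similarWithAccumulation(int motionMetric, int threshold) {
    int total = motionMetric + accumulated;
    if (total < threshold) {                  // still similar: keep accumulating
        accumulated = total;
        return true;
    }
    accumulated = 0;                          // enough change: reset, keep frame
    return false;
}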
4. Pseudocode

For the parallelization of video summarization, the pseudocode is given below:

1) Represent each frame as an entry in a vector, and set each frame to be included in the summary:
        for each i: included[i] = true;

2) In parallel, prepare the per-frame data:
        #pragma omp parallel for
        for each index:
            initialize the entropy vector entry to zero
            read the image from the string path using the cv::imread function
3) For each frame:
   3.1) Populate the histogram:
        for each pixel (i, j):
            calculate the grayscale value as the average of the RGB values
            ++hist[grayscale_val(img, i, j)];
   3.2) Populate the entropy vector:
        for i = 0 to 255:
            double p = hist[i] / total;   // probability of grayscale level i
            if (p > 0.0) entropy_vec[index] += p * std::log2(p);   // entropy contribution
4) For each shot (with gaps of 10 frames), for each shot and shot
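Pulling the steps above together, a consolidated sketch of steps 2-3 is given below: an OpenMP parallel loop that reads each frame and fills its histogram and entropy entry. The frame count, file-name pattern and variable names are our assumptions for illustration:

#include <opencv2/opencv.hpp>
#include <omp.h>
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    const int numFrames = 1764;                  // e.g. the sample video of Figure 1.1
    std::vector<double> entropy(numFrames, 0.0); // one entropy value per frame

    #pragma omp parallel for
    for (int f = 0; f < numFrames; ++f) {
        char path[64];                           // naming follows the earlier ffmpeg example
        std::snprintf(path, sizeof(path), "frames/frame_%04d.png", f + 1);
        cv::Mat img = cv::imread(path);          // step 2: read the frame
        if (img.empty()) continue;

        long hist[256] = {0};                    // step 3.1: populate the histogram
        for (int i = 0; i < img.rows; ++i)
            for (int j = 0; j < img.cols; ++j) {
                cv::Vec3b px = img.at<cv::Vec3b>(i, j);
                ++hist[(px[0] + px[1] + px[2]) / 3];
            }

        const double total = (double)img.rows * img.cols;
        for (int i = 0; i < 256; ++i) {          // step 3.2: populate the entropy vector
            double p = hist[i] / total;          // probability of grayscale level i
            if (p > 0.0) entropy[f] += p * std::log2(p);
        }
    }
    return 0;
}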