MDC: A Software Tool for Developing MPEG Applications

1. Joan L. Mitchell, William B. Pennebaker, Chad E. Fogg, and Didier J. LeGall, MPEG Video Compression Standard, International Thomson Publishing, 1997.
Dongge Li and Ishwar K. Sethi
Vision and Neural Networks Laboratory
Department of Computer Science, Wayne State University
Detroit, Michigan 48202, USA
{dil, sethi}@cs.wayne.edu

Abstract

This paper presents a modularization method for the design of MPEG decoding software. Compared with the traditional MPEG decoder architecture, the architecture obtained from this method improves the flexibility and reusability of MPEG decoding software without reducing decoding speed. Using this design approach, the paper presents the MPEG Developing Classes (MDC), a software tool for developing MPEG video applications. Feedback from users indicates that MDC is flexible to use and easy to understand. It allows users to develop their own decoders without going into the details of MPEG.

1. Introduction

The Moving Picture Experts Group (MPEG) standard[1] is perhaps the best example of how standardization can spur industrial growth. Since its inception in 1988, the MPEG standard has not only drawn the attention of academic and industrial researchers but has also been a catalyst for the convergence of technologies from the entertainment and computer industries. After many years of effort, several software MPEG decoders[2-5] are now available, both commercially and in the public domain.

The main advantage of a software MPEG decoder over a hardware decoder[6] is its flexibility. Because the specifications and requirements of MPEG decoders vary across applications, changes in functionality and algorithms are usually unavoidable. It is commonly accepted that software decoders are easier to reuse and modify than hardware decoders. In practice, however, this advantage is obscured by the complexity of the MPEG standard and its algorithms: it is still difficult to modify a software decoder for a different purpose. Although several companies offer client-server-based MPEG SDK tools[5], these are essentially software-driven MPEG players and provide only limited functions and flexibility. An MPEG library thus seems to be a more promising solution. However, current MPEG libraries[7-9] are too crude to accommodate varied application requirements; most of them are simply wrappers written around existing MPEG decoders. The absence of effective MPEG development tools has hindered research on MPEG applications.

The goal of this paper is to correct this situation by describing the MPEG Developing Classes (MDC), which provide researchers and developers with fundamental classes for MPEG software development. Our software design approach aims to improve the flexibility and reusability of MPEG software decoders. In this paper, MPEG refers only to MPEG-1 and MPEG-2. Furthermore, we concentrate on MPEG video; MPEG audio is currently outside the scope of our work.

The remainder of the paper is organized as follows. Section 2 introduces the structural hierarchy of MPEG video and discusses the drawbacks of the traditional architecture of MPEG software decoders. The design method and the architecture of MDC are discussed in Section 3 and Section 4, respectively. In Section 5, the performance of MDC is evaluated. Finally, Section 6 summarizes the work and outlines future directions.

2. Structural hierarchy of MPEG video

MPEG video uses a hierarchical layered syntax to help with error handling, random search, and synchronization. As shown in Figure 1, the hierarchical syntax of MPEG video consists of six layers: sequence, group of pictures (GOP), picture, slice, macroblock, and block. Data of higher layers consist of data from lower layers.
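As a minimal sketch of this containment relation, the six layers can be written as an ordered C++ enumeration. The type and function names below (Layer, layerName, contains, parentOf) are illustrative assumptions and are not part of MDC.

```cpp
#include <cassert>
#include <string>

// The six syntax layers of MPEG video, ordered from highest to lowest.
// Illustrative names only; they are not taken from MDC.
enum class Layer { Sequence, Gop, Picture, Slice, Macroblock, Block };

// Human-readable name of a layer.
inline std::string layerName(Layer layer) {
    static const char* const kNames[] = {"sequence", "group of pictures",
                                         "picture",  "slice",
                                         "macroblock", "block"};
    return kNames[static_cast<int>(layer)];
}

// True when `outer` lies strictly above `inner`, so that data of `outer`
// consist of data from `inner`.
inline bool contains(Layer outer, Layer inner) {
    return static_cast<int>(outer) < static_cast<int>(inner);
}

// The layer directly above `layer`. The sequence layer has no parent.
inline Layer parentOf(Layer layer) {
    assert(layer != Layer::Sequence);
    return static_cast<Layer>(static_cast<int>(layer) - 1);
}
```

Ordering the enumerators from highest to lowest layer lets the containment test reduce to an integer comparison.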

This research is supported in part by the National Science Foundation under grant EIA-97-29818.

[Figure 1: The layered structure of MPEG video (sequence, GOP, picture, slice, macroblock, and 8x8 luminance and chrominance blocks)]

The top level of an MPEG video bitstream, known as the sequence layer, is completely self-contained and consists of one or more groups of pictures. A group of pictures is composed of one or more encoded pictures. The GOP header is mandatory in MPEG-1 and optional in MPEG-2. The picture layer contains the coded information for each picture in the group. Each picture has a picture header followed by one or more slices. In turn, each slice is made up of a slice header and a contiguous sequence of raster-ordered macroblocks. The first slice always starts at the top left of the picture, and the last slice always ends at the bottom right corner. A macroblock is the basic building block of an MPEG picture. It usually consists of four 8x8 DCT blocks of luminance samples and two 8x8 DCT blocks of chrominance samples. The block layer is the lowest layer of an MPEG video sequence.

As Figure 1 indicates, data at higher layers consist of data from lower layers; however, information encoded in higher layers is usually required to decode information at lower layers. The layers are therefore tightly coupled with each other.

Most MPEG software decoders use the MPEG hierarchy in a straightforward manner to encapsulate different modules. In fact, the MPEG documentation outlines such decoders through pseudo-code. While such decoders are straightforward to implement, they lack flexibility and reusability. Changes in one layer can easily propagate to its adjacent layers, and developers often have to go through the entire software to make even a small change in the decoder. Another problem with the straightforward implementation is the growing complexity of the decoder when many functions are desired; again, this is caused by tight coupling among layers. Modularization based directly on MPEG layers can hardly encapsulate the data well. Thus, implementing a comprehensive MPEG decoding library using the traditional straightforward architecture is difficult.

3. Design method of MDC

MDC is composed of several fundamental classes for parsing and accessing MPEG bitstreams. It provides researchers and developers with dozens of facilities for software decoder development. MDC by itself is not a standalone executable MPEG decoder; it is essentially an MPEG development tool. To satisfy the requirements of different applications, such a tool needs maximum flexibility and functionality. The main features of MDC include:

• Compatibility with multiple operating systems, such as UNIX, DOS, and Windows.
• Support for both the video sequence layer and the MPEG system layer (HP@HL). For MPEG system streams, facilities for multi-channel management are also provided.
• Support for data partitioning, signal-to-noise ratio (SNR) scalability, and spatial scalability.
• Facilities for both sequential and random access. More than a dozen functions are provided for different kinds of MPEG accessing operations.
• Compressed-domain feature extraction[10-12]. Several features, such as motion vectors and DCT components, can be extracted directly from the compressed domain by calling functions provided by MDC.
• Multiple output formats. Five image formats, BMP, SIF, TGA, PPM, and RAW, are supported for decoded pictures.
• Good reusability of source code.
• Flexible usage. Most functions of MDC can be called in an arbitrary order with little extra bookkeeping.

The operations performed in MPEG syntax layers can be sorted into two basic categories: accessing operations and decoding operations. Accessing operations locate the position of the desired data. Decoding operations decode the bitstream data in the layer. For example, locating a specific picture in an MPEG bitstream is an accessing operation in the picture layer, while decoding the located picture is a decoding operation in the same layer. Because accessing operations in a layer are always performed before its decoding operations, and the information encoded in the layer header is required for accessing operations, operations for parsing the headers of syntax layers are classified as accessing operations.

For all six MPEG layers, the operations performed are accessing operations, decoding operations, or combinations of the two. Each MPEG layer can thus be divided into two basic operation modules, an accessing module and a decoding module, which consist of all accessing operations and all decoding operations, respectively. By definition, the decoding modules of higher layers contain both the accessing modules and the decoding modules of lower layers.

Within the same MPEG layer, decoding operations usually require information obtained from accessing operations; however, accessing operations can be performed correctly without decoding operations. In other words, there is a unidirectional dependency of the decoding module on the accessing module in each syntax layer. Because information decoded in higher layers is required by operations in lower layers, both the accessing modules and the decoding modules in lower layers depend on their adjacent higher layers. The decoding modules in higher layers also depend on the accessing modules in their adjacent lower layers. For example, the coding type of the current frame, which is parsed by accessing operations in the picture layer, can affect the decoding routines in the GOP layer.
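
The split between accessing and decoding operations within one layer can be sketched as follows. This is a toy model under stated assumptions: ToyPicture and ToyPictureLayer are hypothetical names, the "bitstream" is an in-memory vector, and "decoding" merely sums a payload. It is not the MDC interface.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for one coded picture: header field plus payload.
struct ToyPicture {
    char codingType;           // 'I', 'P', or 'B', carried in the picture header
    std::vector<int> payload;  // stands in for the coded slice data
};

// Toy picture-layer module, split into accessing and decoding operations.
class ToyPictureLayer {
public:
    explicit ToyPictureLayer(std::vector<ToyPicture> pictures)
        : pictures_(std::move(pictures)) {}

    // Accessing operation: locate picture `index` and make its header
    // available. It needs nothing from the decoding operations.
    bool locate(std::size_t index) {
        if (index >= pictures_.size()) return false;
        current_ = index;
        located_ = true;
        return true;
    }

    // Header information obtained by the accessing operation.
    char codingType() const {
        assert(located_);
        return pictures_[current_].codingType;
    }

    // Decoding operation: requires a prior successful locate().
    // "Decoding" here only sums the payload to keep the sketch short.
    int decode() const {
        assert(located_);
        int sum = 0;
        for (int value : pictures_[current_].payload) sum += value;
        return sum;
    }

private:
    std::vector<ToyPicture> pictures_;
    std::size_t current_ = 0;
    bool located_ = false;
};
```

The dependency runs one way only: decode() relies on state set by locate(), while locate() never calls decode().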
Adjacent layers are thus tightly coupled with each other through the dependencies between operation modules. Based on the above discussion, Figure 2 shows the dependency graph of the operation modules of the MPEG layers. The figure highlights the observation that only a unidirectional dependency exists between a decoding module and the modules above it. Thus, it is possible to segment the hierarchy at the interface between the decoding module and the accessing module of a layer such that the two resulting segments also have a unidirectional dependency. A good module architecture can be obtained by segmenting the hierarchy in this way.

[Figure 2: Dependency graph of operation modules of MPEG layers (accessing and decoding modules across the sequence, GOP, picture, slice, macroblock, and block layers)]

As shown in Figure 3, the modularization procedure consists of two steps. In the first step, the hierarchy is segmented at the interface between the decoding module and the accessing module of a chosen MPEG layer. After segmentation, the hierarchy is divided into two segments, the lower segment and the higher segment, which consist of the lower layers and the higher layers of the hierarchy, respectively. Like the relationship between higher and lower layers, the higher segment contains the lower segment. This architecture still does not prevent the propagation of changes or improve the flexibility of the software. In the second step, the lower segment is moved out of the higher segment. This is easily done by adding a third module that contains and manages both the higher segment and the lower segment.

[Figure 3: Procedure of modularization, panels (a) and (b), showing the segment containing higher layers and the segment containing lower layers]

An example of this modularization method is shown in Figure 4. In this example, we separate the hierarchy of six MPEG layers into three segments. The modularization proceeds top-down. First, the hierarchy is separated into two segments: the higher segment consists of the sequence layer, the GOP layer, and the accessing module of the picture layer, and the lower segment contains all of the remaining modules and layers. Second, the lower segment is further separated into two sub-segments. The dependencies between the segments are also shown in Figure 4. After segmentation, no bidirectional dependency exists, directly or indirectly, between any two segments.

It is now easy to modularize the hierarchy and obtain well-structured MPEG decoding software with the following features:

• No variables are shared between any two modules obtained from the above modularization method. Thus, the global variables commonly used in many software decoders can be removed entirely without affecting performance.
• Data flows in only one direction in the system. There is no loop of data movement.
• Compared with the traditional architecture, each module contains fewer MPEG layers. The complexity of modules can be controlled by segmenting the hierarchy in different ways.

All of the above features make software developed with this method easy to understand, maintain, and reuse. Because the complexity and size of each module decrease as the number of segments increases, the more segments the hierarchy is separated into, the more flexible and reusable the software will be. This design method is especially suitable for developing large MPEG software.
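
The property that no bidirectional dependency exists, directly or indirectly, amounts to the module dependency graph being acyclic. The sketch below checks this with a depth-first search. The two example graphs are assumed edge sets chosen for illustration (two mutually dependent layers, and a managing module with a higher and a lower segment); they are not a transcription of Figure 2 or Figure 4.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// deps[m] lists the modules that module m depends on.
using DependencyGraph = std::vector<std::vector<std::size_t>>;

enum class Mark { Unvisited, InProgress, Done };

// Depth-first search; reaching a node that is still in progress means
// the search has followed a dependency loop back to it.
inline bool reachesCycle(const DependencyGraph& deps, std::size_t node,
                         std::vector<Mark>& marks) {
    if (marks[node] == Mark::InProgress) return true;
    if (marks[node] == Mark::Done) return false;
    marks[node] = Mark::InProgress;
    for (std::size_t next : deps[node]) {
        assert(next < deps.size());
        if (reachesCycle(deps, next, marks)) return true;
    }
    marks[node] = Mark::Done;
    return false;
}

// True when some module depends on itself directly or indirectly.
inline bool hasCyclicDependency(const DependencyGraph& deps) {
    std::vector<Mark> marks(deps.size(), Mark::Unvisited);
    for (std::size_t node = 0; node < deps.size(); ++node) {
        if (reachesCycle(deps, node, marks)) return true;
    }
    return false;
}

// Assumed example: two adjacent layers that depend on each other.
inline DependencyGraph coupledLayers() {
    DependencyGraph deps(2);
    deps[0] = {1};  // higher layer needs the lower layer
    deps[1] = {0};  // lower layer needs the higher layer
    return deps;
}

// Assumed example: 0 = managing module, 1 = higher segment,
// 2 = lower segment. Nothing points back up.
inline DependencyGraph segmentedLayout() {
    DependencyGraph deps(3);
    deps[0] = {1, 2};  // the manager uses both segments
    deps[2] = {1};     // the lower segment needs data from the higher one
    return deps;
}
```

Under these assumptions, the coupled layout is reported as cyclic and the segmented layout as acyclic.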

[Figure 4: Segmentation of MPEG syntax layers (sequence, GOP, picture, slice, macroblock, and block layers grouped into three segments)]

4. Structure of MDC

To validate the design method discussed in the last section, and to provide researchers with an MPEG development tool, we have implemented the MPEG Developing Classes and made them available to others (the software is available at the web site http://vision.cs.wayne.edu/mpeg/). As a development tool, MDC must be fast, and at the same time it must be powerful, flexible to use, and easy to understand, maintain, and reuse.

In our implementation of MDC, we choose the picture layer for separating the hierarchy into two segments. The higher segment consists of the sequence layer, the GOP layer, and the accessing module of the picture layer. The lower segment contains all of the remaining modules and layers: the decoding module of the picture layer, the slice layer, the macroblock layer, and the block layer. This choice is mainly based on the following considerations:

1. Most of the functions provided by MDC operate at the picture level, such as the functions for locating a specific picture and the functions for decoding a picture. Segmenting the hierarchy at the picture layer confines most of these functions to a single segment and provides maximum flexibility for operations on this layer.
2. The time taken to decode MPEG pictures is the dominant factor in the speed of the system. Thus, the number of segments below the picture layer should be as small as possible for better performance.

Once the hierarchy of layers is segmented, the operations and data associated with each segment are encapsulated using object-oriented programming (OOP) techniques. The experimental results in the next section show that such an implementation can achieve the same speed as the traditional method while outperforming it in many other aspects.

The structure of MDC is illustrated in Figure 5. Each block in the figure represents a class in MDC. There are eight classes in MDC. Only the top two classes, CMpegStream and CMpegDecoder, are used directly by users of MDC; all other classes support these two. CMpegStream is designed for MPEG system streams, which contain more than one MPEG audio or video bitstream. In this paper, we concentrate on CMpegDecoder and the two classes below it, since only these three classes relate directly to the decoding of MPEG video.

CMpegDecoder is the fundamental class for developing MPEG decoding software. It provides dozens of functions for accessing and decoding MPEG bitstreams. It contains two classes, CLayerParser and CPicDecoder, which correspond to the higher segment and the lower segment obtained from the segmentation of the hierarchy, respectively. CLayerParser provides many functions for accessing pictures in an MPEG bitstream, while CPicDecoder mainly deals with the decoding of MPEG pictures. There is no close relationship between these two classes, except that information decoded by routines in CLayerParser is passed to CPicDecoder through CMpegDecoder.
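
The arrangement in which a managing class owns the two segments and carries information from one to the other can be sketched as below. All names (ToyPictureInfo, ToyLayerParser, ToyPicDecoder, ToyDecoder) and all member functions are hypothetical stand-ins; the real MDC classes and their functions are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Information the higher segment hands down for one picture (hypothetical).
struct ToyPictureInfo {
    char codingType;          // 'I', 'P', or 'B'
    std::size_t payloadBits;  // size of the coded picture data
};

// Higher segment: stands in for header parsing and picture location.
class ToyLayerParser {
public:
    explicit ToyLayerParser(std::vector<ToyPictureInfo> pictures)
        : pictures_(std::move(pictures)) {}
    std::size_t pictureCount() const { return pictures_.size(); }
    const ToyPictureInfo& pictureAt(std::size_t index) const {
        return pictures_[index];
    }

private:
    std::vector<ToyPictureInfo> pictures_;
};

// Lower segment: works only from the information passed in and never
// refers to the parser.
class ToyPicDecoder {
public:
    std::string decode(const ToyPictureInfo& info) const {
        return std::string(1, info.codingType) + ":" +
               std::to_string(info.payloadBits);
    }
};

// Managing module: owns both segments and moves data from the higher
// segment to the lower one, so the segments share no variables.
class ToyDecoder {
public:
    explicit ToyDecoder(std::vector<ToyPictureInfo> pictures)
        : parser_(std::move(pictures)) {}
    std::size_t pictureCount() const { return parser_.pictureCount(); }
    std::string decodePicture(std::size_t index) const {
        assert(index < parser_.pictureCount());
        return picDecoder_.decode(parser_.pictureAt(index));
    }

private:
    ToyLayerParser parser_;
    ToyPicDecoder picDecoder_;
};
```

Because the two segment classes never name each other, either one could be replaced without touching the other; only the managing class would need to change.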

[Figure 5: Structure of MDC (classes shown: CMpegStream, CMpegDecoder, CLayerParser, CPicDecoder, COBitFileBuf, CStreamBuf, COBitBuf, and CDList; relations labeled "Contain" and "Derive"; groups labeled "classes for decoding MPEG video" and "classes for managing input buffers")]

5. Performance evaluation of MDC

To evaluate the performance of MDC and validate our design method, we implemented two MPEG decoders, an MDC-based decoder and a traditional decoder, and tested both on a number of MPEG streams. The MDC-based decoder, as its name suggests, is built on the classes of MDC and implemented using the functions provided by MDC. The code for this decoder is given in the Appendix. The traditional decoder follows the traditional straightforward design. It adopts the same decoding and accessing algorithms as MDC.

The performance comparison of the two decoders is shown in Table 1. All tests were performed on a 266 MHz P2 IBM PC. Performance is measured in both frames decoded per second (fps) and bits decoded per second (bps). For each MPEG stream, we also list its bit rate and total number of frames. As shown in Table 1, the MDC-based decoder has speed similar to that of the traditional decoder. This demonstrates that our modular design can achieve the same speed as the traditional architecture while providing a flexible and easily reusable decoder. The main factor affecting the speed of an MPEG decoder is the decoding algorithms it adopts.

Stream     Sort    Bits/Second   Frame No.   MDC decoder (fps, bps)   Traditional decoder (fps, bps)
Werner     MPEG1   1.3M          1004        40.2, 1.7M               36, 1.5M
Heuris     MPEG1   1.4M          3540        33.1, 1.5M               32.8, 1.5M
Sonyct2    MPEG2   15.2M         30          5, 2.5M                  5, 2.5M
Nlquant    MPEG2   4.5M          120         8.6, 1.3M                9.2, 1.4M
Tens_080   MPEG2   80.2M         450         5.2, 1.4M                5.4, 1.5M

Table 1: Performance comparison of MPEG decoders
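
The two speed measures used in the comparison can be computed from a single timed decoding run, as in the following sketch. The structure and function names are assumptions made for illustration, and the timing itself is left to the caller.

```cpp
#include <cassert>

// Decoding speed expressed as frames per second and bits per second.
struct DecodeRate {
    double framesPerSecond;
    double bitsPerSecond;
};

// Computes both measures from one timed decoding run.
// `elapsedSeconds` must be positive.
inline DecodeRate measureRate(long framesDecoded, double bitsDecoded,
                              double elapsedSeconds) {
    assert(elapsedSeconds > 0.0);
    return DecodeRate{framesDecoded / elapsedSeconds,
                      bitsDecoded / elapsedSeconds};
}
```

For a given stream the two measures are proportional to each other, since both divide a fixed total by the same elapsed time.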

Besides implementing several MPEG decoders, we also used MDC for moving object detection in the compressed domain. With the help of MDC, we could concentrate on methodology without much concern for MPEG syntax details. Within two days, three different methods were implemented and tested. The final version of the prototype system consists of only 364 lines of code (the moving object demo is available at the web site http://vision.cs.wayne.edu/).

Feedback from users provides another source for evaluating MDC. In less than three months, over one thousand researchers and developers from universities and corporations downloaded the software over the Internet. Many people have already developed their own MPEG decoding software using MDC, including some nonprofessional programmers and undergraduate students, and many of these developers do not have a strong background in MPEG. From user feedback, we also find that MDC is being used for many purposes that we did not anticipate. One company is even discussing embedding such a tool into an MPEG graphics card.

6. Conclusion and future work

This paper presented a modularization method for the design of MPEG decoding software and, using this method, implemented an MPEG decoding tool, MDC. Compared with the straightforward method, this method improves the flexibility and reusability of MPEG decoding software without reducing decoding speed. Because of the better software structure and smaller modules it produces, MPEG software developed with this method is easy to understand, maintain, and debug. This makes the method especially suitable for developing large MPEG software. As the feedback from users shows, MDC is flexible to use and easy to understand. It allows people to develop their own video decoders without going into the details of MPEG. These advantages of MDC, in turn, support in practice the validity of the design method presented in this paper.

Although the method was initially designed for MPEG video, we believe it can be applied easily to MPEG audio and to similar formats, such as JPEG. User feedback also tells us that development tools for encoding MPEG video and audio are needed. We hope to extend our work to these tasks.

7. Acknowledgements

We would like to thank Professor Vaclav Rajlich of Wayne State University for his careful reading of the manuscript and his valuable comments. We also extend our thanks to the many people who have provided comments and feedback about their experience with MDC.

References

4. Lawrence A. Rowe, Ketan D. Patel, Brian C. Smith, and Kim Liu, "MPEG Video in Software: Representation, Transmission, and Playback," IS&T/SPIE Symposium on High Speed Networking and Multimedia Computing, pp. 1-11, San Jose, CA, February 1994.
5. URL: http://www.mpegtv.com/sdk.html
6. Nam Ling, Nien T. Wang, and Duan J. Ho, "An Efficient Controller Scheme for MPEG-2 Video Decoder," IEEE Transactions on Consumer Electronics, Vol. 44, No. 2, pp. 451-453, May 1998.
7. MPEG library: http://www4.informatik.uni-erlangen.de/Services/Doc/graphics/doc/mpeg/
8. OOMPEG library: http://www.cs.brown.edu/software/ooMPEG/
9. MPEG-1 library (mpeg_lib): http://hpux.cae.wisc.edu/hppd/hpux/X11/Graphics/mpeg_lib-1.2.1/
10. N. Patel and I.K. Sethi, "Compressed Video Processing for Cut Detection," IEE Proceedings - Vision, Image and Signal Processing, Vol. 143, No. 3, pp. 315-323, October 1996.
11. Bo Shen and I.K. Sethi, "Direct Feature Extraction from Compressed Images," Proc. IS&T/SPIE Conf. Storage and Retrieval for Image and Video Databases, San Jose, CA, January 1996.
12. Bo Shen, Dongge Li, and Ishwar K. Sethi, "HDH Based Compressed Video Cut Detection," Proceedings of Visual 97, pp. 149-156, San Diego, CA, December 1997.

APPENDIX: SOURCE CODE OF A SIMPLE MDC DECODER

#include
#include "cmpvdeco.h"

void main()
{
    CMpvDecoder MpvDecoder;
    char psFileName[60];
    cout
