An Overview of Automatic Event Detection in Soccer Matches

Samuel F. de Sousa Júnior and Arnaldo de A. Araújo
Departamento de Ciência da Computação
Universidade Federal de Minas Gerais
Belo Horizonte, Brazil
David Menotti
Departamento de Computação
Universidade Federal de Ouro Preto
Ouro Preto, Brazil
{samuelfelix,arnaldo}@dcc.ufmg.br
[email protected]
Abstract

Sports video analysis has received special attention from researchers due to its high popularity and the general interest in semantic analysis. Soccer videos represent an interesting field for research, allowing many types of applications: indexing, summarization, recognition of players' behavior, and so forth. Many approaches have been applied to field extraction and recognition, arc and goalmouth detection, ball and player tracking, and high-level techniques such as team tactics detection and the definition of soccer models. In this paper, we provide a hierarchy and classify approaches into it based on their analysis level, i.e., low, middle, and high level. An overview of soccer event identification is presented, and we discuss general issues related to it in order to provide relevant information about what has been done in soccer video processing.
1. Introduction

Many researchers have proposed solutions for soccer-related problems. A meaningful survey on sports video analysis, presented in [22], discusses several issues: sports tactics summarization, ball and player tracking, and highlight extraction, among others. In this paper, we divide the analysis into three categories: low, middle, and high level. Recognition of basic marks (field, lines, arcs, and goalmouth) belongs to low-level analysis. It allows one to perform higher-level analysis such as goal, penalty, offside, and kick detection. For most automatic parsing systems, the first step is to extract low-level features such as line marks, as proposed in [5]. For arc (middle and penalty) recognition, three different techniques are presented: Ellipse Hough Transform (EHT) [24], Least Square Fitting (LSF) [10], and Invariant Pattern Filter (IPF) [1]. We define middle-level study as the analysis of player and ball behavior. A ball and player tracking system is required to understand what we are going to define later as
978-1-4244-9497-2/10/$26.00 ©2010 IEEE
high-level analysis. Developing an effective ball tracking system is not easy due to, among other things, the small size of the ball, its irregular appearance across frames (i.e., size, shape, color, velocity, etc.), and its occasional occlusion by people [26]. Finally, high-level analysis is related to soccer semantics, including player dribbling, game strategies, and specific events such as a placed kick. For instance, placed kick recognition (free kick, penalty, and corner kick) is performed in [15], and many approaches for ball tracking and detection are available [17], [26], [27], [28]. Furthermore, ASPOGAMO [2] is a robust system for soccer analysis. We also consider as high-level analysis those methods for indexing soccer games, e.g., highlight extraction, play/break detection, and specific scene detection, i.e., a placed kick, a top-left corner kick, and so forth.

Figure 1 shows how we have organized this overview. We placed each paper in the layer that best fits its main focus, but a paper may not be restricted to that layer. For instance, [26] is placed in the middle level, although it also presents event detection, because we are interested in how it addresses the ball detection problem. It is also important to mention that most studies from higher layers provide solutions for the problems faced by lower layers.
Figure 1. Levels. [Diagram: the surveyed references arranged by analysis level. Low-level: field, lines, corners, arcs, and goalmouth. Middle-level: ball tracking, ball trajectory, player tracking, and referee detection. High-level: semantics, event indexing, summarization, and general events.]
This overview aims to provide relevant information
about what has been done on event detection related to sports video, focusing on soccer videos. It is organized as follows. Section 2 presents general models for soccer analysis. It focuses on which steps are required for reaching the desired goal (lines recognition, tracking, etc.). After that, Section 3 focuses on field detection approaches (goalmouth, arc, and lines). Tracking methods applied to ball, player, and referee are detailed in Section 4. Event recognition and indexing are discussed in Section 5. Section 6 presents highlights and summarization techniques. Finally, conclusions are presented in Section 7.
2. General Approaches

The automatic soccer video parser proposed in [5] recognizes line marks, motion, the ball, and the players. This approach requires a classification of camera shots. The player's position is classified into nine classes: (1) in the midfield; (2) around the left penalty area; (3) around the right penalty area; (4) near the top-left corner; (5) near the bottom-left corner; (6) near the top-right corner; (7) near the bottom-right corner; (8) between the midfield and the left penalty area; and finally (9) between the midfield and the right penalty area. Line mark recognition is performed by identifying prominent edges, then applying an edge trimming algorithm, and finally recognizing mark patterns. Motion detection is performed using a block matching method. The motion vector $(u_i, v_i)$ is defined by minimizing the function

$$e(u_i, v_i) = \sum_{(x, y) \in L_i} |f_1(x - u_i, y - v_i) - f_2(x, y)| \tag{1}$$
where $i$ stands for a pixel, $f_1$ and $f_2$ represent two frames, and $L_i$ is a neighborhood of pixel $i$ of a given size. Finally, ball detection is performed using morphological and chromatic features, and players are detected by recognizing their uniform colors using histogram peaks.

Three problems related to soccer video analysis were addressed in [18] using color-based tracking and image mosaicking: (i) field extraction; (ii) ball and player tracking using template matching and the Kalman filter, with the occlusion problem addressed using color histogram back-projection; and (iii) determination of the absolute player positions by means of a field model construction and image transforms.

ASPOGAMO [2] is a system for team sport game analysis that extracts meaningful information from soccer videos. ASPOGAMO also presents an interface for analysis. The idea of automatic analysis is based on previous games. The system is divided into two main components: the observation system, which extracts ball and player motion characteristics, and the automated hierarchical model, which is divided into five layers. The first layer is the motion model (position data and trajectories). The second one is the episode model, which is based on ball actions. Next comes the situative model, which considers the game situation as the action context. The fourth layer is the tactical model, corresponding to action selection and parametrization. The last layer is the analytical model, which takes into account tactical behavior and game strategies.

An approach combining audio clues and a Hidden Markov Model is proposed in [3] for structuring soccer video. The authors argue that a soccer video is composed of five semantic parts: First-half, Advertisement, Studio, Advertisement, and finally the Second-half. In order to segment a soccer game, three classes of audio behavior are defined and searched for in soccer videos: game-audio (noisy speech), advertisement-audio (speech and music), and studio-audio (just speech). In [14], audio clues are combined with visual information for semantic indexing and event detection. Audio features have also been combined with visual features for goal detection in [4] (Section 5.2).

An offside detection method is presented in [7]. Before discussing this event detection, it is useful to describe the underlying approach. It uses six cameras distributed around the field, three on each side. Images gathered from those cameras are transferred to six nodes, which are connected to a central node (supervisor). The process is divided into: moving object segmentation, which detects motion by a background subtraction algorithm; player and referee classification, which assigns players and referees to classes using an unsupervised method; player tracking (Section 4.2); ball and shot detection; and offside detection (Section 5.3).
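As an illustration of the block matching criterion in Equation 1, the following is a minimal sketch, not the implementation of [5]. It assumes grayscale frames stored as lists of rows, a square neighborhood $L_i$ of half-width `half`, and an exhaustive search over displacements up to `search` pixels; the function names and both parameters are hypothetical.

```python
def block_error(f1, f2, cx, cy, u, v, half=1):
    """Sum of absolute differences e(u, v) over the neighborhood of (cx, cy)."""
    total = 0
    for y in range(cy - half, cy + half + 1):
        for x in range(cx - half, cx + half + 1):
            total += abs(f1[y - v][x - u] - f2[y][x])
    return total


def motion_vector(f1, f2, cx, cy, half=1, search=2):
    """Return the displacement (u, v) that minimizes the block error.

    Assumes the neighborhood of (cx, cy) lies fully inside frame f2.
    """
    height, width = len(f1), len(f1[0])
    best, best_err = (0, 0), None
    for v in range(-search, search + 1):
        for u in range(-search, search + 1):
            # Skip displacements whose shifted window would leave frame f1.
            inside_x = half <= cx - u < width - half
            inside_y = half <= cy - v < height - half
            if not (inside_x and inside_y):
                continue
            err = block_error(f1, f2, cx, cy, u, v, half)
            if best_err is None or err < best_err:
                best, best_err = (u, v), err
    return best
```

The exhaustive search is the simplest choice; faster search strategies exist but are outside what the text describes.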
3. Detection

Soccer field segmentation can sometimes be a hard task due to different grass colors, shadows over the field, and weather variation. In this section, some known approaches for field, arc, line, and ball recognition are presented and discussed.
3.1. Field Lines

The automatic soccer parser presented in [5] and briefly described in Section 2 uses three steps for field line recognition. The first step applies a Gaussian-Laplacian edge detector to pick line marks, removes lines whose color is not white, and performs a thinning operation. The second step focuses on edge trimming to patch up broken edges. The last step recognizes the line marks.

A field detection method is implemented in [18] using the peaks of the histograms of the RGB color space. A binary
image $B(x, y)$ is calculated using the following rule:

$$B(x, y) = \begin{cases} 1, & \text{if } |I_R(x, y) - R_{peak}| < R_{th},\ |I_G(x, y) - G_{peak}| < G_{th},\ |I_B(x, y) - B_{peak}| < B_{th},\ I_G(x, y) > I_R(x, y), \text{ and } I_G(x, y) > I_B(x, y) \\ 0, & \text{otherwise,} \end{cases} \tag{2}$$
where $I_R(x, y)$, $I_G(x, y)$, and $I_B(x, y)$ represent the RGB values of the pixel, $R_{peak}$, $G_{peak}$, and $B_{peak}$ are the histogram peaks, and $R_{th}$, $G_{th}$, and $B_{th}$ are the thresholds for each channel. Because players produce holes in the binary image, a boundary following algorithm is used to fill those holes and extract the field region. After that, a player mask $P(x, y)$ is applied for detecting players, i.e.,

$$P(x, y) = \begin{cases} 1, & \text{if } (x, y) \in Field \text{ and } B(x, y) = 0 \\ 0, & \text{otherwise.} \end{cases} \tag{3}$$

A very similar approach is presented by Yoon et al. in [23]. However, according to [12], the suitable threshold may differ from one image to another due to weather conditions, lighting, and so forth. The authors of [12] therefore improved Yoon et al.'s algorithm to extract the grass field regardless of the conditions. They select the mean value in the vicinity of the peak as follows:

$$A'_{peak} = \frac{\sum_{H(i) \ge \alpha \times H(A_{peak})} i \times H(i)}{\sum_{H(i) \ge \alpha \times H(A_{peak})} H(i)}, \quad \forall A \in \{R, G, B\}. \tag{4}$$

A kick refinement method [15] (Section 5) performs field line detection. Initially, the dominant color-based field is extracted using the HSI color space. The trained dominant color of the grass field region is represented by $(H_{d0}, I_{d0}, S_{d0})$. The algorithm decides that a pixel belongs to the field if the distance $D(i, j)$ between the pixel and the dominant color is smaller than an experimentally defined threshold $D_{th}$. The distance is defined as:

$$D(i, j) = |H_d(i, j) - H_{d0}| + |I_d(i, j) - I_{d0}| + |S_d(i, j) - S_{d0}| \tag{5}$$

A top-hat transform is applied to the image (Section 3.4). Then, using morphological operators, the algorithm searches for line candidates and removes some of them based on length criteria. Finally, a thinning algorithm is applied and the lines are found using the Hough transform.

The SIMULFOOT project [13] proposes a field detection method using color and spatial coherence. It uses the HSL space and consists of: (i) the pixel distribution analysis; (ii) the color coherence (relevant area characterization); (iii)
points selection; and (iv) the spatial coherence for those selected points. The authors argue that their approach is robust. The only constraint is that the soccer field color must be the dominant color in the image, meaning that the camera should not close up on a player. The authors first point out some problems of simple approaches, such as choosing the most frequent color as the one used in the segmentation process. To build their solution, they first provide a discrete representation of the HSL space made of 554 cells. Then, they induce interactions among cells by considering each point as a potential source. The area coherence is used for determining the soccer field. Players and lines affect the field region in two ways: player and ball regions produce holes, and white lines split the region of interest into different regions that are close to each other. Thus, some morphological operators are applied: a closing operation connects those components, and then an opening operation removes non-relevant regions that are close to the border. The region of interest is obtained by keeping the connected component that contains the most pixels.
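The thresholding rule of Equation 2 and the player mask of Equation 3, described earlier in this section for the method of [18], can be sketched as follows. This is only an illustration under stated assumptions: the image is a list of rows of (R, G, B) tuples, the field region is given as a precomputed binary grid (standing in for the output of the boundary following step), and the function names are hypothetical.

```python
def field_mask(image, peaks, thresholds):
    """Binary image B(x, y) of Equation 2.

    image: rows of (R, G, B) tuples.
    peaks, thresholds: (R, G, B) triples of histogram peaks and thresholds.
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            # Every channel must lie close to its histogram peak.
            near_peak = all(
                abs(value - peak) < th
                for value, peak, th in zip((r, g, b), peaks, thresholds)
            )
            # Green must dominate both red and blue.
            mask_row.append(1 if near_peak and g > r and g > b else 0)
        mask.append(mask_row)
    return mask


def player_mask(binary, field_region):
    """Player mask P(x, y) of Equation 3: inside the field but not grass."""
    return [
        [1 if inside and b == 0 else 0 for b, inside in zip(b_row, f_row)]
        for b_row, f_row in zip(binary, field_region)
    ]
```

Note that Equation 3 alone also marks non-grass pixels such as white lines when they fall inside the field region, so further filtering is needed in practice.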
3.2. Goalmouth Detection

Real-time goalmouth detection has been performed in [20]. The authors estimate the dominant green color using the HSI space, based on the assumption that the field color may vary from stadium to stadium and is affected by shadows and weather conditions. A Coarse Spatial Representation (CSR) is also implemented for dominant green detection at coarse image resolution. The authors argue that noise is reduced when the Hough transform is applied within CSR green blocks. In order to find the dominant goal line orientation, their method starts with fixed parameters for the possible angle ranges, which are refined during the first five minutes of the game. The authors also point out ways of making this calibration quicker in a live game context. The two vertical bars are characterized by vertical strips of white followed by high contrast against the backdrop. Detection of the horizontal crossbar requires the previously detected goal line orientation in order to generate candidate lines connecting the vertical bars.
3.3. Arc Detection

Arc detection can facilitate soccer video analysis, e.g., if one arc is successfully obtained, it can be easier to infer camera parameters. Different approaches were found for ellipse detection: Ellipse Hough Transform (EHT) [24]; Least Square Fitting (LSF) [10]; and Invariant Pattern Filter (IPF) [1]. The EHT is the most common one [24]. However, the authors of [21] argue that LSF and IPF approaches are faster
than EHT-based ones. They also propose an LSF-based approach. Some of those approaches are discussed in this section.

An Ellipse Hough Transform-based approach is presented in [24] using a measure function that can handle whole and partial ellipses. The proposed algorithm is based on three components. (i) Ellipse estimation searches for the field central line and rotates the image so that this line is vertical; the algorithm then estimates the horizontal ellipse by using template matching and a statistical approach (sample point statistics). (ii) Ellipse search examines the ellipses estimated in the previous step and selects the one with the highest measured value. (iii) Ellipse refinement converts and refines the ellipse that was found. Their results indicate that the proposed algorithm is better than EHT algorithms based on the AMF (Absolute Measure Function).

Fast arc detection has been performed in [21] using the LSF approach. It allows detection of both middle and penalty arcs. Nevertheless, there are some obstacles to overcome. For instance, LSF is very sensitive to noise. Thus, the authors propose to separate arc points from the others, which may lead to cases where an arc appears segmented into many parts. The execution pipeline is as follows: first, the field is segmented using the methods proposed in [16], and edge points are extracted using a 5×5 Laplacian of Gaussian filter and Otsu's method. Second, straight lines are detected (using the Hough transform) and removed. Finally, Advanced Least Square Fitting (ALSF) is applied for fitting arcs, and false ones are removed.
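To give a feeling for how least squares fitting recovers a curve from a partial arc, the sketch below fits a circle rather than an ellipse. This is a deliberate simplification and is not the LSF of [10] nor the ALSF of [21]: it uses the algebraic circle fit that minimizes $\sum (x^2 + y^2 + Dx + Ey + F)^2$, and the helper names are hypothetical. It also assumes noise-free, non-collinear points; as the text notes, least squares fitting is sensitive to noise.

```python
import math


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - s) / m[r][r]
    return x


def fit_circle(points):
    """Algebraic least-squares circle fit; returns (center_x, center_y, radius)."""
    sxx = sum(x * x for x, y in points)
    sxy = sum(x * y for x, y in points)
    syy = sum(y * y for x, y in points)
    sx = sum(x for x, y in points)
    sy = sum(y for x, y in points)
    n = float(len(points))
    z = [x * x + y * y for x, y in points]
    # Right-hand side of the normal equations for the unknowns (D, E, F).
    bx = -sum(zi * x for zi, (x, y) in zip(z, points))
    by = -sum(zi * y for zi, (x, y) in zip(z, points))
    bz = -sum(z)
    d, e, f = solve3([[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]], [bx, by, bz])
    cx, cy = -d / 2.0, -e / 2.0
    return cx, cy, math.sqrt(cx * cx + cy * cy - f)
```

Fitting a full ellipse follows the same idea with a general conic and an extra constraint, which is what makes the direct method of [10] more involved.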
3.4. Ball Detection

During a soccer match, the ball is the center of attention, and it is one of the most interesting objects to understand in soccer video analysis. Information about the ball in each frame improves soccer video analysis [28]. This section presents some ball detection and ball size estimation techniques. In Section 4.1, some approaches for ball tracking are discussed.

A ball detection method is implemented in [15], focusing on kick refinement (Section 5). For ball detection, it uses a white top-hat transform $f - (f \circ g)$, where $f$, $\circ$, and $g$ represent the luminance, the morphological opening operation, and the structuring element (a circle with radius 3), respectively. The authors emphasize the following aspects related to the ball:

• The contrast between ball and field is strong.
• The ball is very small, and its size can be estimated using the player-to-ball size ratio presented in [26]. By doing this, non-ball regions can be removed from the image.
• Finally, the ball usually has a circular shape with almost the same height and width.
Using those assumptions, the algorithm removes non-relevant regions until only ball candidates remain. The candidates are dilated, and those presenting circular contours with similar height and width are marked as a detected ball.

The ball detection method proposed in [25] estimates the ball size in order to remove all non-ball objects from the scene. According to the authors, it is easier to find the size of the players and the referee than to find the ball size. So, they first detect the player size and then estimate the ball size range according to the function $R(F) = [h/7;\ h/3]$, where $F$ is a frame and $h$ is the average height of the players in $F$. The ball may not have a circular shape, but according to statistical results, its height/width ratio is less than 3. They present the following function $K$ for deciding whether an object $O$ with width $w$ and height $h$ is a non-ball object:

$$K(O) = \begin{cases} 1, & \text{if } O \text{ is a line or a player,} \\ 1, & \text{if } O \text{ has no ball color,} \\ 1, & \text{if } O \text{ is out of the ball size range,} \\ 1, & \text{if } 3 \times w < h \text{ or } 3 \times h < w, \\ 0, & \text{otherwise.} \end{cases} \tag{6}$$

After that, they obtain ball trajectory candidates by a verification procedure using the Kalman filter. This approach is discussed in Section 4.1.

An algorithm for ball size estimation has been proposed in [28]. The estimation is based on the relationship among salient objects (center line, goalmouth, people, and the arc). This idea is interesting because it considers several objects for ball size estimation rather than only one, which can provide some robustness to the detection. The second step focuses on four different ways of estimating the ball size. The first one is based on the width of the vertical center line. However, the authors argue that it is not very accurate, considering that the ratio between line width and ball diameter is sensitive to illumination and camera calibration. Thus, arc detection is required for ball size adjustment. If the arc is successfully found, then the ball size is estimated using the ellipse rather than the center line width.
However, if the center line is not found, then the framework detects the goalmouth. If the goalmouth is not found, the algorithm estimates the ball size using the people in the field. For ball size estimation using people size, they first use a seed-growing algorithm to find general objects; then, non-people objects are removed using a Bayesian rule: based on the probability computed for an object, one can reject the hypothesis that it is a person. They can also infer the kind of shot from the size of the people. For instance, the shot can be a close-up frame if the detected people are very large. The last step is the ball size adjustment. The algorithm smooths the curve of the estimated ball size function, improving the accuracy of the ball sizes.
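The rejection rule of Equation 6 and the size range $R(F) = [h/7;\ h/3]$ from [25], described earlier in this section, can be sketched as follows. The `Candidate` record and its fields are hypothetical, and measuring the object size by its larger side is an assumption, since the text does not state how the size is compared with the range.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical description of a segmented object."""
    width: float
    height: float
    is_line_or_player: bool = False
    has_ball_color: bool = True


def ball_size_range(avg_player_height):
    """R(F) = [h/7, h/3], with h the average player height in the frame."""
    return avg_player_height / 7.0, avg_player_height / 3.0


def is_non_ball(obj, avg_player_height):
    """K(O) of Equation 6: 1 if the object is rejected, 0 if it stays a candidate."""
    low, high = ball_size_range(avg_player_height)
    size = max(obj.width, obj.height)  # assumption: size is the larger side
    if obj.is_line_or_player:
        return 1
    if not obj.has_ball_color:
        return 1
    if not (low <= size <= high):
        return 1
    if 3 * obj.width < obj.height or 3 * obj.height < obj.width:
        return 1
    return 0
```

Objects for which `is_non_ball` returns 0 remain ball candidates and would then be passed to the trajectory verification step.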
4. Tracking

Many approaches proposed for tracking in soccer environments use the Kalman filter or the particle filter. In this section, we present some of those approaches, focusing on ball and player tracking.

4.1. Ball Tracking

Many researchers have focused on ball tracking [17][26][27]. Information about the ball position during the match improves soccer analysis. It is essential for obtaining information such as ball possession, passes, and event detection [26].

The ball tracking approach presented in [27] focuses on ball trajectory characterization. Instead of judging whether an object is a ball, this method judges whether a candidate trajectory is a ball trajectory using statistical ideas, object motion theory, and computer vision. The approach contains two components: ball trajectory mining and ball trajectory extension. It is initialized with candidates generated by the ball detection framework [25] discussed in Section 3.4. That framework provides ball candidates by removing non-ball objects using color, shape, and so on. The ball position is estimated using the Kalman filter formulation. The tracking process works as follows: first, the ball trajectory mining procedure creates candidate feature images; then, false candidates are removed using some heuristics. The trajectories are obtained using a Kalman filter-based procedure, and finally the ball trajectories can be identified. Later, the authors improved the algorithm and presented a semantic analysis application in [26]. This new analysis detects passing and touching (discussed in Section 5).

The method proposed in [17] estimates the ball route using spatio-temporal relationships. The authors argue that it is difficult to identify the ball based solely on its color or shape. The ball is sometimes occluded by players or overlapped by field lines. There are harder situations in which the ball disappears with one player (occlusion) and reappears with another player. Thus, in order to handle such cases appropriately, they analyze possible ball routes. The method works as follows:

• Itemize possible ball transitions among objects that might have caused the disappearance (players, lines, and so on);
• Generate ball route candidates;
• Generate rough trajectories for ball route candidates;
• Evaluate those trajectories to find the best ball route.

That paper discusses all those steps carefully. It also provides information related to success and failure cases.

4.2. Player Tracking

The player tracking method of [18] uses the Kalman filter and template matching. Player templates are generated using the mask presented in Equation 3. The process first searches for new players that do not overlap already tracked players and inserts them into the tracking list. The new location is predicted using the Kalman filter, and the template is updated. Occlusion is considered only among players from different teams. To solve it, the histogram back-projection technique [19] is applied. Team identification is performed by computing the vertical RGB distribution and comparing it with the team's model distribution.

A soccer player tracking system using a particle filter has been proposed in [6]. The shape of a player is represented by a collection of particles. Candidate regions that are likely to be players are segmented, and tracking is performed using the sample importance resampling (SIR) particle filter [11].

An interesting approach to player tracking removes the effects of fast camera motion [12]. The algorithm detects the player's position in the current frame by projecting the player's position obtained in the previous frame. In case of occlusion, template matching, histogram back-projection, and a merge-split method may be applied.

The approach proposed in [7] (mentioned in Section 2) includes a player tracking method. In this method, a player is represented by a Bounding Box (BB) and a state vector $x_t^i = (p_{x_t}^i, v_{x_t}^i, d_{x_t}^i, l_{x_t}^i, c_{x_t}^i, s_{x_t}^i)$. The BB's position, velocity, and dimension are represented by $p_{x_t}^i$, $v_{x_t}^i$, and $d_{x_t}^i$, respectively. The component $s_{x_t}^i$ encodes the BB's status (single blob, merge blob, exiting blob, disappeared blob, or a single blob that belongs to a group blob). If the blob is a merge blob, then $l_{x_t}^i$ contains a set of labels; otherwise, it contains a single label. Finally, $c_{x_t}^i$ is a class number (from 1 to 5). The new state configuration is predicted based on past events. Many other steps are applied during the tracking process, but they are not discussed here.
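Several of the trackers in this section, for both the ball ([25], [27]) and the players ([18]), rely on a Kalman filter to predict the next position and correct it with a new measurement. The sketch below shows the predict and update steps for a single image coordinate under a constant-velocity motion model. The motion model, the noise variances, and the class name are assumptions made for illustration; the cited papers may use different state vectors and parameters.

```python
class ConstantVelocityKalman1D:
    """Kalman filter for one coordinate with state [position, velocity]."""

    def __init__(self, position, velocity=0.0, dt=1.0,
                 process_var=1e-2, meas_var=1.0):
        self.x = [position, velocity]
        self.P = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.dt, self.q, self.r = dt, process_var, meas_var

    def predict(self):
        """Propagate state and covariance one frame; return predicted position."""
        dt = self.dt
        (a, b), (c, d) = self.P
        self.x = [self.x[0] + dt * self.x[1], self.x[1]]
        # P' = F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I.
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.x[0]

    def update(self, measurement):
        """Correct the prediction with a measured position (H = [1, 0])."""
        (a, b), (c, d) = self.P
        s = a + self.r                      # innovation variance
        k0, k1 = a / s, c / s               # Kalman gain
        residual = measurement - self.x[0]
        self.x = [self.x[0] + k0 * residual, self.x[1] + k1 * residual]
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]
```

In a tracker, one such filter per coordinate (or a single filter with a four-dimensional state) would call `predict` on every frame and `update` only when a detection is available, so the prediction can bridge short occlusions.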
5. Event Detection

One of the most interesting aspects of soccer analysis is the ability to recognize events, such as a kick, goal, pass, offside, card, ball possession, etc. Such systems could help the referee decide whether a goal is a valid one or a ghost one. In this section, some of the approaches for event detection are discussed.

Four events are supported by [26]: (i) touching (when a player touches the ball), (ii) passing (when a player passes the ball to another player), (iii) ball possession, and (iv) play-break structure. Before detailing this process, the authors define the concept of pivot points. Pivot points are points of the ball motion at which the direction and velocity change when a
player touches the ball. They may be generated not only by players but also by ball bouncing, camera motion, and so on. For touching detection, the authors first detect pivots and remove those that were not generated by players; the rest form the set of touching points. The passing event is detected by analyzing the trajectory, assuming that passes generally produce a significant ball trajectory. Ball possession is computed by determining which team touches the ball. An SVM is applied to recognize the team of each player. Play and break events represent the moments when the ball gets in and out of the game (or when the referee stops it). The play-break event is recognized by analyzing the ball trajectory and by detecting the referee's whistle.
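The pivot point idea from [26] can be illustrated with a simple test on consecutive ball displacements. This is a sketch under assumptions, not the detector of [26]: the trajectory is one ball position per frame, a point is flagged when either the direction or the speed changes abruptly, and both thresholds are hypothetical values.

```python
import math


def pivot_points(trajectory, angle_thresh_deg=30.0, speed_thresh=0.5):
    """Return indices where the ball's direction or speed changes abruptly.

    trajectory: list of (x, y) ball positions, one per frame.
    """
    pivots = []
    for i in range(1, len(trajectory) - 1):
        (x0, y0), (x1, y1), (x2, y2) = trajectory[i - 1:i + 2]
        v_in = (x1 - x0, y1 - y0)
        v_out = (x2 - x1, y2 - y1)
        s_in, s_out = math.hypot(*v_in), math.hypot(*v_out)
        if s_in == 0 or s_out == 0:
            continue  # direction is undefined for a stationary sample
        cos_angle = (v_in[0] * v_out[0] + v_in[1] * v_out[1]) / (s_in * s_out)
        angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
        speed_change = abs(s_out - s_in) / max(s_in, s_out)
        if angle > angle_thresh_deg or speed_change > speed_thresh:
            pivots.append(i)
    return pivots
```

As the text explains, pivots found this way would still need to be filtered, since bounces and camera motion also produce them.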
5.1. Kick Detection

A kick detection method is presented in [15] using the relationship between the ball and the field lines. The author classifies placed kicks into three categories: free kick, penalty kick, and corner kick. The method is initially based on simple assumptions: in a corner kick, the ball is either not visible or is placed in the field corner. The process first takes the global shot of a placed kick and extracts the dominant green color. After that, a multi-clue based ball detection is applied, and the ball may or may not be found. If no ball is found, then the event is a corner kick. On the other hand, if the ball is found, a Hough transform is applied to search for parallel lines. If parallel lines cannot be extracted, the event is a free kick. If they are found and the ball is between the lines, the event is assumed to be a penalty kick; otherwise, it is a free kick. The dominant color field extraction has been discussed in Section 3.1 and the ball detection in Section 3.4. Thus, the relationship between the ball and the field lines defines which placed kick has been detected.

5.2. Goal Detection

A real-time system for goal detection has been presented in [8] using four cameras with a high frame rate (200 fps) in order to ensure that high-velocity shots are captured. The authors argue that normal broadcast images at 30 fps are not capable of recording ghost goal events, for instance, when the ball touches the underside of the bar and immediately bounces back into the field. Their ball detection approach is composed of two techniques: a circle detection algorithm (Circle Hough Transform) for ball candidate generation and a neural classifier to validate each candidate. In this approach, a supervisor node provides evaluation and comparison and decides whether a goal event was found.

A goal detection approach based on cinematic features is presented in [9]. Details of the process are discussed in Section 6. The authors argue that it is hard to detect goals by video processing algorithms. However, due to the nature of this event, it is possible to exploit some patterns that appear after a goal event. They propose the following procedures: checking the break duration (at least 30 and at most 120 seconds); checking for at least one shot classified as an out-of-field shot (audience) or as a close-up shot; and checking for at least one slow-motion replay shot.

The goal detection approach presented in [4] uses a decision tree-based multimodal data mining framework combining visual and audio features. Those features are extracted during shot detection. After that, a pre-filtering step eliminates noise and irrelevant data. Finally, the data mining step builds the decision tree model using the cleaned data obtained in the previous step.

5.3. Offside Detection

The offside detection technique proposed in [7] has been introduced in Section 2. There are two situations: the simple one, in which frames are judged separately, and the complex one, which happens when the ball trajectory changes drastically. In the latter case, the method first performs the hard task of determining which player has touched the ball. The authors argue that this is a hard task even for human beings due to, among other things, the fact that many players are close to the ball. To reach that goal, they create a 3D reconstruction using homography. The most important task in offside detection is to evaluate whether a player's location is an offside one. The proposed technique analyzes a temporal window: the supervisor node tracks those players whose positions are considered offside candidates and verifies whether the ball has entered the 3D sphere around each of those candidate players.
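The placed kick decision rule of [15], described in Section 5.1, reduces to a short chain of tests once the ball detector and the line detector have produced their results. The sketch below assumes those results are already available as booleans; the function and parameter names are hypothetical.

```python
def classify_placed_kick(ball_found, parallel_lines_found=False,
                         ball_between_lines=False):
    """Classify a placed kick from ball and field-line evidence.

    ball_found: whether the ball detector found a ball in the global shot.
    parallel_lines_found: whether the Hough transform yielded parallel lines.
    ball_between_lines: whether the ball lies between those lines.
    """
    if not ball_found:
        # The ball is not visible or sits in the corner of the field.
        return "corner kick"
    if parallel_lines_found and ball_between_lines:
        return "penalty kick"
    # No parallel lines, or the ball is not between them.
    return "free kick"
```

For example, `classify_placed_kick(True, True, False)` yields a free kick because the ball lies outside the detected parallel lines.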
6. Highlights and Summarization

A viewer may not be interested in the whole match, but only in the moments considered to be the best ones (highlights). Thus, automatic summarization and highlight extraction systems allow one to watch a few parts of the game, focusing on the most interesting moments. An automatic framework for analysis and summarization has been proposed in [9] using cinematic and object-based features. A highlight extraction technique was proposed in [16] using a pyramid-wise model. Both approaches are discussed in this section.

The automatic framework described in [9] proposes: (1) a dominant color extraction algorithm (not discussed here); (2) two novel features that improve shot classification (the absolute difference between two frames based on a color ratio criterion and the difference in color histogram similarity); (3) algorithms for goal detection (discussed in Section 5.2), as well as referee and penalty box detection; and (4) finally an efficient and effective framework
for soccer analysis and summarization. In this framework, it is not necessary to compute object-based features when cinematic features are sufficient. However, object-based features can be used to obtain more detailed summaries. Shot classification (Bayesian classifier) considers the following classes: (1) long shot (global view); (2) in-field medium shot (part of the field zoomed in); (3) close-up; and (4) out-of-field shot. Three types of summaries can be generated: all slow-motion replays, all goal events, and an extension of those two using object-based features.

A highlight extraction technique has been proposed in [16]. The authors first provide a soccer pyramid containing layers such as GOAL, ATTACK, and so forth. Similarly to the approach above, a Bayesian classification is performed using what the authors define as a SEG: a consequential segment with uniform content. The SEG classes are also similar to the shot classes provided by [9]: CLOSE-UP, FAR-CENTER, FAR-SIDE, and MIDDLE. They define and detect an interesting event named GOA (Group Of Attacks) as a series of relevant attacks. Hence, the highlight system allows one to view all goals (with or without replays), to view all attacks or GOAs, and to view particular attacks from both teams. It is also possible to view any of those events in a non-linear way.
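The "all goal events" summary of [9] depends on the goal checks summarized in Section 5.2: a break lasting between 30 and 120 seconds, at least one close-up or out-of-field shot, and at least one slow-motion replay. The sketch below applies these three checks to a break that has already been segmented into classified shots. The `Shot` record and its string labels are hypothetical; only the three conditions come from the text.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    """Hypothetical shot record produced by a shot classifier."""
    kind: str                 # "long", "medium", "close-up", or "out-of-field"
    slow_motion: bool = False


def looks_like_goal(break_seconds, shots):
    """Apply the three cinematic checks to the shots of one break."""
    if not 30 <= break_seconds <= 120:
        return False
    has_reaction = any(s.kind in ("close-up", "out-of-field") for s in shots)
    has_replay = any(s.slow_motion for s in shots)
    return has_reaction and has_replay
```

Because only cinematic features are used, this kind of check needs no ball or player detection, which matches the framework's goal of computing object-based features only when necessary.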
7. Conclusions

Many researchers have focused on soccer-related problems, providing different solutions. Some of them use multiple cameras and offer accurate analysis; others provide solutions based only on broadcast images. Semantic event detection has become the main focus of recent works because its analysis can produce high-level results. For instance, known patterns may be matched and teams' strategies may be recognized. Hence, video processing, computer vision, and machine learning together represent a powerful way to apply technology in the soccer entertainment field, in order to create and improve accurate systems that help referees during decision-making moments and soccer fans during the matches.
Acknowledgements The authors would like to thank the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) and the Post-Graduate Programs in Computer Science of Universidade Federal de Minas Gerais (UFMG) and Universidade Federal de Ouro Preto (UFOP).
References
[1] T. J. Atherton and D. J. Kerbyson. Size invariant circle detection. Image and Vision Computing, 17(11):795–803, 1999.
[2] M. Beetz, N. von Hoyningen-Huene, B. Kirchlechner, S. Gedikli, F. Siles, M. Durus, and M. Lames. Aspogamo: Automated sports game analysis models. International Journal of Computer Science in Sport, 8(1):1–21, 2009.
[3] J. Y. Chen, Y. H. Li, S. Lao, L. D. Wu, and L. Bai. Structuring soccer video based on audio classification and segmentation using hidden markov model. In International Conference on Image and Video Retrieval (CIVR’04), pages 2073–2075, 2004.
[4] S.-C. Chen, M.-L. Shyu, M. Chen, and C. Zhang. A decision tree-based multimodal data mining framework for soccer goal detection. In IEEE International Conference on Multimedia and Expo (ICME’04), pages 265–268, 2004.
[5] C. H. Chuan. Automatic parsing of tv soccer programs. In IEEE International Conference on Multimedia Computing and Systems (ICMCS’95), pages 167–174, 1995.
[6] A. Dearden, Y. Demiris, and O. Grau. Tracking football player movement from a single moving camera using particle filters. In European Conference on Visual Media Production (CVMP’2006), pages 29–37, 2006.
[7] T. D’Orazio, M. Leo, P. Spagnolo, P. L. Mazzeo, N. Mosca, M. Nitti, and A. Distante. An investigation into the feasibility of real-time soccer offside detection from a multiple camera system. IEEE Transactions on Circuits and Systems for Video Technology, 19(12):1804–1818, 2009.
[8] T. D’Orazio, M. Leo, P. Spagnolo, M. Nitti, N. Mosca, and A. Distante. A visual system for real time detection of goal events during soccer matches. Computer Vision and Image Understanding, 113(5):622–632, 2009.
[9] A. Ekin, A. M. Tekalp, and R. Mehrotra. Automatic soccer video analysis and summarization. IEEE Transactions on Image Processing, 12(7):796–807, 2003.
[10] A. Fitzgibbon, M. Pilu, and R. B. Fisher. Direct least square fitting of ellipses. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(5):476–480, 1999.
[11] N. J. Gordon, D. J. Salmond, and A. F. M. Smith. Novel approach to nonlinear/non-gaussian bayesian state estimation. IEE Proceedings F Radar and Signal Processing, 140(2):107–113, 2002.
[12] S. H. Khatoonabadi and M. Rahmati. Automatic soccer players tracking in goal scenes by camera motion elimination. Image and Vision Computing, 27(4):469–479, 2009.
[13] A. le Troter, S. Mavromatis, and J. Sequeira. Soccer field detection in video images using color and spatial coherence. In IEEE International Conference on Image Analysis and Recognition (ICIAR’04), volume II, pages 265–272, 2004.
[14] R. Leonardi, P. Migliorati, and M. Prandini. Semantic indexing of soccer audio-visual sequences: A multimodal approach based on controlled markov chains. IEEE Transactions on Circuits and Systems for Video Technology, 14(5):634–643, 2004.
[15] Y. Li, G. Liu, and X. Qian. Ball and field line detection for placed kick refinement. In WRI Global Congress on Intelligent Systems (GCIS’09), pages 404–407, 2009.
[16] M. Luo, Y.-F. Ma, and H.-J. Zhang. Pyramidwise structuring for soccer highlight extraction. In IEEE Pacific-Rim Conference on Multimedia, pages 945–949, 2003.
[17] J. Miura, T. Shimawaki, T. Sakiyama, and Y. Shirai. Ball route estimation under heavy occlusion in broadcast soccer video. Computer Vision and Image Understanding, 113(5):653–662, 2009.
[18] Y. Seo, S. Choi, H. Kim, and K.-S. Hong. Where are the ball and players? soccer game analysis with color based tracking and image mosaick. In IEEE International Conference on Image Analysis and Processing (ICIAP’97), volume II, pages 196–203, 1997.
[19] M. J. Swain and D. H. Ballard. Color indexing. International Journal of Computer Vision, 7(1):11–32, 1991.
[20] K. Wan, X. Yan, X. Yu, and C. X. Real-time goal-mouth detection in mpeg soccer video. In ACM International Conference on Multimedia (MULTIMEDIA’03), pages 311–314, 2003.
[21] F. Wang, L. Sun, B. Yang, and S. Yang. Fast arc detection algorithm for play field registration in soccer video mining. In IEEE International Conference on Systems, Man, and Cybernetics (SMC’06), pages 4932–4936, 2006.
[22] J. R. Wang and N. Parameswaran. Survey of sports video analysis: Research issues and applications. In Workshop on Visual Information Processing (VIP’05), pages 87–90, 2004.
[23] H.-S. Yoon, Y.-L. J. Bae, and Y.-K. Yang. A soccer image sequence mosaicking and analysis method using line and advertisement board detection. ETRI Journal, 24(6):443–454, 2002.
[24] X. Yu, H. W. Leong, C. Xu, and Q. Tian. A robust hough-based algorithm for partial ellipse detection in broadcast soccer video. In IEEE International Conference on Multimedia and Expo (ICME’03), pages 1555–1558, 2004.
[25] X. Yu, Q. Tian, and K. W. Wan. A novel ball detection framework for real soccer video. In IEEE International Conference on Multimedia and Expo (ICME’03), volume II, pages 265–268, 2003.
[26] X. Yu, C. Xu, H. W. Leong, Q. Tian, Q. Tang, and K. W. Wan. Trajectory-based ball detection and tracking with applications to semantic analysis of broadcast soccer video. In ACM International Conference on Multimedia (MULTIMEDIA’03), pages 11–20, 2003.
[27] X. Yu, C. Xu, Q. Tian, and H. W. Leong. A ball tracking framework for broadcast soccer video. In IEEE International Conference on Multimedia and Expo (ICME’03), volume 2, pages 273–276, 2003.
[28] X. Yu, C. Xu, Q. Tian, X. Yan, K. W. Wan, and Z. Jiang. Estimation of the ball size in broadcast soccer video using salient objects. In IEEE Pacific-Rim Conference on Multimedia, volume 2, pages 930–934, 2003.