CoCharNet: Extracting Social Networks using Character Co-occurrence from Movies Quang Dieu Tran (Department of Computer Engineering, Yeungnam University Gyeongsan, Korea 712-749
[email protected]) Jason J. Jung1 (Department of Computer Engineering, Chung-Ang University Seoul, Korea 156-756
[email protected])
Abstract: Discovering contents and stories in the movies is one of the most important concept in research studies. To efficiently consider discovering relationships among character, we focus on the appearance of each character during movie play and analysis characters’ relationships to build a CoCharNet - a social network to discover characters relationships in the movie. In this paper, we propose the innovative method which extract and analyze characters relationships’ based on social network measurement and evaluation. Key Words: CoCharNet, character identification, social network, movie analysis Category: H.1.1, H.3.5, I.2.11
1
Introduction
Nowadays, the number of movies produced annually has been increased significantly. Understanding the message behind movies has becomes complex and challenging. Hence, discovering content and stories from movie has becomes one of important research fields in computer science and many technologies and approaches were proposed. Any particular movie has its own hierarchy and tell stories under the director ideas. The story of the movie usually contain in itself some main parts of movie theory, these some central concepts that prove immediately portrays useful information and story. In drama theory, the concepts of the movie are presented and they reveal the practical side of the theory and provide a solid foundation for more in-depth exploration to come. When the audience watches a movie, he/she usually analyze characters and the relationships among them. Archetypal exits as a form of storytelling shorthand. Characters in the movie known as “Main Character(s)” or “Protagonist(s)” - the heroes or who have main purpose or a chief proponent and principal driver of the effort to achieve the story’s goal, “Antagonists” - characters who 1
Correcsponding author
opposite of the hero and try to stop the mission of the hero, “Reason” - characters who are calm, cold collected and “Emotions” characters who are frenetic, disorganized, and driven by feeling, “Sidekicks” - characters who support protagonist and “Skeptic” - characters who are opposite to Sidekicks, “Guardian” - characters who strongly support protagonist and “Contagonists” - characters who are opposite to Guardian [Phillips and Huntley 2001]. One of the movie normally contain all types of characters and the problem is how to know who belongs to each groups.
Protagonist Protagonist
Antagonist
Guardian
Contagonist
Antagonist
Reason
Emotion
Sidekick
Sidekick
Skeptic
Guardian
Contagonist
Reason
Emotion Skeptic
(a)
(b)
Figure 1: The relationship of characters in the movie theory. (a). The relation between archetype character. (b). The driver quad of archetype character
It is very trivial to start thinking about the main character (known as “protagonist” or “hero”). A main character is the player who makes the audience experiments the story of the movie first hand and the minor characters, who support the main character to tell the story line. Extracting characters to analyze relationships among them is one of the methods to retrieve useful information from the movie. Fig. 1 illustrates the relationships belong to characters in the movie and their roles in the story. Identifying this fact is the most important work in movie analysis. Many approaches are used to attempt proposing in the movie analysis. These approaches are based on recommendation system [Jung 2012], social network analysis [Park et al. 2012, Jung et al. 2013, Weng et al. 2009]. Scrutinize to movie analysis, we can start by discovering contents or abstract of the movie to understand this concept, many approaches are proposed to retrieve useful information from the movie and numerous systems have been implemented such as text/script mining, image processing and video clustering and segmentation
[Adams et al. 2002, Rasheed et al. 2005]. In the content-based concepts, extracting and retrieving information based on three low-level features: visual, audio and textual features. These techniques are used to extract and retrieve information by several schemes, e.g., visual-based, audio-based, text-based or hybridbased ones. However, they are mainly based on feature detection and recognition approaches [Fabro and B¨ osz¨ormenyi 2013]. Content-based usually use to detect and recognize objects or feature in the movie or video. Recently, story-based approach has become a new concept of analysis knowledge in the movie, especially in considering characters story and retrieving information from them. The relationships among characters and scenes, characters to characters are extracted and discovered to understand the movie. Social network analysis should be applied to measure and discover story in the movie. The social network theory has been widely applied to many research fields such as recommendation system [Jung 2012], video segmentation and discovering [Ding and Yilmaz 2010], movie analysis [Rasheed et al. 2005], and characters and relationships among characters analysis [Weng et al. 2009, Park et al. 2012]. In the drama or movie theory [Phillips and Huntley 2001], the characters have important role. The relationships among the characters and the appearance of the characters tell the story of the movie. In this research, we propose a method to extract social network by considering the appearance of the characters in the movie. In this research, we propose a method to extract a social network by considering the appearance of the characters during movie play. The proposed method provide the concept for analysis and retrieval of useful information in the movie such as the appearance of the characters, relationships among characters, strength of the relationships among characters, characters classification and the importance of the characters in the movie using social network analysis and evaluation. Our proposed method uses the appearance and co-occurrence of the characters in the movie to extract the social network namely CoCharNet. Based on the appearance of the characters and relationships among them, we classify the importance of characters into certain classes to understand the movie. CoCharNet is a method based on story-based analysis and uses social network analysis and evaluation to retrieve useful information in the movie. Our objective is to develop a system that can evaluate the importance of the character in the movie. The contributions of our research are as follow: – Extract CoCharNet - the social network based on co-occurrence of the characters and relationships among them – Classification the characters to evaluate the importance of characters in the movie by using CoCharNet The paper is organized as follow. Recent research and related works will be describe in the Sect. 2. Main definitions and methods for constructing CoChar-
Net will be described in Sect. 3. The evaluation and discussion of the proposed method will be described in Sect. 4. Finally, in Sect. 5, the conclusion of this work and future works will be described.
2
Related work
As previously mentioned, movie research fields are complex and challenging. Many concepts and approaches were used to discover and retrieve useful information in the movie. Traditional approaches are used featuring object recognition, known as content-based method was proposed. These approaches fields based on low-level features: visual features, audio features and textual features. The features can be extracted and analyzed by segmentation. Various researches based on scenes segmentation and scenes detection methods. Almost scenes segmentation approaches start by detection features in the frame at the initial stage. A shot within the movie may contains some useful information are needed to discover. A method attempt survey of key-frame extraction algorithms in detection given by [Truong and Venkatesh 2007]. One other approaches are video segmentation, that is only considered to visual features which are usually used in the movie. RoleNet [Weng et al. 2009] is a method to extract the roles in the movies. This method uses face recognition strategy to detect the co-occurrence of the characters on the scenes. If two characters co-occurrence in the scenes, a relationship between them is created and the weight of this relationship is measured. Social network analysis technology is used to analyze and discover the story of characters in the movie. RoleNet automatically classifies major roles, supporting roles and identify characters belong to its. RoleNet is a weighted graph with the node illustrates the character in the movie, the edges instance the relationships among characters and the weights are the number of characters’ co-occurrence in the frames. [Park et al. 2012] is proposed Character-net, an approach to extract social network by considering to dialogs among characters in the movie. Dialogs in the movie contain time, speaker and listener. These features could extract from the subtitles and the speaker of the characters by text mining and speech recognition. Character-net automatically classifies characters into several roles: major roles, minor roles and extra roles. Character-net is directed graph with the nodes representing characters and the edges representing relationships among characters, the weights and the directions illustrate the direction of the dialogs in the movie. The idea of this work is to form movie analysis and social network analysis, which are members in the research field in social science [Jung et al. 2013]. In social science, interactions between elements and their relations are modeled as a complex network, and the techniques are designed to discover and understand
hidden knowledge of social networks which are not directly perceived or measured (e.g., [Jung et al. 2013, Weng et al. 2009]). In this paper, we extract social network from the movie using character annotation, and computational strength of relationship between characters in the movies to discover hidden knowledge on movies. We attempt to extract social networks, identify characters in the movie using social science analysis and social network evaluation strategy. By discovering information of characters in movie, we describe the basic concept of social networks based on characters annotation, the methodology is to build social networks from them and how to discover main characters and their relationships.
3
CoCharNet and CoCharNet analysis
In this paper, we use an annotation strategy to extract CoCharNet. Fig. 2 illustrates an example of annotation segment; – (a) When the Character A appears in the scene, this character will be annotated. – (b) When Character B appears in the scene, this character will be annotated and relationship between Character A and Character B will be measured. – (c) When Character A disappears in the scene, the annotation segment of this character is stopped. – (d) When Character B disappears in the scene, the annotation segment of this character is stopped.
Character A Character B
Character A
(b)
Character B
(d)
(a)
(c)
200ms
800 ms
1.6s
2.4s
3.0s
3.6s
4.2s
4.8s
Figure 2: An example of annotation segment
5.4s
6.0s
3.1
CoCharNet
Definition 1 (CoCharNet). The CoCharNet is weighted graph describes as G = hC, Ri where C = {c1 , c2 , ..., cn } is represented as a set of characters in a movie, R = {rij | if character ci and character cj have relationships}, is represented a set of relationships between the characters.
C2
C1
Character 2 (C2)
Character 1 (C1)
300 ms
900 ms
1.50s s
2.10s
2.60s
3.20s
3.70s
Character 1 Character 2 Character n
Figure 3: Annotation characters in the movie and the construction of relationships between characters
Fig. 3 illustrates how to create a node and edge in the CoCharNet, C1 is Character 1 and C2 is Character 2 appear together at a certain time. When characters appear together in the certain time, CoCharNet update nodes and edges of the network by creating new or updated exits. All nodes and edges are calculated from characters annotation. If in the characters appear together, CoCharNet adds a new node and/or edge in case where network does not have any same node or edge, updated node and/or edge in the case node and/or edge exist. To extract CoCharNet, we annotate the appearance and co-occurrence of the characters in the movie. CoCharNet annotate characters when they appear on the scene. As shown on the Fig. 4, when a character appear on the scene,
CoCharNet stores and calculates the frequency and density which the characters appeared in the movie. The appearance of the characters is considered during movie play. We also aim how to classify the relationship among characters. In this research, the relationships among characters are discovered when the characters appeared at the same time or related-scenes. If the characters are appeared at the same time, we assume that there is a strong relationship between them. Therefore, we quantify character’s relationships as the number of appearance on the scenes.
C1
C2
C1
(a)
C2 C2
C3
C3
C4 (b)
C4
(c)
Figure 4: CoCharNet construction. (a)The case have two characters appear together; (b)The case have three character appear together; (c)CoCharNet
In the CoCharNet, social network from movies was extracted based on characters annotation, relationships among characters and strength of relationships among characters. In the graph of CoCharNet, nodes represent the characters, edges represent relationships among characters and width of edges represent how much the relationship strong is. Suppose that a movie has m minutes length and n characters that are annotated, R could be measured by the W matrix that is represented the weight of relationships among characters. Definition 2 (Relationships). R is related matrix measured by: R = [aij ]m×n th wij , if the i character have a relationship where the elements aij = with a j th character as weigt wij 0, otherwise. Definition 3 (Annotation Matrix). An annotation matrix A is measured by A = hτs , τe , Ct i
(1)
where τs : the start time on the movie that characters are appeared, τe : the end time on the movie that characters are appeared and Ct is a subset of characters C, Ct = {ct1 , ct2 , . . .} are characters that belong to time τe − τs . A movie τ is a set of τi , where τi is each milliseconds of the movie. Each τi contain appearing of characters at this time. The appearance of the character in the time interval of playing will be recorded as the annotation matrix A in Definition 2 above. The R matrix denotes the relationships among characters. Using this cooccurrence matrix, we can identify and measure the weight of the relationship between ith and j th character. 3.2
CoCharNet discovering social networks
CoCharNet uses annotation data of character to extract a social network. Based on the appearance of the characters in the movie. We can wonder the appearance of characters in the movie in several approaching, such as appear together, co-occurrence frequency, character appearance, etc. In this paper, we use the appearing of the characters during movie play to measure, evaluate and analysis social network. Two methods are used in this research describe below. 3.2.1
Total time of character appearance
We measure the weight of the relationship between characters based on total time of co-occurrence in the movie. Using a annotation matrix A, the weight of relationship between ith character and j th character is measured as bellow: aij =
n X
(τk ci cj | (ci , cj ∈ C; {ci , cj } ⊆ Ct ; k = 0..n)) .
(2)
k=1
where τk ci cj = 1 at the time that character ci is appearing together with a character cj measure by milliseconds; ci , cj is character in C, , n are total length of the movie as milliseconds.. aij then should be normalized to [0..1] by device to M AX(aij ), finally we have weight matrix W that is represented for a total time of characters co-occurrence in the movie. aij W = . (3) M AX(aij ) 3.2.2
The number of co-occurrence between characters
Using annotation matrix A, we can calculate the weight of relationship between ith character and j th character based on the number of co-occurrence is measured by count how many time characters are appearing together as follow:
bij =
n Y
(τe − τs | (ci , cj ∈ C; {ci , cj } ⊆ Ct ; τk ⊆ τe − τs ; k = 0..n)) .
(4)
k=1
where τs is the time that characters ci , cj are start to appearing in the scenes τe is time that character ci , cj are disappeared, n are total length of the movie as milliseconds. We also normalized W 0 as below: W0 =
bij . M AX(bij )
(5)
W, W 0 should be used as weight matrix in CoCharNet, Its have same role to measure the strength of relationships among characters in the movie. We also can measure the total time of the character appear in the movie as follow: T (c) =
X
(τk ci | (ci ∈ C}) , k = 1..n) .
(6)
where τk ci is time k when characters appear in the movie, n are total length of the movie as milliseconds. Based on the total time of the character appearance in the movie, we can measure the Score of appearance as follows: S(c) =
T (ci ) . M AX(T (ci ))
(7)
The score of appearance then uses to a part of formula when we identify the protagonist of the movie.
Table 1: Information of the annotated movies
ID
Movie title
Length
A1 Episode I: The Phantom Menace 136 A2 Episode II: Attack of the Clones 142 A3 Episode III: Revenge of the Sith 140 A4 Episode VI: A New Hope 121 A5 Episode V: The Empire Strikes Back 124 A6 Episode VI: Return of the Jedi 134
mins mins mins mins mins mins
Number of Release character date 10 10 11 11 10 12
1999 2002 2005 1977 1980 1983
3.3
CoCharNet evaluation and analysis
The most importance of movie analysis is to determine the most important characters in the movie. After calculating the relationship matrix W, W 0 , we evaluate the CoCharNet using Betweenness Centrality, Closeness Centrality and Weighted Average. The centrality values determine how character importance is. Based on centrality values of the characters, we can decide what characters are important in the movie. Betweeness Centrality of a node c is given by the expression: B(c) =
X ξst (c) . ξij )
(8)
s]c]t
where ξst (c) is the number of shortest paths from node s to node t that pass through c and ξst is the total number of shortest paths from node s to node t. We also known that the Betweenness Centrality of a node scales with the number of pairs of nodes as implied by the summation indices, so the calculation may be rescaled by dividing through by the number of pairs of nodes not including c, therefore B ∈ [0, 1]. The division is done by: (N − 1)(N − 2) . 2
(9)
where N is the number of nodes in the network. Note that this scales for the highest possible value, where one node is crossed by every single shortest path. A normalization of Betweenness Centrality can be performed: normal(B(c)) =
B(c) − min(B) . max(B) − min(B)
(10)
which results in: max(normal) = 1 and min(normal) = 0. Closeness Centrality is based on the length of the average shortest path between a vertex and all vertices in the graph is given by the expression: 1 . s ξcs
C(c) = P
(11)
where c is the node to evaluate, s is another node in the network, and ξcs is the shortest distance between these two nodes c and s. Weighted Degree of the social network could be measure by follow: X Ψ (c) = wij (12) j∈π(i)
where wij is strength of relationship between ith character and j th character, π(i) is the neighborhood of node i.
Table 2: The Centrality measure from the Star Wars Episode V. The Empire Strikes Back Total time of characters appear together ID
Character Name
Closeness
Betweeness Weighted
1 2 3 4 5 6 7 8 9 10
Luke Skywalker 1.0000 0.1481 Han Solo 0.9000 0.0556 R2-D2 0.9000 0.1019 Princess Leia 0.8182 0.0046 Landro Clrissian 0.8182 0.0046 C-3PO 0.8182 0.0046 Chewbacca 0.8182 0.0046 Darth Vader 0.7500 0.0000 Ben Kenobi 0.6429 0.0093 Yoda 0.6000 0.0000 Number of co-occurrence
ID
Character Name
Closeness
Betweeness Weighted
1 2 3 4 5 6 7 8 9 10
Luke Skywalker Han Solo R2-D2 Princess Leia Landro Clrissian C-3PO Chewbacca Darth Vader Ben Kenobi Yoda
1.0000 0.9000 0.9000 0.8182 0.8182 0.8182 0.8182 0.7500 0.6429 0.6000
0.1481 0.0556 0.1019 0.0046 0.0046 0.0046 0.0046 0.0000 0.0093 0.0000
0.5153 0.9243 0.2673 1.0000 0.3459 0.6510 0.7494 0.1442 0.0156 0.1860
0.5281 0.8663 0.3742 1.0000 0.4247 0.7337 0.8966 0.1685 0.1022 0.2180
In the movie, characters are important factors to story analysis story and content. It is simple to consider of the character in the movie as a “main character” or a “protagonist” and “minor character”. Main character is the player through whom the audience experiences the story first hand and minor character support to main character in the movie story [Phillips and Huntley 2001]. CoCharNet applies the Betweenness Centrality and Closeness Centrality and Weighted Degree measurement to evaluate the importance of characters by using equations Eq. 7, Eq. 10. and Eq. 11. Results of evaluation session are used to measure the importance of characters in the movies and classify them to difference communities: Main Characters and Minor Characters. A Main Characters
Figure 5: The Centrality of the characters in the movie The Empire Strikes Back. (a)Closeness Centrality and Weighted Degree; (b)Betweeness Centrality (a) 1.2
Appearing Closeness Co-Occurence Closeness
1
(b) 0.2
Appearing Weighted Degree Co-Occurence Weighted Degree
0.8 0.6
0.1
0.4
0.05
0.2
bi
r
n
ia
ss
r de
no
Ke
da
Yo
n
Be
ke
al
ri Cl
Va
ro
rth
nd
Da
La
w
a cc
ba
O
lo
So
ew
3P
C-
Ch
n
Ha
y Sk
2 -D
ke
bi
r
n
ia
ss
r de
no
Ke
da
R2
Lu
0
Yo
n
Be
ri Cl
Va
ro
rth
Da
O
nd
La
ia
Le
ke
al
w
a cc
ba
ew
3P
C-
Ch
lo
So
ss ce
in
Pr
n
Ha
y Sk
2 -D
ke
R2
Lu
0
Appearing Betweeness Co-Occurrence Betweeness
0.15
class contains characters that are most important in the movie. A Minor Characters class contains characters that support main character but appearance and co-occurrence with among of characters in the movie with threshold value. By using social network evaluation technique, we can classify characters in to class as below: – Main Characters: Which character has value of evaluation is bigger than average value. – Minor Characters: Which character has value of evaluation is smaller than average value. The average value of Betweenness Centrality, Closeness Centrality and Weighted Degree could be measured as below: Average of Betweenness Centrality P (Bi (c)) . (13) AverageBC = n Average of Closeness Centrality AverageCC
P (Ci (c)) = . n
(14)
Average of Weighted Degree P AverageΨ C =
(Ψi (c)) . n
(15)
where n is number of characters are used to annotated in the movie. Using average threshold to classify character in the certain communities should help us to understand the movie. Classification of Main Characters class
and Minor Characters class over the average value help us to understand the importance of character in the movie. To identify the main character in the movie, we use average of the Score of appearance and Closeness Centrality, Betweenness Centrality and Weighted Degree as follows: ImportanceCi = S(ci ) ∗ B(ci ) ∗ C(ci ) ∗
Ψ (ci ) . M AX(Ψ (ci ))
(16)
where ci is character i; S(ci ) is the Score of appearance value; B(ci ) is Betweenness Centrality value; C(ci ) is the Closeness Centrality, Ψi is the Weighed Degree in the movie and B(ci ), C(ci ), Ψ (ci ) 6= 0
4
Experimental results
We selected 6 movies from the Star Wars series: Star Wars series: Episode I, II, III, IV, V, VI to extracting the CoCharNet. Among the characters in these movies were annotated by time appeared and disappeared during movie play. To extract the CoCharNet, we must perform of technique such as centrality measurement, co-occurrence measurement and social network analysis and representation. We extract a CoCharNet by social network technique with co-occurrence and total time appearing together during movie play, analyzed it using character classification and social network evaluation strategy. The environment of CoCharNet system was implemented in Java with Vlcj2 and Gephi API3 . CoCharNet is a social network extracted from annotation of the character in the movie with the nodes are represented characters, the edges are represented the relationships among characters and the thickness of the edge is represented the strength of the relationship is. List of movies that we use to extract CoCharNet are listed in Table 1. More than 60 characters were annotated with more than 750 minutes of movie play. We annotated 100% time of movie play the appearance, appearing together and the number of co-occurrence of the character in the movie. Therefore, it is more comprehensive and deeply than RoleNet and Character-net analysis. Table 2 shows the results of Closeness Centrality, Betweenness Centrality and Weighted Degree measured by CoCharNet from Star Wars Episode V. The Empire Strikes Back. These results are also illustrated by Fig. 5. We can see in this result, the most importance characters always have high value of measurement. Fig. 6 illustrates the CoCharNet Annotation and Social Network Extraction system, this system could help us to extract and visualize network of characters in the movie. During movie play, we can see network and know what is the most important character in the movie, how many relate relationships and how 2 3
Caprica. http://www.capricasoftware.co.uk/projects/vlcj/index.html Gephi API. https://gephi.org/docs/api/
Figure 6: CoCharNet Annotation and Social Network Extraction System
characters importance are. We also can update social network easily by making new annotate session. The centrality values are used to classify character in the movie into certain class: Main Characters Class with the centrality value is bigger than Average values measured by Eq. 12, Eq. 13 and Eq. 14. Results of the characters classification show in Table 3, Table 4 and Table 5, these results are compared with the classification of IMDb database [IMDB 2014]. With IMDb database, 5 first characters are listed in the stars list of casts and crews belong to Main Characters class and other characters are belong to a Minor Character class. As shown in Table 3, Table 4 and Table 5, results of classification of 6 movies measured and evaluated by social network strategy. Characters are classified to Main Characters class and Minor Character Class and compare with the real data from IMDb database. Difference to RoleNet and Character-net, our method is more confident because that in our proposed method we annotate all appearance and disappearance of characters in the movie during play. By doing this, we could consider all time which the character has appeared in the scene, an thus provides a better analysis of movie contents and relationships among characters. Fig. 7 illustrates the precision and recall of proposed method. When using Closeness Centrality, Main characters class is accomplished with a precision of 74.16% and a recall of 71.89% on average. Minor characters class achieve a
Table 3: Classes classification by Closeness Centrality ID A1
A2
A3
A4
A5
A6
Method M1. Total time appearing together M2. Number of co-occurrence IMDB M1. Total time appearing together M2. Number of co-occurrence IMDB M1. Total time appearing together M2. Number of co-occurrence IMDB M1. Total time appearing together M2. Number of co-occurrence IMDB M1. Total time appearing together M2. Number of co-occurrence IMDB M1. Total time appearing together M1. Number of co-occurrence IMDB
Main character(s) Minor character(s) 1,2,3,4,5,6 1,2,3,7,4,5,6 2,3,4,5,7,8 1,2,3,4,5,6 1,2,3,4,5 1,2,3,7,8 1,2,3,4 1,2,3,4 1,2,5,6,8 1,2,3,4,5 1,3,4,5 1,2,3,4,6 1,2,3,4,5,6,7 1,2,3,4,5,6,7 1,3,4,5,6 1,2,3,4,5,6 1,2,3,4,5,6 1,2,3,4,5
7,8,9,10 7,8,9,10 1,6,9,10 7,8,9,10 6,7,8,9,10 4,5,6,9,10 5,6,7,8,9,10,11 5,6,7,8,9,10,11 3,4,7,9,10,11 6,7,8,9 2,6,7,8,9 7,8,9 8,9,10 8,9,10 7,8,9,10,2 8,9,10,11,12 7,8,9,10,11,12 6,7,8,9,10,11,12
precision of 67.5% and a recall of 71.32% on average (i.e., Fig. 7a and Fig. 7b). When using Weighted Degree measurement and evaluation, Main characters class accomplished use Closeness Centrality with a precision of 70% and a recall of 78% on average. Minor characters class achieve a precision of 81% and a recall of 69% on average (i.e., Fig. 7c and Fig. 7d). But when using Betweenness Centrality measurement strategy, Main characters class accomplished a precision of 40.28% and a recall of 79.86% on average. The Minor characters class achieves a precision of 89.72% and a recall of 56.62% on average, because that this method consider the number of co-occurrence of the characters only, and in some action scenes, characters appear, disappear, co-occurrence very frequently (i.e., Fig. 7e and Fig. 7f). In the movie story, it is very useful if we could know who is a hero or main character. Our proposed method could measure the total time of appearing of the character in the movie and combine with Closeness Centrality, Betweenness Centrality and Weighted Degree measurement and evaluation, we can evaluate the protagonist (known as the hero or main character) in the movie as shown in Table 6. Results of Protagonist evaluation and measurement is similar to classification in IMDb database [IMDB 2014]. Use this evaluation we could make
Table 4: Classes classification by Betweenness Centrality ID A1
A2
A3
A4
A5
A6
Method M1. Total time appearing together M2. Number of co-occurrence IMDB M1. Total time appearing together M2. Number of co-occurrence IMDB M1. Total time appearing together M2. Number of co-occurrence IMDB M1. Total time appearing together M2. Number of co-occurrence IMDB M1. Total time appearing together M1. Total time appearing together IMDB M1. Total time appearing together M1. Total time appearing together IMDB
Main character(s) Minor character(s) 1,2,3 1,2,3,7 2,3,4,5,7,8 1 1 1,2,3,7,8 1,2,3 1,2,4 1,2,5,6,8 1,2 1 1,2,3,4,6 1,2 1,2,3 1,3,4,5,6 1,2,3,4 1,2,3,4 1,2,3,4,5
4,5,6,7,8,9,10 4,5,6,8,9,10 1,6,9,10 2,3,4,5,6,7,8,9,10 2,3,4,5,6,7,8,9,10 4,5,6,9,10 4,5,6,7,8,9,10,11 3,5,6,7,8,9,10,11 3,4,7,9,10,11 3,4,5,6,7,8,9 2,3,4,5,6,7,8,9 7,8,9 3,4,5,6,7,8,9,10 4,5,6,7,8,9,10 7,8,9,10,2 5,6,7,8,9,10,11,12 5,6,7,8,9,10,11,12 6,7,8,9,10,11,12
the movie story segmentation but we will explain more in the future plan if we wish to relationships between character and itself.
5
Discussion
Through experimental results, we can see that CoCharNet is a method to analysis the movie using social network measurement and evaluation strategy. We depict a CoCharNet as a social network and visualize as a undirected graph which the nodes are represented characters, the edges are represented relationships between characters and thickness of the edge is represented how strength of relationship is. In our proposed method, we use social network measurement and evaluation as a principle to classify characters into Main Characters class and Minor Characters class by using Closeness Centrality, Betweenness Centrality and Weighted Degree measurement and evaluation. Compared to RoleNet and Character-net. While RoleNet use face recognition to measure the co-occurrence of the characters in the movie, RoleNet classified characters into leader role and extra role to analysis the movie story. Character-net uses dialogs to measure and classify characters in to major role, minor roles and extra. But both of two pro-
Table 5: Classes classification by Weighted Degree ID
Method
A1
A2
A3
A4
A5
A6
Main character(s) Minor character(s)
M1. Total time appearing together M2. Number of co-occurrence IMDB M1. Total time appearing together M2. Number of co-occurrence IMDB M1. Total time appearing together M2. Number of co-occurrence IMDB M1. Total time appearing together M2. Number of co-occurrence IMDB M1. Total time appearing together M2. Total time appearing together IMDB M1. Total time appearing together M2. Total time appearing together IMDB
1,2,3,4,7 1,3,4,5,7 2,3,4,5,7,8 1,2,7 1,7,2,4,3 1,2,3,7,8 1,2,5,6 1,2,6,3 1,2,5,6,8 1,2,4 1,3,6,4,9 1,2,3,4,6 4,2,6,7,1 4,7,2,6 1,3,4,5,6 1,2,3,4,6 1,2,3,4,6 1,2,3,4,5
5,6,8,9,10 2,6,8,9,10 1,6,9,10 3,4,5,6,8,9,10 5,6,8,9,10 4,5,6,9,10 4,7,8,9,10,11 5,7,8,9,10,11 3,4,7,9,10,11 3,5,6,7,8,9 2,5,7,8,9 7,8,9 3,5,8,9,10 1,3,5,8,9,10 2,7,8,9,10 5,7,8,9,10,11,12 5,7,8,9,10,11,12 6,7,8,9,10,11,12
Table 6: The most important character measured by CoCharNet Episde I Episde II Episde III Episde IV Episde IV Episde VI Proposed Jinn In IMDb Jinn
Kenobi Kenobi
Kenobi Kenobi
Luke Luke
Luke Luke
Luke Luke
posed methods above could not identify the main character as the protagonist or the hero in the movie. Additionally, RoleNet and Character-net also could not measure all of the case in the movie such as who is main character, the case that the character appeared alone or in the case that is laking of dialogs between characters. We have compared the result of the proposed method to the IMDb database [IMDB 2014] and the result is similar to the IMDb ranking of the importance of the characters in the movie. We also found the main character (as the protagonist or the hero) in the movie. This result is matched with IMDb ranking (as shown in Table 3, Table 4, Table 5 and Table 6). Annotation is long time consuming. It is not easy to annotate correctly and this requires a significant among of time. Although, this work comprehends all of the appearance of the characters in the movie, but this proposed method have
Figure 7: Precision and Recall evaluation. (a)Main Character classification, (b) Minor Character classification using Closeness Centrality measurement; (c)Main Character classification, (d)Minor Character classification using Weighted Degree measurement; (e)Main Character classification, (f)Minor Character classification using Betweenness Centrality measurement (a)
(b)
1.2
1.2
1
1
0.8
0.8
0.6
0.6
0.4
0.4
0.2
0.2
0
A1
A2
A3
A4
A5
A6
0
A1
A2
(c) 1.2
1
1
0.8
0.8
0.6
0.6
0.4
0.4
0.2
0.2 A1
A2
A3
A4
A5
A6
0
A1
A2
(e) 1.2
1
1
0.8
0.8
0.6
0.6
0.4
0.4
0.2
0.2 A1
A2
A3
A5
A6
A3
A4
A5
A6
M1
M1
M1
(f)
1.2
0
A4
(d)
1.2
0
A3
A4
A5
A6
0
A1
M1
M1
some limitation such as only consider to the undirected relationship between characters, or limitation of the number of characters use in annotation session.
Moreover, movie analysis is not only based on characters principle, study the scene, story segmentation and clustering is needed. As shown in Fig. 1, [Phillips and Huntley 2001] describes the importance of movie analysis is how to understand and identify characters, character role, character relationship, etc. This problem will be considered in the future work. We have plans to improve CoCharNet performances and efficiencies by examining to analyze the movie using movie story segmentation and analysis.
6
Conclusion
Movie analysis is a widely research topic and has many interesting concepts and challenges. Recently, to understanding the movie and discover useful information from the movie has becomes a hot research topic. In this paper, we proposed CoCharNet - a method to extract social network in the movie using characters annotation, measured by social network analysis and evaluation technology. Based on CoCharNet, we should discover the story of the movie through character appearance and relationships among characters. Our proposed method provides a new concept to discover useful information in the movie such as the characters’ importance, character relationships and understanding the movie’s story. A CoCharNet discovers character relationship by annotation and analyze them using social network measurement and evaluation. The important characters and character relationships are measured by Betweenness Centrality, Closeness Centrality and Weighted Degree strategy. Through these measurement results, we classify characters to certain classes as a Main Characters class, a Minor Character class. We developed a CoCharNet system to annotate, visualization, measure and evaluate our proposed method by comparing with real data - IMDb’s list of casts and crews. We extracted CoCharNet from the Star Wars series with 6 movies, more than 60 characters and more than 750 minutes of movie play. We can see in the experiment session the performance and efficiency of the proposed method. Despite having a better results of movie analysis, our proposed method still have some limitation and weaknesses. It still has some incorrect classification of characters. These limitations will be considered in the future work that that will be more efficiently and performance. The future works will consider scenes and movie segmentation, through which, discovering and understanding the movie story will be easier and more efficient. We have plans to consider the time stamp of subtitles and speech recognition to understanding the Subjective Throughline and The Character Throghline in the movie. For example, we can discover and analyze the relationships between Protagonist and Antagonist, Reason and Emotions, Sidekicks and Septic, Guardian and Contagonists in the movie. We will improve performance and efficiency of CoCharNet through deeper and wider
analysis compared to Character-net and Rolenet. By attempting these, we examine CoCharNet to analyze more semantic (as meaning, features or relations) from the movie.
Acknowledgment. This work was supported by the ICT R&D program of MSIP/IITP [10035348, Development of a Cognitive Planning and Learning Model for Mobile Platforms]. Also, this work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. NRF2014R1A2A2A05007154).
References [Adams et al. 2002] Adams, B., Dorai, C., Venkatesh, S.: “Toward automatic extraction and expression of expressive elements from motion pictures”; Multimedia, IEEE Trans., Vol. 4 , Iss. 4, (2002) 472 - 481. [Albrecht 2009] Albrecht, K.: “Social Intelligence: The New Science of Success”. New York: Wiley, ISBN: 978-0-470-44434-4 (Jan 2009). [Cour et al. 2007] Cour, T., Jordan, C., Miltsakaki, E., Taskar, B.: “Movie/script: Alignment and parsing of video and text transcription”; Proc. ECCV ’08 Proc. of the 10th Euro. Conf. on Comp. Vis.: Part IV, Springer-Verlag Berlin (2008) 158 - 171. [Chen et al. 2004] Chen, H. W., Kuo, J. H., Chu, W. T., Wu, J. L.: “Action movies segmentation and summarization based on tempo analysis; Proc. ACM SIGMM Int. Workshop on Multimedia Info. Retrieval, (2004) 251258. [Chen et al. 2009] Chen, J., Zaiane, O., Goebel, R.: “Detecting communities in social networks using max-min modularity”; Proc. SDM, (2009) 978-989. [Ding and Yilmaz 2010] Ding, L., Yilmaz, A.: “Learning Relations Among Movie Characters: A Social Networks Perspective”; Proc. ECCV’10 Proc. of the 11th European Conf. on Comp. Vis.: Part IV, Springer-Verlag Berlin, Heidelberg (2010) 410-423. [Fabro and B¨ osz¨ ormenyi 2013] Fabro, B. M., B¨ osz¨ ormenyi, L.: “State-of-the-Art and Future Challenges in Video Scene Detection: a Survey”; Multimedia Sys., Vol. 19, Iss. 5, (Oct 2013) 427-454. [Garg et al. 2008] Garg, N. P., Favre, S., Salamin, H., Hakkani, T. D., Vinciarelli, A.: “Role recognition for meeting participants: an approach based on lexical information and social network analysis”; Proc. MM ’08 Proc. of the 16th ACM Int’l Conf. on Multimedia, New York, NY, USA, (2008) 693-696. [Hung et al. 2007] Hung, H,. Jayagopi, D., Yeo, C., Friedland, G., Ba, S., Odobez, J.M., Ramchandran, K., Mirghafori, N. and Perez, D. G.: “Using Audio and Video Features to Classify the Most Dominant Person in a Group Meeting”; Proc. MULTIMEDIA ’07 Proc. of the 15th Int. Conf. on Multimedia, ACM New York, NY, USA (2007) 835-838. [IMDB 2014] “IMDb - Movie, TV and Celebrities website”; http://imdb.com/, (visited Oct 13, 2014). [Jung 2012] Jung, J. J.: “Attribute selection-based recommendation framework for short-head user group: An empirical study by MovieLens and IMDB”; Exp. Sys. with Appl. Vol. 39, Iss. 4, (Mar 2012) 40494054.
[Jung et al. 2013] Jung, J. J., You, E. S., Park, S. B.: “Emotion-based character clustering for managing story-based contents: a cinemetric analysis”; Multimedia Tools and Appl., Vol. 65, Iss. 1, (Jul 2013) 29-45. [Li et al. 2006] Li, Y., Lee, S. H., Yeh, C. H., Kuo, C. C. J. “Techniques for movie content analysis and skimming: Tutorial and overview on video abstraction techniques”; IEEE Signal Process. Mag., Vol. 23, No. 2, (2006) 7989. [Li et al. 2004] Li, Y., Narayanan, S., Kuo, C.C.J.: “Content-based movie analysis and indexing based on audiovisual cues; Circuits and Systems for Video Technology, IEEE Trans., Vol. 14 , Iss. 8, (2004) 1073 - 1085. [Lienhart et al. 1997] Lienhart, R., Pfeiffer, S., Effelsberg, W.: “Video abstracting”; Mag. Comm. of the ACM, Vol. 40 Iss. 12, ACM New York, NY, USA (Dec 1997) 54-62. [May et al. 2006] May, P., Ehrlich, H. C., Steinke, T.: “ZIB Structure Prediction Pipeline: Composing a Complex Biological Work flow through Web Services”; EuroPar 2006 Parallel Proc. Lect. Notes in Comp. Sci, Vol. 4128, Springer (2006) 11481158. [Otsuka et al. 2005] Otsuka, I., Nakane, K., Divakaran, A., Hatanaka, K., Ogawa, M.: “A highlight scene detection and video summarization system using audio feature for a personal video recorder”; Cons. Electron., 2005. ICCE. 2005 Digest of Technical Papers. Int Conf. on (2005) 223 - 224. [Park et al. 2012] Park, S. B., Oh, K. J., Jo, G. S.: “Social network analysis in a movie using character-net”; Multimedia Tools and Appl., Vol. 59, Iss. 2, (Jul 2012) 601-627. [Phillips and Huntley 2001] Phillips, M. A., Huntley, C.: “Dramatica: A New Theory of Story”; Screen Systems Incorporated, Fourth Edition, (2001). [Poppe et al. 2009] Poppe, C., Martens, G., Mannens, E., de Walle, R. V.: “Personal content management system: a semantic approach”; Vis. Comm. and Img. Rept. archive, Vol. 20, Iss. 2 , Academic Press, Inc. Orlando, FL, USA (Feb 2009) 131144. [Rasheed et al. 2005] Rasheed, Z., Sheikh, Y., Shahm, M.: “On the use of computable features for film classification”; Circuits and Syst. for Video Tech., IEEE Trans., Vol. 15 , Iss. 1, (2005) 52-64. [Rienks et al. 2006] Rienks, R., Zhang, D., Post, W.: “Detection and application of influence rankings in small group meetings”; Proc. ICMI ’06 Proc. of the 8th Int. Conf. on Multimedia interfaces, ACM New York, NY, USA (2006) 257 - 264. [Ronfard and Thuong 2003] Ronfard, R., Thuong, T. T.: “A framework for aligning and indexing movies with their script”; Proc. IEEE Int. Conf. Multimedia and Expo (ICME 2003), (2003) 2124. [Truong and Venkatesh 2007] Truong, B. T., Venkatesh, S.: “Video abstraction: A systematic review and classification”. ACM Trans. on Multimedia Comp., Comm., and Appl. (TOMM), Vol. 3, Iss. 1, Article No. 3 (Feb 2007 ). [Weng et al. 2009] Weng, C. Y., Chu, W. T., Wu, J. L.: “RoleNet: Movie Analysis from Perspective of Social Networks”; IEEE Trans. on Multimedia, Vol. 11, No.2 (Feb 2009).