Decomposing Large Networks: An Approach Based on the MCA based Community Detection
Carlo Drago
University of Rome “Niccolo Cusano”
49th Scientific Meeting of the Italian Statistical Society, University of Palermo, 20-22 June 2018
Outline
- Research problem
- Methodology
- Simulation study
- Application on real data
- Conclusions and Directions for Future Research
The Representation of the Communities on a Network
Communities
Communities are groups of nodes that tend to be strongly connected to each other and only loosely connected to nodes of other communities (Fortunato 2010). They are the building blocks of a network: a network can be described through the communities identified inside it. Identifying the community structure is important for detecting groups of nodes that belong to the same functional structure of the network. The first step is therefore to identify the communities in the network and then to represent them.
Aims
The aim is to represent the communities and to identify the core of the community structure, that is, to retain the most relevant part of the entire structure.
Identification of the Communities
Many methodologies exist for detecting communities (Khan Niazi 2017), and their performance differs (Mahmoud et al 2016; Leskovec et al 2010). Different algorithms can be biased towards different network structures, so the results obtained with different community detection algorithms have to be compared. It is therefore useful to consider approaches that take an ensemble of algorithms into account in order to synthesize their results (Drago 2017 -2).
Robust Community Structure
We obtain a robust community structure via Multiple Correspondence Analysis (MCA) and validate it using the Rand Index.
Consensus Community Detection Algorithm
Following Drago 2017 (2):
- We start from the network adjacency matrix.
- We consider an ensemble of community detection algorithms and obtain a different partition from each.
- We collect the results in the Consensus Matrix.
- We apply Multiple Correspondence Analysis (Le Roux, Rouanet 2009) to the Consensus Matrix.
- We perform a Hierarchical Cluster Analysis on the first two dimensions (the most relevant), using the Euclidean distance and the Ward method (Härdle & Simar 2007).
- We use a dendrogram to explore the different partitions.
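As an illustration of the data structures involved, the sketch below builds a small Consensus Matrix (nodes in rows, methods in columns) and its one-hot indicator coding, which is the usual input to an MCA. It is a minimal Python sketch under stated assumptions, not the author's R code: the function names, the method names used as dictionary keys and the toy labels are all hypothetical, and the MCA step itself is not shown.

```python
from typing import Dict, List, Tuple

def consensus_matrix(partitions: Dict[str, List[int]]) -> List[List[int]]:
    """Rows are nodes, columns are methods; each cell is a community label."""
    methods = list(partitions)
    n_nodes = len(partitions[methods[0]])
    if any(len(partitions[m]) != n_nodes for m in methods):
        raise ValueError("every partition must label the same nodes")
    return [[partitions[m][i] for m in methods] for i in range(n_nodes)]

def indicator_matrix(
    partitions: Dict[str, List[int]]
) -> Tuple[List[Tuple[str, int]], List[List[int]]]:
    """One-hot coding: one 0/1 column per (method, community label) pair."""
    columns = [(m, lab) for m in partitions for lab in sorted(set(partitions[m]))]
    n_nodes = len(next(iter(partitions.values())))
    rows = [
        [1 if partitions[m][i] == lab else 0 for (m, lab) in columns]
        for i in range(n_nodes)
    ]
    return columns, rows

# Hypothetical labels for six nodes from three methods (toy data).
partitions = {
    "walktrap": [1, 1, 1, 2, 2, 2],
    "fastgreedy": [1, 1, 2, 2, 2, 2],
    "infomap": [1, 1, 1, 2, 2, 3],
}
C = consensus_matrix(partitions)
cols, Z = indicator_matrix(partitions)
print(C[2])       # labels of the third node under the three methods
print(len(cols))  # number of indicator columns
```

Each row of the indicator matrix sums to the number of methods, because every node belongs to exactly one community per method.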
The Algorithm
We use several well-known community detection methods:
- Edge betweenness (Girvan Newman 2002)
- Walktrap (Pons Latapy 2005)
- Fast greedy (Clauset Newman Moore 2004)
- Spinglass (Sathik Rasheed)
- Leading eigenvector (Newman 2006)
- Infomap (Rosvall Axelsson and Bergstrom 2009)
- Label propagation (Raghavan Albert Kumara)
- Blockmodeling as a tool in community detection (Zhao Levina and Zhu 2011; Karrer Newman 2011)
The Algorithm
- The Consensus Matrix is useful for comparing the partitions.
- The factor map of the methods is useful for identifying the different patterns among community detection methods.
- A relevant problem is to identify the stable communities that can be detected by using an ensemble of methods.
- The factor map of the nodes allows the different communities (groups of nodes) to be identified.
The Algorithm
- Finally, the clusters are obtained by performing a cluster analysis with an appropriate distance (the Euclidean distance) and the Ward method.
- The number of clusters is chosen so as to explore different partitions (the approach followed is exploratory).
- The final result is a set of clusters that represent the stable communities.
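The clustering step can be sketched as follows. This is a naive pure-Python version of Ward agglomeration on two-dimensional coordinates, written for illustration only: the coordinates stand in for node scores on the first two MCA dimensions and are made up, and in practice a library routine (for example `hclust` in R) would be used instead.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def centroid(points: List[Point]) -> Point:
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def ward_cost(a: List[Point], b: List[Point]) -> float:
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    d2 = (ca[0] - cb[0]) ** 2 + (ca[1] - cb[1]) ** 2  # squared Euclidean distance
    return len(a) * len(b) / (len(a) + len(b)) * d2

def ward_clusters(points: List[Point], k: int) -> List[int]:
    """Merge the cheapest pair repeatedly until k clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost(
                    [points[t] for t in clusters[i]],
                    [points[t] for t in clusters[j]],
                )
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for t in members:
            labels[t] = label
    return labels

# Hypothetical node coordinates on the first two factorial axes.
coords = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (2.0, 2.0), (2.1, 2.0), (5.0, 0.0)]
labels = ward_clusters(coords, 3)
print(labels)
```

Calling `ward_clusters` with different values of `k` corresponds to cutting the dendrogram at different heights, which matches the exploratory use described above.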
The Algorithm
Figure: The algorithm
The Algorithm
Each axis can be interpreted as a pattern of community memberships across the methods. This information is also important because it allows the results obtained by the different algorithms to be compared. The number of classes for the hierarchical cluster analysis (HCA) is taken as the number of communities extracted by the majority of the algorithms, and the dendrogram is cut accordingly to detect the final communities. We can also analyse the results obtained with different numbers of communities (and the stable structures we are able to identify). Finally, we compare the memberships of the communities obtained by the procedure with those of the single algorithms using the Rand Index (see Drago and Balzanella 2015).
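The two quantities used here can be sketched in a few lines of Python. The Rand Index follows its standard pair-counting definition (Rand 1971); reading "the majority of the numbers of communities" as the most frequent number of communities is an assumption, and the function names are hypothetical. The example labels are the first 16 node labels of the rows "me" and "me8" of the Consensus Matrix shown in the application, so the printed values describe that truncated excerpt only.

```python
from collections import Counter
from itertools import combinations
from typing import List, Sequence

def rand_index(a: Sequence[int], b: Sequence[int]) -> float:
    """Share of node pairs on which two partitions agree."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("partitions must label the same nodes (at least two)")
    pairs = list(combinations(range(len(a)), 2))
    # A pair agrees when both partitions put it together, or both split it.
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

def majority_k(partitions: List[Sequence[int]]) -> int:
    """Most frequent number of communities across the methods."""
    counts = Counter(len(set(p)) for p in partitions)
    return counts.most_common(1)[0][0]

# First 16 node labels of two rows of the Consensus Matrix (truncated excerpt).
me = [1, 2, 3, 4, 4, 4, 1, 5, 5, 5, 4, 4, 6, 5, 5, 5]
me8 = [1, 2, 3, 4, 4, 4, 5, 4, 6, 6, 4, 4, 4, 6, 6, 6]
print(round(rand_index(me, me8), 3))
print(majority_k([me, me8]))
```

The index does not depend on how the communities are numbered, which is why partitions from different algorithms can be compared directly.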
Simulation Study
To examine the results of the algorithm on different network structures, we run simulation experiments on synthetic data. In particular, we generate networks of different typologies and apply the consensus approach to detect their communities. The software used is R, in particular the igraph package (Csardi Nepusz 2006).
Simulation Study
The network typologies considered include the Barabási model, the random graph model, etc. We then compare the results of the individual algorithms with the solution obtained by the consensus. The results show that the consensus gives a reasonable synthesis of the different algorithms and, in particular, makes the different communities visible. Here we show the results of the simulation on the Barabási model. The proposed algorithm yields six communities. The final partition shows a Rand index of 0.9-1 with the partitions obtained by the initial community detection methods. The approach is interesting because it identifies nodes whose membership changes from one community to another, for example when we switch to a lower number of communities (or to a different type of communities). Nodes with a higher centrality usually tend to be assigned to different communities by the various algorithms.
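For readers unfamiliar with the Barabási model, the sketch below generates a small preferential-attachment network in plain Python. It is a simplified illustration and not the igraph generator used in the study: the starting graph (a complete graph on m + 1 nodes), the parameter values and the function name are assumptions.

```python
import random
from typing import List, Tuple

def barabasi_albert(n: int, m: int, seed: int = 0) -> List[Tuple[int, int]]:
    """Grow a graph where each new node attaches to m existing nodes,
    chosen with probability proportional to their current degree."""
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    edges = set()
    pool: List[int] = []  # each node appears once per incident edge
    # Assumed starting point: a complete graph on the first m + 1 nodes.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            edges.add((i, j))
            pool += [i, j]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(pool))  # degree-proportional draw
        for target in sorted(chosen):
            edges.add((target, new))
            pool += [target, new]
    return sorted(edges)

edges = barabasi_albert(n=50, m=2)
print(len(edges))
```

The resulting edge list could then be passed to each community detection method in turn, giving one partition per method as input for the consensus step.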
Simulation Study
Figure: The algorithm
Application to Joint-Patent Networks
Applying the Method on Joint-Patent Networks
This application is based on Drago and Cucco (2013).
- To illustrate the application of CDTs to innovation networks, a joint patent application network was constructed starting from the three Italian branches of a leading electronics firm.
- We present the preliminary results of the analysis and describe the next steps.
Data
Source: OECD REGPAT database, which reports patent applications to the European Patent Office and applications filed under the Patent Cooperation Treaty.
Data Structure
- Originally two-mode data (applicant(s) - patent)
- Projected onto a one-mode applicant-applicant network
The Network
- All patent applications filed by the three branches of the firm after 1990 were extracted.
- A first node set was created by listing all the co-applicants (firms, universities, research institutes) reported on the extracted applications.
- 64 individual nodes were identified in this step using (a) harmonized names in the OECD Harmonized Applicants Names database; (b) manual checks for ambiguous cases.
- All patent applications filed by the identified actors were extracted.
- The process was repeated, resulting in a final node set of 1,703 nodes.
Edge Cuts
- All joint patent applications between the 1,703 nodes were extracted and transformed into a one-mode valued network (applicant-applicant).
- Edge weights are equal to the number of joint patent applications.
- To remove occasional collaborations, we applied an edge cut to the network, dropping edges with fewer than five collaborations.
- Isolates were removed, and the networks were binarized.
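The projection and the edge cut can be sketched as follows. This is an illustrative Python version on toy data, not the processing pipeline used on the REGPAT extract: the patent identifiers, applicant names and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]

def project_applicants(patents: Dict[str, List[str]]) -> Counter:
    """Two-mode (patent -> applicants) to one-mode: weight of a pair is
    the number of patent applications the two applicants filed jointly."""
    weights: Counter = Counter()
    for applicants in patents.values():
        for a, b in combinations(sorted(set(applicants)), 2):
            weights[(a, b)] += 1
    return weights

def edge_cut(weights: Counter, min_weight: int = 5) -> Tuple[Set[str], Set[Pair]]:
    """Keep pairs with at least min_weight joint applications, binarize
    the edges, and drop nodes left without any edge (isolates)."""
    edges = {pair for pair, w in weights.items() if w >= min_weight}
    nodes = {v for pair in edges for v in pair}
    return nodes, edges

# Toy data: A and B file five joint applications, the others collaborate once.
patents = {f"p{i}": ["A", "B"] for i in range(5)}
patents["q1"] = ["A", "C"]
patents["q2"] = ["B", "C", "D"]

weights = project_applicants(patents)
nodes, edges = edge_cut(weights, min_weight=5)
print(sorted(nodes), sorted(edges))
```

Only the A-B tie survives the cut in this toy example, so C and D disappear as isolates, which is the effect the edge cut is meant to have on occasional collaborations.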
Edge-Cut Network (at least 5 joint applications)
Network Descriptives
We start from the general features of the network, summarized by its descriptive statistics, and then explore the network to find the stable communities.

                   vertices  edges  density  diameter  centralization  degree  betweenness
Edge Cut Network        216    248     0.01    526.00            0.26    2.30       390.52
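Two of the reported descriptives can be checked directly from the vertex and edge counts. The sketch below assumes a simple undirected network and reads the "degree" column as the mean degree; both are assumptions, although the reported values are consistent with them.

```python
def density(n_vertices: int, n_edges: int) -> float:
    """Share of possible undirected edges that are present."""
    return 2 * n_edges / (n_vertices * (n_vertices - 1))

def mean_degree(n_vertices: int, n_edges: int) -> float:
    """Average number of edges per node in an undirected network."""
    return 2 * n_edges / n_vertices

d = density(216, 248)
k = mean_degree(216, 248)
print(round(d, 2), round(k, 2))  # 0.01 2.3
```

A density this low means the edge-cut network is very sparse, which is the setting in which community detection methods are most likely to disagree.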
Edge-Cut Network (at least 5 joint applications; 216 nodes) Figure: Edge-Cut Network
Edge-Cut Network: communities Figure: Edge Cut Network
Edge-Cut Network: Consensus Matrix
node   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  …
me     1   2   3   4   4   4   1   5   5   5   4   4   6   5   5   5  …
me2   13   3   8   6   6   6   2   6   5   5   6   6   1   5   5   5  …
me3    1   7   2   3   3   3   1   3   5   5   3   3   3   5   5   5  …
me5    1   2   9  14  14  14   1  10  10  10  14  14  13  10  10  10  …
me6   10  13   2   8   8   8  10   8  11  11   8   8   8  11  11  11  …
me7   16   6   3   1   1   1   5   1   2   2   1   1   8   2   2   2  …
me8    1   2   3   4   4   4   5   4   6   6   4   4   4   6   6   6  …
Edge-Cut Network: MCA, methods Figure: Edge-Cut Network: MCA, methods
Edge-Cut Network: MCA, nodes Figure: Edge Cut Network: MCA, nodes
Edge-Cut Network: dendrogram Figure: Edge Cut Network: dendrogram
Edge-Cut Network: stable communities Figure: Edge Cut Network: stable communities
Advantages in using the method
There are clear advantages in using ensembles when different methods produce different information about communities:
- Measure the persistence of the co-participation of some nodes in the same community
- Overcome the biases of each method
- Understand what each method concretely tells us
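One simple way to make "persistence of co-participation" concrete is the share of methods that place two nodes in the same community. The Python sketch below computes this co-membership frequency; it is a plain counting illustration and not the MCA-based procedure of the talk, and the function name and toy partitions are hypothetical.

```python
from typing import List, Sequence

def co_membership(partitions: List[Sequence[int]]) -> List[List[float]]:
    """Entry [i][j] is the share of methods that put nodes i and j
    in the same community (1.0 means they are always together)."""
    n = len(partitions[0])
    if any(len(p) != n for p in partitions):
        raise ValueError("every partition must label the same nodes")
    m = len(partitions)
    return [
        [sum(p[i] == p[j] for p in partitions) / m for j in range(n)]
        for i in range(n)
    ]

# Three hypothetical methods labelling the same three nodes.
toy = [[1, 1, 2], [1, 1, 1], [1, 2, 2]]
P = co_membership(toy)
print(P[0][1], P[0][2])
```

Pairs with a value close to 1 are those whose joint membership does not depend on the choice of algorithm.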
Representation of the communities
At this point the communities must be represented in a way that does not lose relevant information from the original data. The computed Rand Index tells us how well the resulting representation "captures" the initial results of the different community detection methods on the network.
From the Communities to their Representation
The entire community structure can be represented by treating each community as interval data (Drago 2017). Alternatively, the entire network can be considered as symbolic data (Giordano Brito 2014). In this way we obtain interval data for each community. The procedure starts by identifying the communities of a network with a community detection approach (Khan Niazi 2017); the interval data are then obtained from the community memberships, each community being built from the single nodes of the network. From the interval data it is then possible to measure the attributes that are relevant for representing the entire community. Each measure is related to structural characteristics or attributes of the nodes.
From the Communities to their Representation
So we can have the attributes or structural characteristics of the entire community in addition to those of its $n$ single nodes. The interval measure for the community is based on the measures of the nodes that are members of the community:
\[ X_a = (x_1, x_2, \ldots, x_n) \qquad (1) \]
where $x_1, \ldots, x_n$ are the measures of the nodes belonging to community $a$ (for instance the betweenness or the degree). The interval data for the single community is:
\[ X_{I,a} = [\underline{x}, \overline{x}] \qquad (2) \]
where $\overline{x}$ is the upper bound of the measure in the community and $\underline{x}$ the lower bound. At this point we can consider the descriptors of the different communities as intervals (Gioia Lauro 2005). In this way we can consider both the single observations and the communities, the latter through the intervals of their measures.
From the Communities to their Representation
The communities can be compared through their attributes (the upper and the lower bound), but also through their centers and radii. The center is
\[ X_{I,a}^{\mathrm{center}} = \frac{1}{2}(\underline{x} + \overline{x}) \qquad (3) \]
we can also consider the range between the upper and the lower bound
\[ X_{I,a}^{\mathrm{range}} = \overline{x} - \underline{x} \qquad (4) \]
and the radius
\[ X_{I,a}^{\mathrm{radius}} = \frac{1}{2}(\overline{x} - \underline{x}) \qquad (5) \]
These descriptors make it possible to take the different communities into account and to compare them.
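The interval descriptors can be computed per community as in the following Python sketch. It assumes that the lower and upper bounds are the minimum and maximum of the node measure within the community; the node degrees, community labels and function names are made up for the example, and the study itself relies on the RSDA package in R.

```python
from typing import Dict, List, Sequence

def interval_descriptors(values: Sequence[float]) -> Dict[str, float]:
    """Lower/upper bound, center, range and radius of one community's measure."""
    lower, upper = min(values), max(values)  # assumed: bounds are min and max
    return {
        "lower": lower,
        "upper": upper,
        "center": (lower + upper) / 2,
        "range": upper - lower,
        "radius": (upper - lower) / 2,
    }

def by_community(measure: Sequence[float], membership: Sequence[int]) -> Dict[int, Dict[str, float]]:
    """Group a node-level measure by community and describe each group."""
    groups: Dict[int, List[float]] = {}
    for value, community in zip(measure, membership):
        groups.setdefault(community, []).append(value)
    return {c: interval_descriptors(v) for c, v in groups.items()}

# Hypothetical degrees of six nodes and their community labels.
degree = [1, 3, 4, 2, 2, 6]
membership = [1, 1, 1, 2, 2, 2]
intervals = by_community(degree, membership)
print(intervals[1])
```

The radius is half the range, so a community whose nodes all share the same value has radius zero and is fully described by its center.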
Ranking the different Representations
Rankings
At this point the rankings of the representations have to be identified, which means explicitly considering the intervals and their attributes. Each interval is characterized by attributes such as its lower and upper bound. Starting from these descriptors it is possible to compare the attributes or structural indicators of each community. Following M'Ballo and Diday 2005 we consider the ranking of the intervals obtained. The comparison can be conducted on the different attributes of the intervals (the upper and the lower bounds, the range and the radii). One possible application of ranking the attributes or structural characteristics of the communities is to detect the centre of the network based on its communities. In this sense we are interested not in single nodes but in the communities as the starting point of the analysis.
Ranking the different Representations
Ranking of centrality
The ranking of centrality, for instance, is computed over the communities, and the communities selected by their structural characteristics make up the final network. The ranking can also be restricted to a given number of communities, with the aim of detecting the central part of the network for some relevant structural characteristics. In this way we obtain a stylized structure of the network made of the most relevant communities. The choice and its validation are performed by observing a graph that shows the changes in some indicators (betweenness and degree, for instance); in this case we consider the changes in the center values of each community. A radar plot (Noirhomme 2002) is a tool to analyze and compare the different measures in the ranking, so it can be used as a diagnostic tool in the choice. The final network structure is based on considering only these communities.
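A minimal way to picture the selection step is to sort the communities by the center of their interval and keep the first few. The Python sketch below does exactly that; it is a simplified stand-in for the interval ranking criterion cited in the talk, and the community names, interval values and function name are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

def rank_by_center(
    intervals: Dict[str, Tuple[float, float]], top: Optional[int] = None
) -> List[str]:
    """Order communities by interval center (highest first), optionally
    keeping only the first `top` communities."""
    ranked = sorted(
        intervals,
        key=lambda c: (intervals[c][0] + intervals[c][1]) / 2,
        reverse=True,
    )
    return ranked if top is None else ranked[:top]

# Hypothetical betweenness intervals (lower, upper) for three communities.
betweenness = {"c1": (0.0, 10.0), "c2": (20.0, 60.0), "c3": (5.0, 25.0)}
core = rank_by_center(betweenness, top=2)
print(core)
```

Plotting the sorted center values and looking for a sharp drop is one way to decide how many communities to keep, in the spirit of the diagnostic graph described above.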
Simulation Study and Application on Real Data
The proposed procedure can be evaluated on simulated networks. To test the algorithms we simulate networks of different typology and size (Barabási game, Erdős-Rényi and also Forest Fire; Csardi Nepusz 2006) and apply the approach to each of them. We show the community structure by detecting the communities with the MCA-based community detection procedure (Drago 2017). We then represent the communities as interval data, using the upper and the lower bound of each community as the two descriptors. Finally we also compute the center and the radius. The statistical methods applied to the community intervals follow Gioia Lauro 2005. The R package RSDA performs computations on interval data (Rodriguez 2017).
Simulation Study and Application on Real Data
We visualize the ranking of the communities with a radar plot, which shows each attribute and structural indicator of a community expressed as an interval. We can choose the number of communities by observing the change in the relevant center parameters across the communities (betweenness and degree in our case). We can thus visualize the most central communities as those ranked highest by betweenness and degree. The radar plot also shows the ranking with respect to the other structural characteristics represented as interval data for each community.
Simulation Study and Application on Real Data
Finally, by choosing the first-ranked communities we identify the stylized structure of the network, starting from its initial structure. We start from the entire structure and rank the communities by their attributes; we then select the first communities and obtain the most central communities of the network. As an application on real data we consider the network of the Zachary karate club (Zachary 1977). Here we can observe and select the most relevant part of the network by selecting the most central communities.
The community-core of the network
These communities identify the "core of the network" rather than other peripheral network structures.
Results Figure: Edge Cut Network: stable communities
Conclusions
The approach followed in this paper is to represent the communities as interval data and then rank them. The procedure determines the communities of the network and detects the most central representations by considering structural indicators such as the betweenness or the Freeman degree. It is important to emphasize that the analysis is community-based and robust, since it incorporates the results of many community detection algorithms. Other attributes of the communities can of course be considered, shifting the focus to different structural characteristics of the network.
References
A.V (2012) Evidence of Networking in the European Research Area. Project financed by the 6th Framework Programme for Research, for the implementation of the specific programme "Strengthening the Foundations of the European Research Area" (Invitation to tender n° DG RTD 2005 M 02 02).
Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E. (2008) Fast unfolding of communities in large networks. J. Stat. Mech. P10008.
Burger-Helmchen, T. (Ed.) (2013) The Economics of Creativity: Ideas, Firms and Markets (Vol. 60). Routledge.
Christ, J. (2009) The Geography and Co-Location of European Technology-Specific Co-Inventorship Networks. University of Hohenheim FZID Discussion Paper, (14-2010).
Csardi, G., Nepusz, T. (2006) The igraph software package for complex network research. InterJournal, Complex Systems 1695.
Danon, L., Diaz-Guilera, A., Duch, J., Arenas, A. (2005) Comparing community structure identification. J. Stat. Mech. P09008.
Drago, C. (2017) MCA Based Community Detection. In: Mola, F., Conversano, C., Vichi, M. (Eds.) Classification, (Big) Data Analysis and Statistical Learning. Studies in Classification, Data Analysis, and Knowledge Organization. Springer.
Drago, C., Balzanella, A. (2015) Nonmetric MDS Consensus Community Detection. In: Advances in Statistical Models for Data Analysis (pp. 97-105). Springer International Publishing.
Drago, C., Cucco, I. (2013) Robust Communities Detection in Joint-Patent Application Networks. XXXIII Sunbelt Social Networks Conference of the International Network for Social Network Analysis (INSNA), Hamburg, 05/2013.
Drago, C., Ricciuti, R. (2015) Bootstrapping the Gini Index of the Network Freeman Degree. Statistics and Demography: the Legacy of Corrado Gini, Corrado Crocetta Editor, ISBN 978 88 678, Treviso, September 2015 (also presented at the Italian Statistical Society SIS conference 2015).
Girvan, M., Newman, M.E. (2002) Proc. Natl. Acad. Sci. USA 99, 7821.
Fortunato, S. (2009) arXiv:0906.0612.
Fortunato, S. (2010) Community detection in graphs. Physics Reports, 486(3), 75-174.
Fortunato, S. (2013) Community structure in networks. Institute for Scientific Interchange Foundation.
Fortunato, S., Castellano, C. (2007) Community structure in graphs. arXiv preprint arXiv:0712.2716.
Härdle, W., Simar, L. (2007) Applied multivariate statistical analysis. Springer Verlag.
Karrer, B., Newman, M.E. (2011) Stochastic blockmodels and community structure in networks. Physical Review E, 83(1), 016107.
Kim, J., Wilhelm, T. (2008) What is a complex graph? Physica A: Statistical Mechanics and Its Applications, 387(11), 2637-2652. doi:10.1016/j.physa.2008.01.015
Lancichinetti, A., Fortunato, S. (2010) Community detection algorithms: A comparative analysis. Physical Review E, 80(5), 056117.
Lancichinetti, A., Fortunato, S. (2012) Consensus clustering in complex networks. Scientific Reports, 2.
Lancichinetti, A., Radicchi, F., Ramasco, J.J., Fortunato, S. (2011) Finding statistically significant communities in networks. PloS One 6, e18961.
Le Roux, B., Rouanet, H. (2009) Multiple correspondence analysis (Vol. 163). SAGE Publications, Incorporated.
Billard, L., Diday, E. (2006) Symbolic Data Analysis: Conceptual Statistics and Data Mining. England: Wiley & Sons Ltd.
Csardi, G., Nepusz, T. (2006) The igraph software package for complex network research. InterJournal, Complex Systems 1695. http://igraph.org
Drago, C. (2017) Identifying Meta Communities on Large Networks. SIS Italian Statistical Society 2017 Conference: Statistics and Data Science: New Challenges, New Generations.
Drago, C. -2 (2017) MCA Based Community Detection. In: Mola, F., Conversano, C., Vichi, M. (Eds.) Classification, (Big) Data Analysis and Statistical Learning. Studies in Classification, Data Analysis, and Knowledge Organization. Springer.
Duan, L., Binbasioglu, M. (2017) An ensemble framework for community detection. Journal of Industrial Information Integration, 5, 1-5.
Giordano, G., Brito, M.P. (2014) Social Networks as Symbolic Data. In: Vicari, D., Okada, A., Ragozini, G., Weihs, C. (Eds.) Analysis and Modeling of Complex Data in Behavioral and Social Sciences, 06/2014. Springer Series: Studies in Classification, Data Analysis, and Knowledge Organization.
Gioia, F., Lauro, C.N. (2005) Basic statistical methods for interval ...
Newman, M.E.J. (2006) Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23), 8577-8582.
Newman, M.E.J. (2013) Modularity, Community, Structure and Spectral Properties of Networks. Preprint physics 0602124 (PNAS in press).
Newman, M.E.J., Girvan, M. (2004) Finding and evaluating community structure in networks. Phys. Rev. E 69, 026113.
Noirhomme-Fraiture, M. (2002) Visualization of large data sets: the zoom star solution. International Electronic Journal of Symbolic Data Analysis, 26-39.
Rand, W.M. (1971) Objective criteria for the evaluation of clustering methods. J. Am. Stat. Assoc. 66(336), 846-850.
Richards, W., Macindoe, O. (2010, August) Decomposing social networks. In: Social Computing (SocialCom), 2010 IEEE Second International Conference on (pp. 114-119). IEEE.
Rodriguez, R.O. (2017), with contributions from Carlos Aguero, Olger Calderon, Roberto Zuniga and Jorge Arce. RSDA: R to Symbolic Data Analysis. R package version 2.0.2. https://CRAN.R-project.org/package=RSDA
Steinhaeuser, K., Chawla, N.V. (2010) Identifying and evaluating community structure in complex networks. Pattern Recognition Letters, 31(5), 413-421.
Strogatz, S.H. (2001) Exploring complex networks. Nature, 410(6825), 268-276.
Tang, L., Liu, H. (2010) Graph mining applications to social network analysis. In: Managing and Mining Graph Data (pp. 487-513). Springer US.
Treviño, S., Sun, Y., Cooper, T.F., Bassler, K.E. (2012) Robust Detection of Hierarchical Communities from Escherichia coli Gene Expression Data. PLoS Computational Biology, 8, e1002391.
van Dongen, S. (2000) Performance criteria for graph clustering and Markov cluster experiments. Technical Report INS-R0012, National Research Institute for Mathematics and Computer Science in the Netherlands, Amsterdam, May 2000.
Yang, J., Leskovec, J. (2015) Defining and evaluating network communities based on ground-truth. Knowledge and Information Systems, 42(1), 181-213.
Zachary, W.W. (1977) An information flow model for conflict and fission in small groups. Journal of Anthropological Research 33, 452-473.