Decomposing Large Networks: An Approach Based on the MCA based Community Detection

Carlo Drago

University of Rome “Niccolo Cusano”

49th Scientific Meeting of the Italian Statistical Society, University of Palermo, 20-22 June 2018

Outline

- Research problem
- Methodology
- Simulation study
- Application on real data
- Conclusions and directions for future research

The Representation of the Communities on a Network

Communities

Communities are groups of nodes that tend to be strongly connected to each other and only loosely connected to nodes of other communities (Fortunato 2010). Communities are the building blocks of a network: each network can be described through the communities identified inside it. Identifying the community structure matters because it reveals groups of nodes that may belong to the same functional structure of the network. The first step is to identify the communities in the network; the second is to represent them.

Aims

The aim is to represent the communities and to identify the core of the community structure, that is, to retain the most relevant part of the entire structure.

Identification of the Communities

Many methodologies exist for detecting communities (Khan Niazi 2017), and each method can perform differently (Mahmoud et al 2016; Leskovec et al 2010). Different algorithms have different biases on different network structures, so the results obtained with different community detection algorithms have to be compared. It is therefore useful to consider approaches that take an ensemble of algorithms into account in order to synthesize their results (Drago 2017 -2).

Robust Community Structure

We obtain a robust community structure via multiple correspondence analysis (MCA) and validate it using the Rand Index.

Consensus Community Detection Algorithm

Following Drago 2017 (2):

1. We start from the network adjacency matrix.
2. We consider an ensemble of different community detection algorithms and obtain different results.
3. We collect the results in the Consensus Matrix.
4. We apply Multiple Correspondence Analysis (Le Roux, Rouanet 2009) to the Consensus Matrix.
5. We perform a Hierarchical Cluster Analysis on the first two dimensions (the most relevant), using the Euclidean distance and the Ward method (Härdle & Simar 2007).
6. We use a dendrogram to explore the different partitions.
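The collection and coding of the partitions listed above can be sketched in a few lines. This is a minimal illustration and not the author's R code: the three membership vectors are invented toy labels, the function names `consensus_matrix` and `indicator_matrix` are hypothetical, and the nodes-by-methods orientation of the matrix is an assumption. The 0/1 coding built at the end is the complete disjunctive table on which a multiple correspondence analysis operates.

```python
# Toy membership vectors from three detection methods on six nodes
# (invented labels, used only to show the data layout).
memberships = {
    "method_a": [1, 1, 1, 2, 2, 2],
    "method_b": [1, 1, 2, 2, 3, 3],
    "method_c": [2, 2, 2, 1, 1, 1],
}


def consensus_matrix(memberships):
    """Return (methods, rows): one row per node, one column per method."""
    methods = sorted(memberships)
    n_nodes = len(memberships[methods[0]])
    rows = [[memberships[m][i] for m in methods] for i in range(n_nodes)]
    return methods, rows


def indicator_matrix(methods, rows):
    """Expand each (method, label) pair into its own 0/1 column."""
    categories = []
    for j, method in enumerate(methods):
        for label in sorted({row[j] for row in rows}):
            categories.append((j, method, label))
    z = [[1 if row[j] == label else 0 for (j, _, label) in categories]
         for row in rows]
    names = [(method, label) for (_, method, label) in categories]
    return names, z


methods, rows = consensus_matrix(memberships)
names, z = indicator_matrix(methods, rows)
for row in z:
    print(row)
```

Every row of the indicator matrix sums to the number of methods, because each node receives exactly one label per method.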

The Algorithm

We use several well-known community detection methods:

- Edge Betweenness community (Newman Girvan 2004)
- Walktrap community (Pons Latapy 2005)
- Fastgreedy community (Clauset Newman Moore 2004)
- Spinglass community (Sathik Rasheed)
- Leading Eigenvector community (Newman 2006)
- Infomap community (Rosvall Axelsson and Bergstrom 2009)
- Label Propagation (Raghavan Albert Kumara)
- Blockmodeling as a tool in community detection (Zhao Levina and Zhu 2011; Karrer Newman 2011)

- The Consensus Matrix is useful for comparing the partitions.
- The factor map of the methods helps to identify the different patterns among the community detection methods.
- A relevant problem is identifying the stable communities that can be detected by using an ensemble of different methods.
- The factor map of the nodes allows the different communities (groups of nodes) to be identified.

Finally, the clusters are obtained by performing a cluster analysis with the Ward method on an appropriate distance (the Euclidean distance). The number of clusters is chosen so as to explore different partitions (the approach is exploratory). The final result is a set of clusters that represent the stable communities.
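A small sketch may help to make the Ward step concrete. It assumes that each node already has two coordinates (standing in for the first two MCA dimensions) and repeatedly merges the pair of clusters whose union gives the smallest increase in the within-cluster sum of squares, which is the Ward criterion on Euclidean distances. The coordinates and the function name `ward_clusters` are hypothetical; a real analysis would rely on an established hierarchical clustering routine rather than this naive loop.

```python
def ward_clusters(points, k):
    """Agglomerate points until k clusters remain using Ward's merge cost."""
    clusters = [[i] for i in range(len(points))]
    dim = len(points[0])

    def centroid(members):
        return [sum(points[i][d] for i in members) / len(members)
                for d in range(dim)]

    def ward_cost(a, b):
        # Increase in within-cluster sum of squares caused by merging a and b.
        ca, cb = centroid(a), centroid(b)
        squared = sum((u - v) ** 2 for u, v in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * squared

    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost(clusters[i], clusters[j])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical coordinates of six nodes on two factorial axes.
coords = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05),
          (2.0, 2.1), (2.1, 2.0), (5.0, 0.0)]
print(ward_clusters(coords, 3))
```

Calling the function with several values of `k` mirrors the exploratory use of the dendrogram: each value corresponds to one cut.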

Figure: The algorithm

Each axis can be interpreted as a pattern of community memberships across the methods. This information is also important because it allows the results of the different algorithms to be compared. The number of classes for the hierarchical cluster analysis (HCA) is set to the number of communities extracted by the majority of the algorithms, and the dendrogram is cut accordingly to detect the final communities. We can also analyse the results obtained with different numbers of communities (and the stable structures we are able to identify). Finally, we compare the memberships of the resulting communities with those of the individual methods using the Rand Index (see Drago and Balzanella 2015).
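The two computations named above, the majority number of communities and the Rand Index, are short enough to sketch directly. The partitions are invented toy data and the function names are hypothetical; the Rand Index is computed from its pair-counting definition (the share of node pairs on which two partitions agree), without any chance correction.

```python
from collections import Counter
from itertools import combinations


def majority_community_count(partitions):
    """Most frequent number of communities across the partitions."""
    counts = [len(set(labels)) for labels in partitions]
    return Counter(counts).most_common(1)[0][0]


def rand_index(a, b):
    """Share of node pairs treated the same way by partitions a and b."""
    agree = 0
    pairs = 0
    for i, j in combinations(range(len(a)), 2):
        same_in_a = a[i] == a[j]
        same_in_b = b[i] == b[j]
        if same_in_a == same_in_b:
            agree += 1
        pairs += 1
    return agree / pairs


# Toy partitions of six nodes produced by three hypothetical methods.
partitions = [
    [1, 1, 2, 2, 3, 3],
    [1, 1, 1, 2, 2, 2],
    [5, 5, 7, 7, 9, 9],
]
k = majority_community_count(partitions)
print(k, rand_index(partitions[0], partitions[2]))
```

The first and third partitions differ only in their labels, so their Rand Index is 1: the index depends on co-membership, not on the label values.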

Simulation Study

To examine the results of the algorithm on different network structures, we run simulation experiments on synthetic data. In particular, we generate different network typologies and apply the consensus approach to detect the communities. The software used is R, in particular the igraph package (Csardi Nepusz 2006).

The network typologies include the Barabasi model, the random graph model, etc. We then compare the results obtained by the individual algorithms with the solution obtained by the consensus. The results show that the consensus gives a reasonable synthesis of the different algorithms and, in particular, that it makes the different communities visible. Here we show the results of the simulation on the Barabasi model. The proposed algorithm returns six communities. The final partition shows a Rand index of 0.9-1 with the partitions obtained by the initial community detection methods. The approach is interesting because it identifies nodes whose membership varies from one community to another, for example when we switch to a lower number of communities (or to a different type of community). Nodes with a higher centrality usually tend to be assigned to different communities by the various algorithms.
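For readers unfamiliar with the Barabasi model, the sketch below grows a network by preferential attachment: each new node links to `m` existing nodes chosen with probability proportional to their degree. It is an illustrative stand-in and not the igraph generator used in the study; the function name, the complete starting core and the parameter values are all assumptions.

```python
import random


def preferential_attachment_graph(n, m, seed=0):
    """Grow an undirected graph on n nodes, m new edges per added node."""
    rng = random.Random(seed)
    edges = set()
    # Start from a complete core on m + 1 nodes so every node has degree >= m.
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            edges.add((u, v))
    # Each node appears in `stubs` once per incident edge, so a uniform draw
    # from this list is a draw proportional to degree.
    stubs = [node for edge in sorted(edges) for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in sorted(targets):
            edges.add((t, new))
            stubs.extend((t, new))
    return sorted(edges)


edges = preferential_attachment_graph(n=50, m=2, seed=1)
print(len(edges))
```

A graph generated this way can then be passed to each community detection method, and the resulting partitions fed to the consensus procedure.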

Figure: The algorithm

Applying the Method on Joint-Patent Networks

This application is based on Drago Cucco (2013). To illustrate the application of community detection techniques (CDTs) to innovation networks, a joint patent application network was constructed starting from the three Italian branches of a leading electronics firm. We present the preliminary results of the analysis and describe the next steps.

Data

Source: OECD Regpat database, which reports patent applications to the European Patent Office and applications filed under the Patent Cooperation Treaty.

Data Structure

- Originally two-mode data (applicant(s) - patent)
- Projected onto a one-mode applicant-applicant network
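The projection from two-mode to one-mode data can be illustrated as follows. The patent identifiers and applicant names are invented, and `project_to_applicants` is a hypothetical helper: two applicants are linked whenever they appear on the same application, and the edge weight counts how many applications they share.

```python
from collections import Counter
from itertools import combinations

# Invented two-mode data: patent id -> applicants listed on the application.
patents = {
    "P1": ["FirmA", "UnivX"],
    "P2": ["FirmA", "UnivX", "LabY"],
    "P3": ["FirmA"],
    "P4": ["LabY", "UnivX"],
}


def project_to_applicants(patents):
    """Count joint applications for every unordered pair of applicants."""
    weights = Counter()
    for applicants in patents.values():
        for u, v in combinations(sorted(set(applicants)), 2):
            weights[(u, v)] += 1
    return weights


weights = project_to_applicants(patents)
for pair, w in sorted(weights.items()):
    print(pair, w)
```

Single-applicant patents such as `P3` contribute no edge, which is why only joint applications matter for the resulting network.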

The Network

1. All patent applications filed by the three branches of the firm after 1990 were extracted.
2. A first node set was created by listing all the co-applicants (firms, universities, research institutes) reported on the extracted applications.
3. 64 individual nodes were identified in this step using (a) harmonized names in the OECD Harmonized Applicants Names database and (b) manual checks for ambiguous cases.
4. All patent applications filed by the identified actors were extracted.
5. The process was repeated, resulting in a final node set of 1,703 nodes.

Edge Cuts

- All joint patent applications between the 1,703 nodes were extracted and transformed into a one-mode valued network (applicant-applicant).
- Edge weights equal the number of joint patent applications.
- To remove occasional collaborations, we applied an edge cut to the network, dropping edges with fewer than five collaborations.
- Isolates were removed, and the networks were binarized.
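The edge cut, the removal of isolates and the binarization can be sketched together. The weights below are invented and `edge_cut` is a hypothetical helper; the threshold of five follows the text, where an edge survives only with at least five joint applications.

```python
def edge_cut(weights, threshold=5):
    """Keep edges with weight >= threshold, drop weights and isolated nodes."""
    kept_edges = {edge for edge, w in weights.items() if w >= threshold}
    # Nodes that lost all their edges never enter this set, so isolates
    # are removed implicitly.
    kept_nodes = {node for edge in kept_edges for node in edge}
    return kept_nodes, kept_edges


# Invented valued network: (applicant, applicant) -> joint applications.
toy_weights = {
    ("A", "B"): 7,
    ("A", "C"): 1,
    ("B", "D"): 5,
    ("C", "E"): 4,
}
nodes, edges = edge_cut(toy_weights)
print(sorted(nodes), sorted(edges))
```

In this toy case the nodes `C` and `E` disappear, because every edge they had falls below the threshold.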

Edge-Cut Network (at least 5 joint applications)

Network Descriptives

We start from the general features of the network, given by its descriptive statistics, and then explore the network to find the stable communities.

                   vertices  edges  density  diameter  centralization  degree  betweenness
Edge Cut Network   216       248    0.01     526.00    0.26            2.30    390.52
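As a consistency check on the table, the density and the degree can be recomputed from the number of vertices and edges. The calculation assumes an undirected network without loops or multiple edges, which seems reasonable for a binarized co-applicant network but is not stated explicitly, and it reads the "degree" column as a mean degree.

```latex
\[
\text{density} = \frac{2m}{n(n-1)} = \frac{2 \cdot 248}{216 \cdot 215}
               = \frac{496}{46440} \approx 0.0107,
\qquad
\text{mean degree} = \frac{2m}{n} = \frac{496}{216} \approx 2.30
\]
```

Both values agree with the reported 0.01 and 2.30 after rounding.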

Edge-Cut Network (at least 5 joint applications; 216 nodes)

Figure: Edge-Cut Network

Edge-Cut Network: communities

Figure: Edge-Cut Network

Edge-Cut Network: Consensus Matrix

node  1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  …
me    1   2   3   4   4   4   1   5   5   5   4   4   6   5   5   5   …
me2   13  3   8   6   6   6   2   6   5   5   6   6   1   5   5   5   …
me3   1   7   2   3   3   3   1   3   5   5   3   3   3   5   5   5   …
me5   1   2   9   14  14  14  1   10  10  10  14  14  13  10  10  10  …
me6   10  13  2   8   8   8   10  8   11  11  8   8   8   11  11  11  …
me7   16  6   3   1   1   1   5   1   2   2   1   1   8   2   2   2   …
me8   1   2   3   4   4   4   5   4   6   6   4   4   4   6   6   6   …

Edge-Cut Network: MCA, methods

Figure: Edge-Cut Network: MCA, methods

Edge-Cut Network: MCA, nodes

Figure: Edge-Cut Network: MCA, nodes

Edge-Cut Network: dendrogram

Figure: Edge-Cut Network: dendrogram

Edge-Cut Network: stable communities

Figure: Edge-Cut Network: stable communities

Advantages in using the method

There are clear advantages in using ensembles when different methods produce different information about communities:

- Measure the persistence of the co-participation of some nodes in the same community
- Overcome the biases of each method
- Understand what each method concretely tells us

Representation of the communities

At this point the communities have to be represented in a way that does not lose relevant information from the original data. The computed Rand Index tells us how well the resulting representation "captures" the initial results of the different community detection methods on the network.

From the Communities to their Representation

The entire community structure, that is, each community, can be represented as interval data (Drago 2017). Alternatively, the entire network can be considered as symbolic data (Giordano Brito 2014). In this way we obtain interval data for each community. The procedure has three steps:

1. The communities of the network are identified with a community detection approach (Khan Niazi 2017).
2. The interval data are obtained from the community memberships. Each community is based on the single nodes of the network.
3. From the interval data, the attributes that are relevant for representing the entire community are measured. Each measure relates to structural characteristics or attributes of the nodes.

So we have the attributes or structural characteristics of the entire community in addition to those of its $n$ single nodes. The interval measure for a community is based on the measures of the nodes that are members of the community:

$$X_a = (x_1, x_2, \ldots, x_n) \tag{1}$$

where $x_1, \ldots, x_n$ are the measures for the nodes belonging to community $a$ (for instance the betweenness or the degree). The interval data for the single community is:

$$X_{I,a} = [\underline{x}, \overline{x}] \tag{2}$$

where $\overline{x}$ is the upper bound of the measure in the community and $\underline{x}$ is the lower bound. At this point we can consider the descriptors of the different communities as intervals (Gioia Lauro 2005). In this way we can consider both the single observations and the communities, the latter through the intervals of their measures.
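A short sketch shows how node-level measures turn into one interval per community. It assumes that the lower and upper bounds are the minimum and maximum of the measure among the members, which the text does not state explicitly; the degree values, the labels and the function name `community_intervals` are hypothetical.

```python
def community_intervals(values, membership):
    """Map each community label to (lower bound, upper bound) of a measure."""
    grouped = {}
    for value, community in zip(values, membership):
        grouped.setdefault(community, []).append(value)
    # Assumption: bounds are the smallest and largest member values.
    return {c: (min(xs), max(xs)) for c, xs in grouped.items()}


# Invented degree values for seven nodes and their community labels.
degree = [3, 5, 2, 9, 7, 1, 4]
membership = ["a", "a", "a", "b", "b", "c", "c"]
intervals = community_intervals(degree, membership)
print(intervals)
```

The same function can be applied to any other node-level measure, such as betweenness, to obtain a second interval descriptor for each community.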

The communities can be compared through their attributes (the lower and the upper bound), but also through their centers and radii. The center is

$$X_{I,a}^{\text{center}} = \frac{1}{2}\,(\underline{x} + \overline{x}) \tag{3}$$

We can also consider the range between the upper and the lower bound,

$$X_{I,a}^{\text{range}} = \overline{x} - \underline{x} \tag{4}$$

and the radius

$$X_{I,a}^{\text{radius}} = \frac{1}{2}\,(\overline{x} - \underline{x}) \tag{5}$$

These descriptors allow the different communities to be taken into account and compared.
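The three descriptors follow directly from the bounds of an interval. The sketch below uses the usual definitions from interval arithmetic (center as the midpoint, radius as half the range); the function name and the example interval are hypothetical.

```python
def interval_descriptors(lower, upper):
    """Center, range and radius of the interval [lower, upper]."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return {
        "center": (lower + upper) / 2,
        "range": upper - lower,
        "radius": (upper - lower) / 2,
    }


# Invented betweenness interval of one community.
descriptors = interval_descriptors(2.0, 10.0)
print(descriptors)
```

The interval can be recovered from the center and radius alone, since the bounds are the center minus and plus the radius.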

Ranking the different Representations

Rankings

At this point the rankings of the representations have to be identified, which means explicitly considering the intervals and their attributes. Each interval is characterized by attributes such as its lower bound and upper bound. Starting from these descriptors, the attributes or structural indicators of each community can be compared. Following M’Ballo and Diday 2005, we consider the ranking of the intervals obtained. The comparison can be made on the different attributes of the intervals (the upper and lower bounds, the range and the radii). One possible application of ranking the attributes or structural characteristics of the communities is to detect the centre of the network based on its communities. In this sense we are interested not in single nodes but in the communities as the starting point of the analysis.

Ranking of centrality

The ranking of centrality, for instance, is computed over the communities, and the communities selected on the basis of their structural characteristics make up the final network. The ranking can also be restricted to a number of communities, with the aim of detecting the central part of the network for some relevant structural characteristics. In this way we obtain a stylized structure of the network made of the most relevant communities. The choice and its validation are performed by inspecting a graph that shows the changes in some indicators (betweenness and degree, for instance); in this case we consider the changes in the center values of each community. A radar plot (Noirhomme 2002) is a tool for analysing and comparing the different measures in the ranking, so it can be used as a diagnostic tool in the choice. The final network structure is based on these communities only.
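One simple way to picture the selection is to rank the communities by the center of a centrality interval and keep the nodes of the top-ranked ones. Ordering by the center alone is an assumption made for this sketch and is not necessarily the interval ranking rule of the cited reference; the intervals, labels and function names are invented.

```python
def rank_by_center(intervals):
    """Community labels ordered by interval center, highest first."""
    return sorted(
        intervals,
        key=lambda c: (intervals[c][0] + intervals[c][1]) / 2,
        reverse=True,
    )


def core_nodes(membership, intervals, top_k):
    """Indices of the nodes that belong to the top_k ranked communities."""
    keep = set(rank_by_center(intervals)[:top_k])
    return [node for node, c in enumerate(membership) if c in keep]


# Invented betweenness intervals and node memberships.
intervals = {"a": (0.0, 4.0), "b": (10.0, 30.0), "c": (2.0, 12.0)}
membership = ["a", "b", "b", "c", "a", "c"]
print(rank_by_center(intervals))
print(core_nodes(membership, intervals, top_k=2))
```

Varying `top_k` and watching how the center values drop from one rank to the next corresponds to the diagnostic use of the indicator plot described above.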

Simulation Study and Application on Real Data

Different simulated networks can be used to evaluate the proposed procedure. To test the algorithms we consider various types of networks and apply the approach to each of them. In particular, we simulate networks of different typology and size (Barabasi Game, Erdos Renyi and also Forest Fire; Csardi Nepusz 2006) and then apply the approach. We show the community structure by detecting the communities with the MCA-based community detection procedure (Drago 2017). We then represent the communities as interval data, with two descriptors for each community: the lower and the upper bound. Finally, we also compute the center and the radius. The statistical methods applied to the community intervals are those of Gioia Lauro 2005. The R package RSDA performs computations on interval data (Rodriguez 2017).

We rank the communities using the appropriate methods and visualize the ranking with a radar plot. A radar plot displays each attribute and structural indicator of a community expressed as an interval. At the same time, we can choose the number of communities by observing the change in the relevant center parameters across the communities (betweenness and degree in our case). We can thus visualize the most central communities as those ranked highest by betweenness and degree. The radar plot also shows the ranking on the other structural characteristics, represented as interval data for each community.

Finally, by choosing the top-ranked communities we identify the stylized structure of the network starting from its initial structure. That is, we start from the entire structure, rank the communities by their attributes, and select the first communities to obtain the most central communities of the network. As an application on real data we consider the Zachary karate club network (Zachary 1977). Here we can observe and select the most relevant part of the network by selecting the most central communities.

The community-core of the network

These communities identify the "core of the network" rather than other peripheral network structures.

Results

Figure: Edge-Cut Network: stable communities

Conclusions

The approach followed in this paper represents the communities as interval data and then ranks them. The procedure determines the communities of the network and detects the most central representations by considering structural indicators such as the betweenness or the Freeman degree. It is important to emphasize that the analysis is community-based and robust, since it incorporates the results of many community detection algorithms. Other attributes of the communities can of course be considered, shifting the focus to different structural characteristics of the network.

References

A.V (2012) Evidence of Networking in the European Research Area. Project financed by the 6th Framework Programme for Research, for the implementation of the specific programme "Strengthening the Foundations of the European Research Area" (Invitation to tender n° DG RTD 2005 M 02 02).
Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E. (2008) Fast unfolding of communities in large networks. J. Stat. Mech. P10008.
Burger-Helmchen, T. (Ed.) (2013) The Economics of Creativity: Ideas, Firms and Markets (Vol. 60). Routledge.
Christ, J. (2009) The Geography and Co-Location of European Technology-Specific Co-Inventorship Networks. University of Hohenheim FZID Discussion Paper, (14-2010).
Csardi, G., Nepusz, T. (2006) The igraph software package for complex network research. InterJournal, Complex Systems 1695.
Danon, L., Diaz-Guilera, A., Duch, J., Arenas, A. (2005) Comparing community structure identification. J. Stat. Mech. P09008.

Drago, C. (2017) MCA Based Community Detection. In: Classification, (Big) Data Analysis and Statistical Learning. Studies in Classification, Data Analysis, and Knowledge Organization. Springer. Editors: Francesco Mola, Claudio Conversano, Maurizio Vichi.
Drago, C., & Balzanella, A. (2015) Nonmetric MDS Consensus Community Detection. In Advances in Statistical Models for Data Analysis (pp. 97-105). Springer International Publishing.
Drago, C., & Cucco, I. (2013) "Robust Communities Detection in Joint-Patent Application Networks". XXXIII Sunbelt Social Networks Conference of the International Network for Social Network Analysis (INSNA), Hamburg; 05/2013.
Drago, C., & Ricciuti, R. (2015) Bootstrapping the Gini Index of the Network Freeman Degree. Statistics and Demography: the Legacy of Corrado Gini, Corrado Crocetta Editor, ISBN 978 88 678, Treviso, September 2015 (also presented at Italian Statistical Society SIS conference 2015).

Girvan, M., and Newman, M. E. (2002) Proc. Natl. Acad. Sci. USA 99, 7821.
Fortunato, S. (2009) arXiv:0906.0612.
Fortunato, S. (2010) Community detection in graphs. Physics Reports, 486(3), 75-174.
Fortunato, S. (2013) Community structure in networks. Institute for Scientific Interchange Foundation.
Fortunato, S., & Castellano, C. (2007) Community structure in graphs. arXiv preprint arXiv:0712.2716.
Härdle, W., & Simar, L. (2007) Applied multivariate statistical analysis. Springer Verlag.

Karrer, B., & Newman, M. E. (2011) Stochastic blockmodels and community structure in networks. Physical Review E, 83(1), 016107.
Kim, J., & Wilhelm, T. (2008) What is a complex graph? Physica A: Statistical Mechanics and Its Applications, 387(11), 2637-2652. doi:10.1016/j.physa.2008.01.015
Lancichinetti, A., & Fortunato, S. (2010) Community detection algorithms: A comparative analysis. Physical Review E, 80(5), 056117.
Lancichinetti, A., & Fortunato, S. (2012) Consensus clustering in complex networks. Scientific Reports, 2.
Lancichinetti, A., Radicchi, F., Ramasco, J.J., and Fortunato, S. (2011) Finding statistically significant communities in networks. PloS One 6, e18961.
Le Roux, B., & Rouanet, H. (2009) Multiple correspondence analysis (Vol. 163). SAGE Publications, Incorporated.

Billard, L., & Diday, E. (2006) Symbolic Data Analysis: Conceptual Statistics and Data Mining. England: Wiley & Sons Ltd.
Csardi, G., Nepusz, T. (2006) The igraph software package for complex network research. InterJournal, Complex Systems 1695. http://igraph.org
Drago, C. (2017) Identifying Meta Communities on Large Networks. SIS Italian Statistical Society 2017 Conference: Statistics and Data Science: New Challenges, New Generations.
Drago, C. -2 (2017) MCA Based Community Detection. In: Classification, (Big) Data Analysis and Statistical Learning. Studies in Classification, Data Analysis, and Knowledge Organization. Springer. Editors: Francesco Mola, Claudio Conversano, Maurizio Vichi.
Duan, L., & Binbasioglu, M. (2017) An ensemble framework for community detection. Journal of Industrial Information Integration, 5, 1-5.
Giordano, G., Brito, M. P. (2014) Social Networks as Symbolic Data. In: Analysis and Modeling of Complex Data in Behavioral and Social Sciences, edited by Vicari, D., Okada, A., Ragozini, G., Weihs, C. (06/2014). Springer Series: Studies in Classification, Data Analysis, and Knowledge Organization.
Gioia, F., & Lauro, C. N. (2005) Basic statistical methods for interval ...

Newman, M. E.J. (2006) Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23), 8577-8582.
Newman, M.E.J. (2013) Modularity, Community, Structure and Spectral Properties of Networks. Preprint physics 0602124 (PNAS in press).
Newman, M.E.J., & Girvan, G. (2004) Finding and evaluating community structure in networks. Phys. Rev. E 69, 026113.
Noirhomme-Fraiture, M. (2002) Visualization of large data sets: the zoom star solution. International Electronic Journal of Symbolic Data Analysis, 26-39.
Rand, W. M. (1971) Objective criteria for the evaluation of clustering methods. J. Am. Stat. Assoc. 66(336), 846-850.
Richards, W., & Macindoe, O. (2010, August) Decomposing social networks. In Social Computing (SocialCom), 2010 IEEE Second International Conference on (pp. 114-119). IEEE.
Rodriguez, R.O. (2017) with contributions from Carlos Aguero, Olger Calderon, Roberto Zuniga and Jorge Arce. RSDA: R to Symbolic Data Analysis. R package version 2.0.2. https://CRAN.R-project.org/package=RSDA
Steinhaeuser, K., & Chawla, N. V. (2010) Identifying and evaluating community structure in complex networks. Pattern Recognition Letters, 31(5), 413-421.

Strogatz, S. H. (2001) Exploring complex networks. Nature, 410(6825), 268-276.
Tang, L., & Liu, H. (2010) Graph mining applications to social network analysis. In Managing and Mining Graph Data (pp. 487-513). Springer US.
Treviño, S., Sun, Y., Cooper, T. F., & Bassler, K. E. (2012) Robust Detection of Hierarchical Communities from Escherichia coli Gene Expression Data. PLoS Computational Biology, 8, e1002391.
van Dongen, S. (2000) Performance criteria for graph clustering and Markov cluster experiments. Technical Report INS-R0012, National Research Institute for Mathematics and Computer Science in the Netherlands, Amsterdam, May 2000.
Yang, J., & Leskovec, J. (2015) Defining and evaluating network communities based on ground-truth. Knowledge and Information Systems, 42(1), 181-213.
Zachary, W.W. (1977) An information flow model for conflict and fission in small groups. Journal of Anthropological Research 33, 452-473.
