Robust Communities Detection in Joint-patent Application Networks


A Consensus Approach

Carlo Drago and Ivan Cucco

University of Naples “Federico II” Department of Economics and Statistics

INSNA Sunbelt 2013, Hamburg

Outline

- Research Problem
- Method
- Application to a Joint Patenting Network
- Results and Discussion
- Conclusions and Directions for Future Research

Research Problem

Research Framework

This work is part of REPOS, a research project involving several universities in Italy and the University of Ljubljana.
- Aim: to develop methodologies for the evaluation of Network-Based Policies (NBP) in favour of innovation
- Emphasis on the effects of NBP on innovative networks
- Empirical focus on Italian government-sponsored technological districts (TDs) in aerospace, biotechnologies, nanotech and new materials

- Several works within the project analyze cooperation networks among TD members and partners
- We are interested in integrating research on joint participation in TD projects with the analysis of the patenting networks in which members/partners are involved before and after the establishment of TDs
- BUT: to date there is still little joint patenting activity between members/partners

- We therefore look at larger patenting networks that include TD members/partners as well as their co-applicants
- These networks are relatively large (for example, the post-2004 patenting network for one of the districts includes about 6,000 nodes)

Research Aim

The aim is to detect the stable patenting communities to which TD members/partners belong, and to track their evolution over time.

Community Structure

A network has a community structure if its nodes can be grouped into densely connected sets, with "many edges joining vertices of the same cluster and comparatively few edges joining vertices of different clusters" (Fortunato 2010).

In general:
- One of the main reasons for detecting communities is to find the characteristic pattern of each community (related, for example, to specific node attributes)
- At the same time, communities can serve different functions within a network
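The definition above can be made concrete with a small count of edges inside and between groups. This is only an illustrative sketch: the toy graph and the community labels are invented for the example and do not come from the patent data.

```python
# Toy undirected graph (hypothetical): two triangles joined by one bridge edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]

# Hypothetical community assignment: node -> community label.
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}


def count_intra_inter(edges, community):
    """Return (edges inside a community, edges between communities)."""
    intra = sum(1 for u, v in edges if community[u] == community[v])
    return intra, len(edges) - intra


intra, inter = count_intra_inter(edges, community)
print(intra, inter)  # many internal edges, comparatively few external ones
```

A grouping with many internal and few external edges, as here, is the kind of structure the quoted definition describes.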

Figure: Community structure in Zachary's karate club network (Zachary 1977)

Method

Community Detection

- "Complex systems are usually organized in compartments, which have their own role and/or function."
- "In the network representation, such compartments appear as sets of nodes with a high density of internal links, whereas links between compartments have a comparatively lower density."
- "These subgraphs are called communities, or modules, and occur in a wide variety of networked systems."

See: Lancichinetti and Fortunato (2009), Girvan and Newman (2002) and Fortunato (2009).

- Communities can be flat or separated, overlapping or nested in a hierarchical structure
- Community detection algorithms aim at identifying the modules and the hierarchical organization by considering only the graph topology (Fortunato and Castellano 2007)
- Community detection techniques usually do not model the network but adopt an algorithmic approach in order to detect patterns in the network

Community Detection Algorithms

A relevant problem in the literature is to identify communities in a network when:
- their exact number is unknown
- the communities can be characterized by unequal sizes and densities

Various algorithms and methods have been proposed for accomplishing this task. The choice of the algorithm is however problematic, since:
- In an exploratory framework where no a priori information is available on the communities in the network, choosing the "right" algorithm can be unfeasible
- Different methods show different performances and can suffer from different biases (Leskovec, Lang and Mahoney 2010)
- Each method seems to be more appropriate for some specific network typologies
- The partitions generated by different methods do not necessarily match (Good, de Montjoye and Clauset 2010)
- When a given method produces several outputs, it is difficult to consider a single partition as being more representative of the actual community structure (Lancichinetti and Fortunato 2012)

- We use a consensus algorithm to assess the stability of the detected communities
- Drago and Balzanella (2013) propose using an ensemble of community detection algorithms and then finding a consensus partition, which combines the information produced by the various community detection methods
- We apply different ensembles of methodologies to the same relational data and use statistical procedures to evaluate the level of agreement between the different procedures (see Lancichinetti and Fortunato 2012)
- As a first illustration, the methodology is applied to a joint patent application network drawn from the OECD Regpat database
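One simple way to quantify agreement between two partitions is the Rand index, the share of node pairs that both partitions treat the same way (together in both, or apart in both). The slides do not say which agreement statistic was used, so the choice of the Rand index here is an assumption, and the two partitions are invented for illustration.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Share of node pairs on which two partitions agree (0.0 to 1.0)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same nodes")
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical outputs of two detection methods on the same six nodes.
part_1 = [1, 1, 1, 2, 2, 2]
part_2 = [1, 1, 2, 2, 3, 3]
print(rand_index(part_1, part_2))
```

The index depends only on which nodes are grouped together, not on the label values, which matters because different algorithms number their communities arbitrarily.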

Consensus Community Detection Algorithm

1. We start from the network adjacency matrix
2. We run an ensemble of different community detection algorithms, which yield different partitions
3. We collect the results in the Consensus Matrix
4. We apply Multiple Correspondence Analysis (Le Roux and Rouanet 2009) to the Consensus Matrix
5. We perform a Hierarchical Cluster Analysis on the first two dimensions (the most relevant), using the Euclidean distance and the Ward method (Härdle and Simar 2007)
6. We use a dendrogram to explore the different partitions
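Multiple Correspondence Analysis works on categorical data coded as a 0/1 indicator (complete disjunctive) table. The sketch below shows only that coding step for a matrix of community labels; it assumes one list of labels per method and one entry per node, uses invented method names and labels, and does not perform the MCA or the clustering themselves.

```python
def indicator_matrix(consensus):
    """Code community labels as 0/1 indicators.

    consensus: dict mapping method name -> list of labels (one per node).
    Returns (columns, rows): columns are (method, label) pairs and each row
    is the 0/1 coding of one node.
    """
    methods = sorted(consensus)
    n_nodes = len(consensus[methods[0]])
    columns = [(m, lab) for m in methods for lab in sorted(set(consensus[m]))]
    rows = [
        [1 if consensus[m][i] == lab else 0 for m, lab in columns]
        for i in range(n_nodes)
    ]
    return columns, rows


# Hypothetical labels from two methods on four nodes.
toy = {"walktrap": [1, 1, 2, 2], "infomap": [1, 1, 1, 2]}
columns, rows = indicator_matrix(toy)
for row in rows:
    print(row)
```

Each node has exactly one 1 per method, so every row sums to the number of methods in the ensemble.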

The Algorithm

We use several well-known methods in community detection:
- Edge Betweenness community (Newman and Girvan 2004)
- Walktrap community (Pons and Latapy 2005)
- Fastgreedy community (Clauset, Newman and Moore 2004)
- Spinglass community (Sathik and Rasheed)
- Leading Eigenvector community (Newman 2006)
- Infomap community (Rosvall, Axelsson and Bergstrom 2009)
- Label Propagation (Raghavan, Albert and Kumara)
- Blockmodeling as a tool in community detection (Zhao, Levina and Zhu 2011; Karrer and Newman 2011)

- The Consensus Matrix is useful for comparing the partitions
- The factor map of the methods helps identify how the different community detection methods differ in the patterns they find
- A relevant problem is to identify the stable communities that can be detected by using an ensemble of different methods
- The factor map of the nodes allows us to identify the different communities (groups of nodes)

- Finally, the clusters are obtained by a cluster analysis with the Ward method on an appropriate distance (the Euclidean distance)
- The number of clusters is varied in order to explore different partitions (the approach is exploratory)
- The final result is a set of clusters that represent the stable communities
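As a rough illustration of the clustering step, the sketch below merges points agglomeratively with Ward's criterion, where the cost of merging clusters A and B is |A||B| / (|A| + |B|) times the squared Euclidean distance between their centroids. The coordinates stand in for node positions on the first two factorial dimensions and are invented; this is a minimal teaching version, not the implementation used for the slides.

```python
def ward_clusters(points, k):
    """Merge clusters with Ward's criterion until k clusters remain."""
    # Each cluster is (member indices, centroid coordinates).
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                # Increase in within-cluster sum of squares if a and b merge.
                cost = len(ma) * len(mb) / (len(ma) + len(mb)) * d2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        merged = ma + mb
        centroid = [
            (len(ma) * x + len(mb) * y) / len(merged) for x, y in zip(ca, cb)
        ]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append((merged, centroid))
    return sorted(sorted(members) for members, _ in clusters)


# Hypothetical node coordinates on two factorial dimensions.
coords = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (2.0, 2.0), (2.1, 2.0), (5.0, 0.0)]
print(ward_clusters(coords, 3))
```

Calling the function with different values of `k` corresponds to cutting the dendrogram at different heights, which is how the exploratory choice of the number of clusters can be tried out.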

Application to a Joint-Patent Network

Applying the Method on Joint-Patent Networks

- To illustrate the application of community detection techniques (CDTs) to innovation networks, a joint patent application network was constructed starting from the three Italian branches of a leading electronics firm
- We present the preliminary results of the analysis and describe the next steps

Data

Source: OECD Regpat database, which reports patent applications to the European Patent Office and applications filed under the Patent Cooperation Treaty

Data structure:
- Originally two-mode data (applicant(s) - patent)
- Projected onto a one-mode applicant-applicant network
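The two-mode to one-mode projection can be sketched as follows: every pair of applicants listed on the same patent gets an edge, and the edge weight counts their joint applications. The patent identifiers and applicant names are hypothetical placeholders, not records from the database.

```python
from collections import Counter
from itertools import combinations

# Hypothetical two-mode data: patent id -> applicants listed on it.
patents = {
    "P1": ["FirmA", "UnivX"],
    "P2": ["FirmA", "UnivX", "FirmB"],
    "P3": ["FirmB"],
}


def project_to_applicants(patents):
    """Return a Counter mapping (applicant, applicant) -> joint applications."""
    weights = Counter()
    for applicants in patents.values():
        # Sort so that each undirected pair has one canonical key.
        for u, v in combinations(sorted(set(applicants)), 2):
            weights[(u, v)] += 1
    return weights


weights = project_to_applicants(patents)
print(dict(weights))
```

Patents with a single applicant, such as `P3` here, contribute no edges to the projected network.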

The Network

1. All patent applications filed by the three branches of the firm after 1990 were extracted
2. A first node set was created by listing all the co-applicants (firms, universities, research institutes) reported on the extracted applications
3. 64 individual nodes were identified in this step using (a) harmonized names in the OECD Harmonized Applicants Names database and (b) manual checks for ambiguous cases
4. All patent applications filed by the identified actors were extracted
5. The process was repeated, resulting in a final node set of 1,703 nodes
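The construction above is a snowball expansion from seed actors. A minimal sketch, assuming a lookup from each actor to its co-applicants is already available; the lookup, the actor names and the number of rounds are all hypothetical, since the slides do not state how many times the process was repeated.

```python
def snowball(seeds, coapplicants, rounds):
    """Expand a node set by repeatedly adding co-applicants of known nodes."""
    nodes = set(seeds)
    frontier = set(seeds)
    for _ in range(rounds):
        found = set()
        for actor in frontier:
            found.update(coapplicants.get(actor, ()))
        frontier = found - nodes  # only actors not seen before
        nodes |= frontier
    return nodes


# Hypothetical co-application lookup: actor -> actors it has filed with.
lookup = {
    "Seed": ["A", "B"],
    "A": ["Seed", "C"],
    "B": ["Seed"],
    "C": ["A", "D"],
}
print(sorted(snowball({"Seed"}, lookup, rounds=2)))
```

With two rounds the actor `D` is not reached, which shows how the number of rounds bounds the size of the final node set.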

Edge Cuts

- All joint patent applications between the 1,703 nodes were extracted and transformed into a one-mode valued network (applicant-applicant)
- Edge weights are equal to the number of joint patent applications
- To remove occasional collaborations, we applied an edge cut to the network, dropping edges with fewer than five joint applications
- Isolates were removed, and the network was binarized
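The edge cut, isolate removal and binarization can be sketched in a few lines. The threshold of five joint applications follows the slides; the weighted edges are invented for illustration.

```python
def edge_cut(weights, min_joint=5):
    """Keep edges with at least min_joint joint applications.

    Returns (nodes, edges): edges are unweighted pairs (binarized), and nodes
    left without any edge (isolates) are dropped.
    """
    kept = {pair for pair, w in weights.items() if w >= min_joint}
    nodes = {n for pair in kept for n in pair}
    return nodes, kept


# Hypothetical valued network: (applicant, applicant) -> joint applications.
valued = {("A", "B"): 7, ("A", "C"): 2, ("B", "D"): 5, ("E", "F"): 1}
nodes, edges = edge_cut(valued)
print(sorted(nodes), sorted(edges))
```

Nodes `C`, `E` and `F` only had occasional collaborations, so they become isolates after the cut and disappear from the network.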

Edge-Cut Network (at least 5 joint applications)

Network Descriptives

We start from the general descriptive statistics of the network, then explore it to find the stable communities.

Edge-cut network:

vertices        216
edges           248
density         0.01
diameter        526.00
centralization  0.26
degree          2.30
betweenness     390.52
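Two of the reported figures can be checked by hand from the vertex and edge counts. The sketch assumes an undirected network without self-loops and assumes that the reported "degree" is the mean degree; both assumptions are consistent with the numbers but are not stated on the slide.

```python
n_vertices = 216
n_edges = 248

# Density of an undirected simple graph: edges over possible node pairs.
density = n_edges / (n_vertices * (n_vertices - 1) / 2)

# Mean degree: every edge contributes to the degree of two nodes.
mean_degree = 2 * n_edges / n_vertices

print(round(density, 2), round(mean_degree, 2))
```

Both values round to the reported 0.01 and 2.30, which supports reading the network as sparse, with most applicants having only two or three strong partners.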

Figure: Edge-cut network (at least 5 joint applications; 216 nodes)

Figure: Edge-cut network, detected communities

Edge-Cut Network: Consensus Matrix

node    1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  ...
me      1   2   3   4   4   4   1   5   5   5   4   4   6   5   5   5  ...
me2    13   3   8   6   6   6   2   6   5   5   6   6   1   5   5   5  ...
me3     1   7   2   3   3   3   1   3   5   5   3   3   3   5   5   5  ...
me5     1   2   9  14  14  14   1  10  10  10  14  14  13  10  10  10  ...
me6    10  13   2   8   8   8  10   8  11  11   8   8   8  11  11  11  ...
me7    16   6   3   1   1   1   5   1   2   2   1   1   8   2   2   2  ...
me8     1   2   3   4   4   4   5   4   6   6   4   4   4   6   6   6  ...
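One way to read the Consensus Matrix is to ask, for a pair of nodes, how many methods place them in the same community. The sketch below does this for the sixteen columns shown above, assuming that rows are methods and columns are nodes; the co-membership fraction is an illustrative summary and is not claimed to be a statistic used by the authors.

```python
# The sixteen columns shown on the slide (rows: methods, columns: nodes).
consensus = {
    "me":  [1, 2, 3, 4, 4, 4, 1, 5, 5, 5, 4, 4, 6, 5, 5, 5],
    "me2": [13, 3, 8, 6, 6, 6, 2, 6, 5, 5, 6, 6, 1, 5, 5, 5],
    "me3": [1, 7, 2, 3, 3, 3, 1, 3, 5, 5, 3, 3, 3, 5, 5, 5],
    "me5": [1, 2, 9, 14, 14, 14, 1, 10, 10, 10, 14, 14, 13, 10, 10, 10],
    "me6": [10, 13, 2, 8, 8, 8, 10, 8, 11, 11, 8, 8, 8, 11, 11, 11],
    "me7": [16, 6, 3, 1, 1, 1, 5, 1, 2, 2, 1, 1, 8, 2, 2, 2],
    "me8": [1, 2, 3, 4, 4, 4, 5, 4, 6, 6, 4, 4, 4, 6, 6, 6],
}


def co_membership(consensus, i, j):
    """Fraction of methods that put nodes i and j (1-based) together."""
    same = sum(labels[i - 1] == labels[j - 1] for labels in consensus.values())
    return same / len(consensus)


print(co_membership(consensus, 4, 5))  # grouped together by every method
print(co_membership(consensus, 1, 7))  # grouped together by only some methods
```

Pairs that every method groups together, such as nodes 4 and 5, are natural candidates for membership in the same stable community, while pairs with a low fraction depend on the choice of algorithm.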

Figure: Edge-cut network, MCA factor map of the methods

Figure: Edge-cut network, MCA factor map of the nodes

Figure: Edge-cut network, dendrogram

Figure: Edge-cut network, stable communities

Preliminary Results

- The three branches from which we started the construction of the network are in the same community, together with some of their district members and partners. This community also includes an Italian university.
- All other Italian universities belong to a separate community in which there are no private firms.
- This is relevant, because the technological district policies had among their aims the cooperation between universities and firms. It is one of the points we should look at in more detail when we apply the methodology to all the firms in the district.
- Incidentally, the remaining communities show strong geographical patterns.

Advantages in using the method

There are clear advantages in using ensembles when different methods produce different information about communities:
- Measure the persistence of the co-participation of some nodes in the same community
- Overcome the biases of each method
- Understand what each method concretely tells us

Conclusions and Directions for Future Research

Conclusions and Future Extensions

Next steps in the application to the empirical research problem:
- Apply the method to networks elicited from all members of a TD
- Perform substantive analysis on the identified communities (node attributes)
- Identify the changes in community membership for TD members before and after the implementation of government NBP
- Analyze whether TD project partners tend, in time, to be located in the same patenting communities
- Differentiate across sectors (according to IPC codes)

References

A.V. (2012). Evidence of Networking in the European Research Area. Project financed by the 6th Framework Programme for Research, for the implementation of the specific programme "Strengthening the Foundations of the European Research Area" (Invitation to tender n DG RTD 2005 M 02 02).
Burger-Helmchen, T. (Ed.) (2013). The Economics of Creativity: Ideas, Firms and Markets (Vol. 60). Routledge.
Christ, J. (2009). The Geography and Co-Location of European Technology-Specific Co-Inventorship Networks. University of Hohenheim FZID Discussion Paper, (14-2010).
Drago, C. (2012). Stable Communities Detection. Mimeo.
Girvan, M., & Newman, M. E. J. (2002). Proc. Natl. Acad. Sci. USA 99, 7821.
Fortunato, S. (2009). arXiv:0906.0612.
Fortunato, S. (2010). Community detection in graphs. Physics Reports, 486(3), 75-174.
Fortunato, S. (2013). Community structure in networks. Institute for Scientific Interchange Foundation.
Fortunato, S., & Castellano, C. (2007). Community structure in graphs. arXiv preprint arXiv:0712.2716.
Härdle, W., & Simar, L. (2007). Applied Multivariate Statistical Analysis. Springer Verlag.
Lancichinetti, A., & Fortunato, S. (2009). Community detection algorithms: A comparative analysis. Physical Review E, 80(5), 056117.
Lancichinetti, A., & Fortunato, S. (2012). Consensus clustering in complex networks. Scientific Reports, 2.
Lancichinetti, A., Radicchi, F., Ramasco, J. J., & Fortunato, S. (2011). Finding statistically significant communities in networks. PLoS One 6, e18961.
Le Roux, B., & Rouanet, H. (2009). Multiple Correspondence Analysis (Vol. 163). SAGE Publications.
Mascolo, C. (2013). Lecture 4: Modularity and Overlapping Communities. Lecture Notes, Cambridge University.
Moradi, F., Olovsson, T., & Tsigas, P. (2012). An evaluation of community detection algorithms on large-scale email traffic. In Experimental Algorithms (pp. 283-294). Springer Berlin Heidelberg.
Newman, M. E. J. (2006). Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23), 8577-8582.
Newman, M. E. J. (2013). Modularity, Community, Structure and Spectral Properties of Networks. Preprint physics 0602124 (PNAS in press).
Newman, M. E. J., & Girvan, M. (2004). Finding and evaluating community structure in networks. Phys. Rev. E 69, 026113.
Tang, L., & Liu, H. (2010). Graph mining applications to social network
