Profiles in Professional Social Networks Jaroslav Pokorný

Abstract This paper discusses user profiles as an important component of online social networks (OSN). A special category of OSN comprises professional social networks, in which so-called professional profiles are significant. They make it possible to connect not only people to each other but also projects to people, courses to students, etc. Ontologies, particularly various classification hierarchies, are a powerful tool for representing profiles. The contribution of this paper is a matching framework for profiles in which some features are described by concepts from classification hierarchies. Moreover, users can assign weights to these concepts and thereby influence an associated similarity measure. We discuss the notions of similarity and compatibility of such profiles and show some new ways to tackle the matching problem.

1 Introduction

Online social networks (OSN) are emerging as a new type of application on the Internet, which can be considered a natural extension of Web applications that establishes and manages explicit relationships between users [10]. OSNs have become highly developed and popular as a means for users to connect and to express and share content. MySpace (http://www.myspace.com) with over 275 million users, Facebook (http://www.facebook.com) with over 845 million users in February 2011 and Orkut (http://www.orkut.com) with over 100 million users, mainly from India and Brazil, are examples of popular networks used to find and organize contacts. A special category of OSN comprises professional social networks (PSN), e.g. the well-known LinkedIn (http://www.linkedin.com) and Academia.edu (over 1.2 million academics in 2012), a free social networking Website and

J. Pokorný (*) Department of Software Engineering, Charles University of Prague, Prague, Czech Republic e-mail: [email protected] H. Linger et al. (eds.), Building Sustainable Information Systems: Proceedings of the 2012 International Conference on Information Systems Development, DOI 10.1007/978-1-4614-7540-8_30, © Springer Science+Business Media New York 2013


collaboration tool aimed at academics and researchers from all fields. Other examples of PSNs include HR.com (0.194 million users in 2010), the largest social networking site dedicated to human resources professionals, and ResearchGate (http://www.researchgate.net) with 0.9 million users. We also recall one of the oldest network communication systems, USENET (http://www.slyck.com/ng.php), developed by the academic community. For the ICT community in the Czech Republic, there is the social network SoSIReČR (http://www.sitit.cz). A more detailed overview of PSNs specialized in ICT can be found, e.g. in [6].

OSNs adapt real-world social structures to online channels, both web and mobile. Their members construct personal profiles with the information they want others to know, share interests through recommendations, links or documents and build lists of people with whom they are connected. PSNs use not only personal profiles but also profiles of other actors, like companies, teams and projects. Most OSN sites implement a type of virtual community. Social networking sites allow the people who join them to create personal profiles viewable by anyone in a given network. Users can enter “friendship” relationships with other registered users and share content, e.g. photo albums that can be linked to the profiles of those present in a picture. In PSNs, such content includes scientific documents, research projects, university courses and PhD thesis topics.

Profiles usually comprise both structured information in the form of key-value pairs (features) and unstructured or semistructured information, mainly free-text fields or uploaded content. Structured information provides basic descriptors of actors. For a person, such descriptors include name, age, gender, schools attended, geographical location, interests or Web page. Features describing an employee’s expertise usually occur in PSNs.

PSNs are often business-oriented social networks where many core services such as recruiting, job seeking, expert/profile search and item recommendation rely on successful identification of similar actors. A similar approach can be found in educational systems which adapt the learning material to the knowledge of a student, display personalized help texts or tailor descriptions to the technical background of an actor. PSNs based on these general ideas are usually supported by a web portal containing a browsable repository.

A particular problem of OSN is user profile matching, which is required in several scenarios. One of them consists of linking data corresponding to the same actor in the same or different data sources (see, e.g. [11]). Another concerns matching two actors inside one OSN. In the context of PSN, the latter includes, e.g. coupling a project with appropriate researchers and searching for suitable job seekers based on an advertisement described as a professional profile.

Our contribution in this paper is a novel matching framework for profiles in which some features are described by concept names from a classification hierarchy. The framework is based on distances between topics and their matching. Moreover, users can assign weights to the features and thereby influence an associated similarity measure (score). As the main result of the paper, we consider our approach to profile matching, particularly a measure called profile compatibility. Its values make it possible, e.g. to rank candidates for a position in an organization with respect to a given requester profile.


Section 2 describes some necessary notions concerning OSN. In Sect. 3 we present an overview of profile management issues and approaches. Section 4 is devoted to classifications usable for profile construction. In general, ontologies present an appropriate framework for this task today. A method for profile matching is described in Sect. 5. First, we introduce the notion of concept distance in a hierarchy, different types of matching of two concepts in a hierarchy and the notion of profile compatibility. As examples we use the ACM Computing Classification System [1], which contains hierarchies of ICT topics. Finally, in Sect. 6 we outline current and future plans in profile processing research.

2 Towards Professional Social Networks

We start with an intuitive definition of a social network.

Definition 1 A social network is represented by an undirected graph G(V, E), where V, |V| = n, is the nonempty set of vertices representing actors and E, |E| = m, is the set of edges representing the relationships among them. Let vi and vj be two vertices from V; if e(vi, vj) ∈ E, then vi and vj are neighbours.

To take actors’ profiles into account, let FV be the set of features describing the actors of the social network, which can be represented by a matrix of size n × |FV|. Note that the features in this definition represent single-valued attributes. In practice, a multi-valued attribute is more realistic, e.g. hobby = [reading, chess, music]. We will use the employee’s expertise expressed by a set of topics. Each feature could optionally also be an aggregation of single-valued attributes (e.g. date of birth = [day, month, year]).

The relationships between actors depend on the type of society. Between companies, the relationship could be a supply contract. Between people in a company, it could be the hierarchical relationship if we are considering the organizational structure, or it could be the sending of e-mails in a network of relationships between friends. The relationships define a social graph. Depending on the social network, relationships can be qualified or not. They are often bidirectional and may express, e.g. relationships between persons, users and user groups.

Technically, the functionalities of OSN can usually be described in seven blocks supporting the actors’ activities: identity, profiles, presence (neighbours, friends), relationship types, messaging, repository, and calendar and events. We may distinguish the identity of an actor from the different profiles representing the facets that the actor exposes (private, professional profile, etc.).
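As an illustration only, Definition 1 and the feature matrix can be sketched in a few lines of Python. The class name `SocialNetwork`, its methods and the sample actors and feature values are hypothetical and chosen for this example; multi-valued features such as hobby are simply stored as lists.

```python
from dataclasses import dataclass, field


@dataclass
class SocialNetwork:
    """Undirected graph G(V, E) plus an n x |F_V| feature matrix (illustrative only)."""
    features: list                                # names of the features in F_V
    rows: dict = field(default_factory=dict)      # actor -> list of feature values
    edges: set = field(default_factory=set)       # each edge is a frozenset {vi, vj}

    def add_actor(self, actor, values):
        if len(values) != len(self.features):
            raise ValueError("expected one value per feature")
        self.rows[actor] = list(values)

    def relate(self, vi, vj):
        if vi not in self.rows or vj not in self.rows:
            raise KeyError("both actors must already be in V")
        self.edges.add(frozenset((vi, vj)))

    def neighbours(self, actor):
        # vi and vj are neighbours if an edge between them is in E
        return {v for edge in self.edges if actor in edge for v in edge if v != actor}

    def feature_matrix(self):
        # one row per actor, one column per feature
        return [self.rows[actor] for actor in self.rows]


# Made-up sample data, for illustration only.
net = SocialNetwork(features=["location", "hobby", "expertise"])
net.add_actor("a1", ["Prague", ["reading", "chess", "music"], ["Metrics", "Processors"]])
net.add_actor("a2", ["Brno", ["chess"], ["Software"]])
net.add_actor("a3", ["Prague", [], ["Processors"]])
net.relate("a1", "a2")
```

Storing edges as unordered pairs keeps the graph undirected by construction, so `neighbours` is symmetric without extra bookkeeping.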


However, social networks are more than a graph; they carry a considerable amount of information derived from their social aspect, such as profile information, content sharing and annotations, among others. Suppose, e.g. an OSN of the employees of enterprises. Based on information retrieval methods and graph processing, we can use the OSN relationships to explore whether the OSN actors have sent messages to each other and to derive the type of projects each employee has been involved in. Thus, we can search for communities of people who have met before and have worked on similar projects. Those communities of cooperation could be used to find experts or to create work teams in the enterprise. We will also consider the employees’ expertise explicitly and discover special communities for practical purposes. In this case, we talk about PSNs.

3 Profile Management

One of the most important components of an OSN is the profile page. For example, Orkut profile pages basically consist of 68 features. The profile page reflects a single user or a company. Every user of the social network can access this information, and the company can specify and maintain exactly what information it makes available to the user. Usually the profile page is managed and administered via a portal. In OSN, the profile page’s management and administration are carried out by reading and writing comments. We are interested in so-called professional profiles (PP). Informally speaking, a PP is a machine-readable description of the extent of knowledge in an area. A little more formally, a PP is an instance of a structured data type, part of which is a hierarchy of categories (usually a classification tree). We formally define PP in the following way.

Definition 2 A professional profile is a set of couples {(T1, w1),…, (Tm, wm)}, where Ti are the terms (topics) that describe the actor and wi denote the importance of Ti in describing the actor.
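A minimal sketch of Definition 2, assuming a profile is stored as a mapping from topic to weight. The helper name `make_profile` and the requirement that weights be positive are assumptions made for this example, not part of the definition.

```python
def make_profile(couples):
    """Build a profile {(T1, w1), ..., (Tm, wm)} as a topic -> weight mapping."""
    profile = {}
    for topic, weight in couples:
        if topic in profile:
            raise ValueError(f"topic listed twice: {topic}")
        if weight <= 0:
            # assumption of this sketch: only positive weights are stored
            raise ValueError("weights are assumed to be positive")
        profile[topic] = weight
    return profile


requester = make_profile([("Programming Languages", 3), ("Processors", 4)])
most_important = max(requester, key=requester.get)
```

A mapping makes the "one weight per topic" reading of the set of couples explicit and gives direct access to the weight of a topic.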

PPs make it possible to express the professional focus of a given actor (e.g. person, research team or project) in a structured way, for example, what is taught at which level in a course or what ICT knowledge a person has at which level. Relationship types in PPs can also be special ones, e.g. supervised_by, coauthor. In a more advanced approach to PPs, organizations are modelled. They are represented by several groupings of people with common research interests. The actor profile need not be entered directly. Extended user profiles are built by extending the basic explicitly defined user’s profiles with data inferred from user



Fig. 1 Visualization of a professional profile

content analysis and user interactions with each other and with the tool itself. Actor description fields and other text-based fields can also contain implicit data regarding actor interests and preferences that can be extracted by content analysis tools. The authors of [9] developed a tool for implicit groups based on such extended user profiles. User profiles can be extracted using various types of corpora, e.g. utilizing knowledge about experts from Wikipedia or analysing experts’ public documents (papers, technical reports). The authors of [13] generate extended user profiles by spreading activation networks derived from ontologies. User profiles are also processed in recommendation systems. An important feature of each profile is its visualization. For example, in the project SoSIReČR [14], the classification nodes are depicted as a bar chart arranged in a circle (the resulting graph in Fig. 1 is often called a cobweb graph).

4 The Role of Ontologies in Social Networks

Ontologies are a powerful tool for representing profiles [7]. Ontologies are used to represent the formal specifications of the notions involved in a domain of interest and the relations between these terms [5]. Approaches to representing and comparing profiles can be:

• Corpus-based
• Knowledge-based

Corpus-based methods use well-known methods from the information retrieval area, like the vector space model, statistical model and latent semantic analysis. These


methods use matching techniques well known from the information retrieval area (cosine similarity measure, Dice’s coefficient and Jaccard’s index). In the context of user profiles, these approaches are rather naive. As the profiles are short bags of words, they often fail to detect an inexact semantic match when there is no overlap between the words of the participating profiles.

Knowledge-based approaches are based on predefined taxonomies or ontologies, like WordNet or bibliographical classifications used in the context of digital libraries or Web portals. Typically, classification trees are used in these ontologies. There are various ways of organizing ontologies. Some of them are hierarchical taxonomies based on ISA relationships; others use more relationship types, most often BT/NT (broader/narrower term), organized not only in hierarchies but even in general graphs.

Representing actors by profiles is also used in content-based filtering, where the system selects rank-ordered items according to user profiles. Both items and user profiles are described either by bags of words or by concepts from an ontology. For example, in the newspaper domain the authors of [12] used the ontology NewsCodes¹, a subject classification hierarchy with three levels and seventeen categories in its first level. A similar idea will be used here with PPs describing actors from the ICT community. We focus on the knowledge-based ACM Classification Scheme [1], a universally accepted standard classification of ICT disciplines. Its category space is also organized in three-level hierarchies. That is, the classification has a forest structure with trees of a fixed maximum depth. The ACM Classification Scheme consists of eleven major partitions (first-level subjects). These are subdivided into 81 second-level topics, which are further subdivided into third-level topics; see, e.g. the category Software Engineering in Fig. 1. An uncoded fourth level contains subject descriptors, e.g. Computer-aided software engineering (CASE), Decision tables, Evolutionary prototyping and Flow charts in category D.2.2 Design Tools and Techniques. For example, all publications issued by ACM are annotated by this classification.

There are many applications of this classification scheme for describing expertise in the ICT scientific community, e.g. in recommender systems [2], in the context of a digital library [15] as well as in representing research organizations [8]. The classification was even used for determining the semantic similarity in matching software practitioners’ needs and software research activities [4]. The so-called subject clusters describing groupings composing an organization are also represented by sets of ACM topics [8]. The concepts representing a topic from the profile are not necessarily the most specific ones in a path of the hierarchy. For example, if somebody is an expert in Formal Definitions and Theory, it does not necessarily mean that he/she knows all about Programming Languages. Thus the ACM classification cannot be understood as a set of usual ISA hierarchies but rather as a set of BT/NT hierarchies.
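The limitation of the corpus-based measures named earlier in this section can be seen in a small sketch. The three functions below are the standard textbook definitions of Jaccard's index, Dice's coefficient and cosine similarity on bags of words; the sample profiles `p1`, `p2` and `p3` are made up for illustration.

```python
import math
from collections import Counter


def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def dice(a, b):
    a, b = set(a), set(b)
    return 2 * len(a & b) / (len(a) + len(b)) if a or b else 0.0


def cosine(a, b):
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)          # missing terms count as 0
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


p1 = "software metrics".split()
p2 = "code quality measurement".split()       # related in meaning, no shared word
p3 = "software engineering metrics".split()   # shares two words with p1

no_overlap = (jaccard(p1, p2), dice(p1, p2), cosine(p1, p2))
overlap = (jaccard(p1, p3), dice(p1, p3), cosine(p1, p3))
```

All three measures are zero for `p1` and `p2` although the profiles are semantically related, which is the situation that motivates the knowledge-based approach.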

¹ http://www.iptc.org/cms/site/index.html?channel=CH0088



5 Profile Matching

A usual method of profile management in OSN is comparing profiles by profile matching. Profile matching is based on the notion of similarity. It reflects closeness and interaction between actors. For the purposes of this paper we omit interactions and focus mainly on static PPs. Various techniques for profile matching can be divided into two main categories:

• Syntactic-based approaches
• Semantic-based approaches

While the former provide exact or approximate lexicographic matching of two concepts, the latter use semantics for similarity definition. Here, as an example, we use the ACM classification, where semantics is determined by hierarchies of topics. Obviously, in general, we can consider arbitrary concept hierarchies.

Consider first Web pages representing profiles. Technically they are part of the Web, but their data representations are different from general Web pages. In OSN, Web pages describing profiles are automatically generated and not authored by any person. Thus methods for comparing such profiles are relatively simple. Syntactic-based approaches like the vector space model can be used in the case of single-valued features. For example, in Orkut, 20 of the 68 features are considered for similarity measurement. In this case a classical notion of similarity is used in which a similarity function sim(a, b) is symmetric, i.e. given two PPs P1 and P2, in general, we have that sim(P1, P2) = sim(P2, P1). Then, e.g. a group is composed of actors who are mutually similar.

In the context of PSNs, a more useful case than this (full) social similarity is the use of only partial similarity. Suppose the actors are a requester and a candidate, both expressed by respective profiles. Then a more appropriate sim function is not necessarily symmetric. A partial similarity is defined as the level of matching between a requester and candidates, e.g. a company looks for a new employee, a project requires a team of researchers and a student with his/her interests chooses a new study program. These examples take into account the fact that, while the “perfect” candidate seldom exists in the profile base, profiles are often available which satisfy the desired requirements “to some extent”. Here we use the term compatibility (rather than similarity) due to an asymmetry in the classification schema: the functionalities of children nodes are also the functionalities provided by their ancestor(s) in the BT/NT hierarchy, while the reverse usually does not hold.

5.1 Distances Between Topics and Their Matching

Now we define the notion of distance between topics based on the forest structure [3]. Let there be t trees (f1, f2, …, ft) in the forest F whose nodes represent topics.

[Figure 2: tree diagram with the nodes Software, Software Engineering, Metrics, Coding Tools and Techniques, Design, Programming Languages, Formal Definitions and Theory and Processors]

Fig. 2 A part of category D2 software engineering in the ACM hierarchy

Consider two topics Ta and Tb such that both of them belong to the same tree of F. Let LCA be the least common ancestor of Ta and Tb. Also, assume d (LCA, Ta) to be the depth of Ta from the LCA.

Definition 3 If T1 and T2 are two topics, then the distance D(T1, T2) between them is given as

\[
D(T_1, T_2) =
\begin{cases}
d_{LCA}(T_1, T_2) & \text{if } T_1, T_2 \in f_i, \\
\infty & \text{if no such } f_i \text{ exists,}
\end{cases}
\]

where d_LCA(T1, T2) = max(d(LCA, T1), d(LCA, T2)). We do not suppose that more than one such fi exists in the classifications considered.
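Definition 3 can be sketched as follows, assuming the forest is stored as child-to-parent links. The `PARENT` table reproduces only the fragment of the hierarchy discussed in the text; the placement of Design under Software Engineering and the second tree (Data with Data Structures) are assumptions made for this example, and the function names are hypothetical.

```python
import math

# Child -> parent links; roots map to None. Illustrative fragment only.
PARENT = {
    "Software": None,
    "Software Engineering": "Software",
    "Programming Languages": "Software",
    "Metrics": "Software Engineering",
    "Coding Tools and Techniques": "Software Engineering",
    "Design": "Software Engineering",            # assumed placement
    "Formal Definitions and Theory": "Programming Languages",
    "Processors": "Programming Languages",
    "Data": None,                                # assumed second tree
    "Data Structures": "Data",
}


def path_to_root(topic):
    """Return the topic followed by all of its ancestors up to the root."""
    path = []
    while topic is not None:
        path.append(topic)
        topic = PARENT[topic]
    return path


def distance(t1, t2):
    """D(T1, T2): the larger depth below the least common ancestor, or infinity."""
    depth2 = {node: d for d, node in enumerate(path_to_root(t2))}
    for d1, node in enumerate(path_to_root(t1)):
        if node in depth2:          # first shared node on the way up is the LCA
            return max(d1, depth2[node])
    return math.inf                 # the topics lie in different trees
```

For example, `distance("Coding Tools and Techniques", "Programming Languages")` walks up to the common ancestor Software and returns the larger of the two depths.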

If T1 and T2 are in fi and fj, respectively, and i ≠ j, then D(T1, T2) is ∞. Consider a part of category D2 in Fig. 2. For example, when T1 = Programming Languages and T2 = Data Structures then, since T1 and T2 are in different trees, D(T1, T2) = ∞. Considering Fig. 2, if T1 = Coding Tools and Techniques and T2 = Programming Languages, then LCA = Software, d(LCA, T1) = 2 and d(LCA, T2) = 1 and, consequently, D(T1, T2) = 2. Any two topics with a common parent node have a distance equal to 1. For example, D(Formal Definitions and Theory, Processors) = 1. If T1 = T2, then D(T1, T2) = 0. Definition 3 helps us to determine the distance when the topics belong to a single tree. Clearly, in an n-level classification hierarchy, the useful distances are from the interval [0, n − 1].

Now we consider a pair of profiles Pr and Pc, belonging to the requester and the candidate, respectively. In our retrieval model, we exploit different types of matching between the set of features associated with the requester and those associated with available candidates. Let Tr = {Tr1,…,Trm} and Tc = {Tc1,…,Tcn} be their



associated sets of profile topics. For any two topics Tri ∈ Tr and Tcj ∈ Tc that lie on the same path in a tree from F, we can distinguish at least three types of matches between them:

• Perfect match, if D(Tri, Tcj) = 0
• Close match, if D(Tri, Tcj) = 1
• Weak match, if D(Tri, Tcj) ≥ 2

Distances greater than 2 can occur only in n-level hierarchies with n > 3, so for the ACM classification a weak match means D(Tri, Tcj) = 2. Consider Tr = {Software} and Tc = {Coding Tools and Techniques, Processors}. Then four occurrences of weak matching exist: two from BT (Software) to NT (Coding Tools and Techniques, Processors) and vice versa.

How can we express a similarity (score) between topics contained in the requester and candidate profiles? We denote it by score(Tci, Trj) and talk about the score value of the candidate topic Tci with respect to the associated requester topic Trj. For perfect matching it is easy, e.g. score(Metrics, Metrics) can be defined as 1. This is not the case for close and weak matches. For example, score(Software Engineering, Metrics) should not be the same as score(Metrics, Software Engineering). In the former the candidate is more general, and in the latter the candidate is more specific. For example, for finding an employee who is an expert in software engineering, the value of score(Metrics, Software Engineering) should be greater than score(Software Engineering, Metrics), in which the candidate is a “universal” software engineer. That is, a specialist in metrics probably knows the fundamentals of software engineering. The same asymmetry holds for couples of weakly matched topics. In the context of content-based filtering, the authors of [12] use score values 2/5 and 2/3 for a close match and 1/2 and 1/3 for a weak match. Thus, score(Metrics, Software Engineering) = 2/3 and score(Software Engineering, Metrics) = 2/5. For the weak match case, we obtain score(Metrics, Software) = 1/2 and score(Software, Metrics) = 1/3.

Restricting matching to topics lying on a tree path might seem too restrictive. Other relationships, e.g. siblings, can be relevant for a requester. Consider Formal Definitions and Theory and Processors contained in the requester and candidate profile, respectively. Although distances between siblings are equal to 1 by Definition 3, their matching is not considered here. In a spreading activation method, both profiles could be extended towards the common parent node Programming Languages with assignment of weights according to an influence function. How to quantify this influence offers an opportunity for future research.
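The asymmetric score described in this section can be sketched as below, using the score values quoted from [12]. The names `PARENT`, `steps_up`, `SCORES` and `score` are hypothetical, the hierarchy is only the fragment discussed in the text, and returning 0 for topics that do not lie on a common path (or that are more than two levels apart) is an assumption of this sketch.

```python
from fractions import Fraction

# Child -> parent links for the fragment used in the examples (illustrative).
PARENT = {
    "Software": None,
    "Software Engineering": "Software",
    "Programming Languages": "Software",
    "Metrics": "Software Engineering",
    "Formal Definitions and Theory": "Programming Languages",
    "Processors": "Programming Languages",
}

# (distance, candidate is more specific?) -> score value quoted from [12]
SCORES = {
    (1, True): Fraction(2, 3), (1, False): Fraction(2, 5),   # close match
    (2, True): Fraction(1, 2), (2, False): Fraction(1, 3),   # weak match
}


def steps_up(topic, ancestor):
    """Edges from topic up to ancestor, or None if ancestor is not above topic."""
    steps = 0
    while topic is not None:
        if topic == ancestor:
            return steps
        topic = PARENT[topic]
        steps += 1
    return None


def score(candidate, requester):
    """Score of a candidate topic with respect to a requester topic."""
    if candidate == requester:
        return Fraction(1)                        # perfect match
    down = steps_up(candidate, requester)         # candidate is more specific
    if down is not None:
        return SCORES.get((down, True), Fraction(0))
    up = steps_up(requester, candidate)           # candidate is more general
    if up is not None:
        return SCORES.get((up, False), Fraction(0))
    return Fraction(0)                            # assumed: no common path, no match
```

Using `Fraction` keeps the quoted values exact, so the asymmetry between a more specific and a more general candidate can be compared without rounding.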

5.2 Profile Compatibility

In [14] we extended PPs by weight information added to ACM topics reflecting the actor’s expertise. The actor is asked to select a number of topics of any layer


Table 1 Examples of profiles and their compatibility

Noc: 1 | 2 | 3
Pc: Metrics, Processors | Software | Metrics, Processors, Formal Definitions and Theory
Nor: 1 | 2 | 3
Pr: Software Engineering (2) | Programming Languages (3) | Programming Languages (3), Processors (4) | Processors (4)
Comp(Pc,Pr): 0.66 | 0.4 | 0.86 | 1

of the ACM forest and assign to each a number between 0 and 5 expressing the degree of the actor’s activity in the topic. These weights can be used by an ICT specialist to find an appropriate job. The same can be done by a company creating job opportunities for ICT people. Concerning the comparison of respective profiles, it is clear that the ICT specialist can be both requester and candidate. The same holds for a company.

We start with a simple situation in which no candidate’s topic weights are taken into account. We suppose only nonzero weights w in requester profiles. Compatibility between a candidate profile and a requester profile can be defined as follows. Let Pc, Pr be profiles of a candidate and a requester, respectively. Pr contains m profile topics, of which n, n ≤ m, lie on the same paths in F as some topics from Pc. Let si be a score value of a candidate’s topic with respect to the associated requester topic on these particular paths. The compatibility between the candidate Pc and Pr is defined as follows:

\[
\mathrm{Comp}(P_c, P_r) = \frac{\sum_{i=1}^{n} w_i \times s_i}{\sum_{j=1}^{m} w_j}
\]
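The compatibility measure defined above can be sketched as a short function, assuming the score values si have already been determined for the matched requester topics. The function name `compatibility` and its argument layout are hypothetical; the weights and scores in the examples follow the close and perfect match values given in Sect. 5.1.

```python
def compatibility(requester, scores):
    """Comp(Pc, Pr) without candidate weights.

    requester: requester topic -> weight w_j (all m topics, nonzero weights)
    scores:    requester topic -> score s_i, only for the n matched topics
    """
    total = sum(requester.values())
    if total <= 0:
        raise ValueError("requester weights must sum to a positive value")
    unknown = set(scores) - set(requester)
    if unknown:
        raise KeyError(f"scores given for topics outside the requester profile: {unknown}")
    return sum(requester[topic] * s for topic, s in scores.items()) / total


# Candidate {Metrics, Processors} against requester Software Engineering (2):
# Metrics is a more specific close match, so s = 2/3.
c1 = compatibility({"Software Engineering": 2}, {"Software Engineering": 2 / 3})

# Candidate {Metrics, Processors, Formal Definitions and Theory} against
# Programming Languages (3) and Processors (4): close match and perfect match.
c2 = compatibility(
    {"Programming Languages": 3, "Processors": 4},
    {"Programming Languages": 2 / 3, "Processors": 1},
)

# A requester topic without any match contributes only to the denominator.
c3 = compatibility(
    {"Software Engineering": 2, "Processors": 3},
    {"Software Engineering": 1},
)
```

Here `c1` is about 0.67 and `c2` about 0.86; in `c3` the unmatched topic lowers the value to 0.4 even though the matched topic is a perfect match.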

In Table 1 we show some examples of profiles of both candidates and requesters. The numbers in parentheses denote associated weights. The difference in compatibility between requests 2 and 3 for candidate 3 is due to the explicit requirement of Programming Languages. The requester calls for more expertise than only Processors. Note also that the topics of candidate profiles used in Comp are the most specific ones in a path of the hierarchy. Relaxing this condition, we have various possibilities for calculating the Comp function. Suppose a number of topics Tci lie on a path. The most natural solution is to consider the Tci that is the closest to its parent or ancestor from Tr on the same path. Assume, e.g. Tc = {Metrics, Software Engineering} and Tr = {Software Engineering(2), Processors(3)}. Then we would choose Software Engineering rather than Metrics from Tc.

The question arises why the candidate’s weights are not used in the Comp function. In an extension of our approach, we take them into account and change the formula for Comp. The resulting weights reflect both types of weights. Let



Table 2 Examples of weighted profiles and their compatibility

Noc: 1 | 2 | 3
Pc: Metrics (1), Processors (3) | Software (4) | Metrics (2), Processors (3), Formal Definitions and Theory (2)
Nor: 1 | 2 | 3
Pr: Software Engineering (2) | Programming Languages (3) | Programming Languages (3), Processors (4) | Processors (4)
Comp(Pc,Pr): 0.53 | 0.4 | 0.61 | 0.75

Tck(vk) and Trj(wj) be weighted topics of respective profiles Pc and Pr. There are two possible cases:

• D(Tck, Trj) = 0. If vk > wj, then the candidate is too highly expert for the requester and can contribute only by wj. Thus the resulting weight is min(vk, wj).
• D(Tck, Trj) = 1 or D(Tck, Trj) = 2. The same consideration can be used.

Then the resulting function Comp can be expressed by

\[
\mathrm{Comp}(P_c, P_r) = \frac{\sum_{i=1}^{n} \min(v_i, w_i) \times s_i}{\sum_{j=1}^{m} w_j}
\]
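The weighted variant can be sketched in the same way, again assuming that the matched candidate topic and its score have already been chosen for each requester topic. The function name `weighted_compatibility` and the argument layout are hypothetical; how to pick among several candidate topics on one path is left open here.

```python
def weighted_compatibility(requester, matches):
    """Comp(Pc, Pr) with candidate weights.

    requester: requester topic -> weight w_j (all m topics, nonzero weights)
    matches:   requester topic -> (v_i, s_i), the weight of the matched
               candidate topic and its score, only for the n matched topics
    """
    total = sum(requester.values())
    if total <= 0:
        raise ValueError("requester weights must sum to a positive value")
    # a candidate cannot contribute more than the requester asks for
    return sum(min(v, requester[topic]) * s for topic, (v, s) in matches.items()) / total


# Software (4) against Programming Languages (3): the candidate is more
# general (close match, s = 2/5) and the surplus weight has no effect.
general = weighted_compatibility(
    {"Programming Languages": 3},
    {"Programming Languages": (4, 2 / 5)},
)

# Processors (3) against Processors (4): a perfect match, limited by the
# lower candidate weight.
limited = weighted_compatibility({"Processors": 4}, {"Processors": (3, 1)})
```

In the first case the result stays at 0.4, as it would without candidate weights; in the second the lower candidate weight reduces the value from 1 to 0.75.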

We can check the approach on the examples from Table 1. In Table 2 we see that candidate weight 4 for Software has no influence on the resulting Comp value. On the contrary, small weights of Processors and Formal Definitions and Theory decrease the Comp values for requesters 2 and 3. There are possibilities to take into account various other observations. For example, in [3] the authors follow the assumption that semantic differences among upper-level concepts are bigger than among lower-level concepts. This fact then influences the development of the similarity measure.

6 Conclusions and Future Work

We have described a method by which the comparison of PPs is used in PSN. A particular application behind the research was finding experts for professional activities. Obviously, the problem of finding experts on a given set of topics is important for many lines of business, e.g. consulting, recruitment and e-business. Our method assumes that hierarchical ontologies are available and uses a simple technique of weighting. A substantial difference of the method from those used in friendship-like OSNs


lies in a certain asymmetry in estimating the score of similarity between the requester’s and the candidate’s topics. We have seen that it is possible to explore more sophisticated techniques to measure compatibility or similarity of PPs. The method can be used in any enterprise portal which is a part of a social network. For example, the selection of business partners and collaborators described by a PP can be a helpful feature of such a portal. In combination with business intelligence, the methods of PP management can contribute to the realization of dynamic business ecosystems in the near future. Evaluating the method requires its use in an environment with real users, particularly in an appropriate application context, e.g. a portal of a career site.

Concerning future research, there are many associated topics:

• Scores for similarity of candidate and requester topics can depend on the application domain and/or on the ontologies used.
• Both requester and candidate profiles can also be dynamic. Their changes could be easily controlled or derived from the actor’s behaviour.
• Often nonhierarchical relationships are added to hierarchical ontologies. Their consideration can significantly extend the semantics of profile matching.
• Clustering actors according to their PPs.

Another direction of future work involves evaluating methods of comparing PPs against those used in associated areas like recommender systems and other enterprise tagging systems.

Acknowledgments This research has been supported by the grant GACR No. P202/10/0761.

References

1. ACM computing classification system. http://www.acm.org/about/class. Accessed 14 Apr 2012
2. Afzal MT, Latif A, Saeed AU, Sturm P, Aslam S, Andrews K, Tochtermann K, Maurer H (2009) Discovery and visualization of expertise in a scientific community. In: Proc. of FIT’09, CIIT, Abbottabad, Pakistan, 16–18 Dec 2009, pp.43
3. Bhattacharyya P, Garg A, Wu SF (2011) Analysis of user keyword similarity in online social networks. Social Netw Analys Mining 1(3):143–158. doi:10.1007/s13278-010-0006-4
4. Feather M, Menzies T, Connelly J (2003) Matching software practitioner needs to researcher activities. In: Proc. of the 10th Asia-Pacific software engineering conference (APSEC’03), IEEE, Chiang Mai, Thailand, 6–16
5. Gruber T (1993) A translation approach to portable ontology specifications. Knowl Acquis 5:199–220. doi:10.1006/knac.1993.1008
6. Kubalík J, Matoušek K, Doležal J, Nečaský M (2011) Analysis of portal for social network of IT professionals. J Syst Integr 2(1):21–28
7. Katifori A, Halatsis C, Lepouras G, Vassilakis C, Giannopoulou E (2007) Ontology visualization methods – a survey. ACM Computing Surveys 39(4):10.1–10.42. doi:10.1145/1287620.1287621
8. Mirkin B, Nascimento S, Pereira LM (2007) ACM classification can be used for representing research organizations. DIMACS technical report 2007–13



9. Pais MR, Morgadob C, Cunha JC (2011) Implicit groups in web-based interactive applications. In: Proc. of the 3rd int. conf. on computational aspects of social networks (CASoN). Salamanca, Spain, 175–180
10. Pallis G, Zeinalipour-Yazti D, Dikaiakos MD (2011) Online social networks: status and trends. In: New Directions in Web Data Management 1 (pp. 213–234). Springer Berlin Heidelberg
11. Raad E, Chbeir R, Dipanda A (2010) User profile matching in social networks. In: Network-Based Information Systems (NBiS), 13th International Conference on (pp. 297–304). IEEE
12. Shoval P, Maidel V, Shapira B (2008) An ontology – content-based filtering method. Int J Inform Theor Appl 15(4):303–314
13. Thiagarajan R, Manjunath G, Stumptner M (2008) Finding experts by semantic matching of user profiles. In: Proc. of the 3rd expert finder workshop on personal identification and collaborations: knowledge mediation and extraction (PICKME 2008). Karlsruhe, Germany, pp 7–18
14. Vojtáš P, Pokorný J, Nečaský M, Skopal T, Matoušek K, Kubalík J, Novotný O, Maryška M (2011) SoSIReČR – IT professional social network. In: Proc. of the 3rd int. conf. on computational aspects of social networks (CASoN), 108–113
15. Wang T, Desai BC (2007) Document classification with ACM subject hierarchy. In: Proc. electrical and computer engineering, CCECE, 792–795