UPLSN: User Profiling in Location based Social ...

(IJCSIS) International Journal of Computer Science and Information Security, Vol. 15, No. 6, June 2017

UPLSN: User Profiling in Location based Social Networking using Geographical Information System

Vasanthakumar G U, Ashwini G R, Srilekha K N, Swathi S, P Deepa Shenoy and Venugopal K R

Department of Computer Science and Engineering, University Visvesvaraya College of Engineering, Bangalore University, Bangalore, India.
E-mail: [email protected]

Abstract- Information diffuses faster in Online Social Networks (OSN) than in other media because of their wide usage. OSN serve various purposes and help people in different ways. In this work we present an intelligent, crowd-powered Geographical Information System that identifies, location-wise, users in the Twitter social network who are experts on specific topic(s) of their interest. The proposed UPLSN algorithm identifies these experts by extracting the keywords in their tweets and clustering them into topics of interest with respect to the tweet locations. These users are profiled and stored in a database. With our proposed PUGIS algorithm, the profiled profound users are presented to business users according to the topic they search for, so that the two can exchange further information. The proposed algorithms are evaluated by conducting experiments on a Twitter data set to demonstrate their adequacy.

Keywords- Data Mining; Geographical Information System; Location Tagging; Online Social Networks; Profound Users; Topic of Interest; Twitter; User Profiling.

I. INTRODUCTION

Over the past decade, the use of online social networking sites has increased tremendously, and these sites now occupy much of their users' time. Different kinds of social networking sites are available, depending on functionality and purpose of use. OSN such as Facebook, Twitter, Google+ and Instagram have large, highly active user bases because of the facilities they provide. They offer a means of communication between users and furnish information on ongoing trends in politics, technology, research, movies, media and so on. These sites give their users a platform to express opinions, share knowledge, create impact, promote business and pursue various other purposes, which is one of the reasons for their growing popularity. A user is free to create profiles on more than one social networking site, either with the same profile name on all sites or with different ones. Younger people use these sites more often than the older generation. The functionalities provided by each site play a large part in attracting users. Many users who are geographically far apart use these sites mainly to stay connected with their near and dear ones, since a few of these sites provide not only audio communication but also video-enabled communication. People who move from one location to another for any purpose may face problems with language, places, culture and so on; social networking sites address these problems and make it easy for newcomers to discover unknown places, shops, hospitals and much more. Apart from these, social networking sites bring people around the world together on a single platform to share knowledge, interact, gather information, showcase talent and connect with people of the same interest. In some cases these sites also provide a platform for questionnaires and offer solutions to various doubts. The mechanisms that some sites provide to share images, videos and short stories are a way for users to express their interests and connect with like-minded people around the world.

The use of online social networking sites has brought multiple advantages to users. With growing usage, however, some intruders or malicious users exploit these platforms to propagate their illegal aims. Regular users may not be aware of this and get into trouble by unknowingly falling into such traps. This happens when they accept friend requests from unknown users, share personal information, or publicly post pictures that anyone in their friends list can download and misuse. Apart from these, some users may get connected to terrorist groups or other harmful communities and perform illegal activities. In other scenarios, people who depend on information from these sites for medical help may land in serious problems, since there are no reviews or checkpoints for the correctness of the information available. There is a strong need to identify and eliminate such users and to extract and review the data content of these networks, and researchers regularly propose mechanisms to do so. Data mining is one such technique: it extracts useful and necessary information from raw data resources, either directly from the World Wide Web or from databases, data warehouses or web repositories, and makes it available for review or refinement.

Online social networking sites are continually updated with new trends, more options, greater visibility and additional functionalities, giving their users a better experience than previous versions. The functionalities vary from site to site and therefore reach, and are used by, different sets of users.


https://sites.google.com/site/ijcsis/ ISSN 1947-5500


Enabling location while using OSN and tagging each post with a location is a functionality introduced by a few social networking sites. It is not enabled compulsorily for every user; instead, the user decides whether to enable location for their posts, and may apply the choice either for a single session or permanently. When it is enabled, the location from which the user has posted is tagged with each post and can be seen by other users who view the post.

Location tagging in OSN has various advantages, such as identifying a user's pattern of movement, analyzing user activities with respect to location, determining user locations and so on. Users may provide wrong information about their location during profile creation, which can be identified by extracting location-tagged information over a period of time [1]. In addition, when a user requires help with respect to a location, the user may share his exact location, which helps others in the network or friend list to provide information related to that location. In this way, location tagging works as a filter that narrows down the huge volume of information in OSN.

Using location tagging, user activities and their patterns are extracted along with user behavior, which contributes significantly to user profiling. By extracting a user's posts, images, videos and comments along with their location, the pattern and the interests of the user are obtained. Since not every post of a user relates to his location, the content of the post, its relation to the location, and any hidden meaning that links the location to the media content need to be analyzed and verified. In other cases, spam or hoax information about some person, location or event may spread widely because of a wrong post with respect to a location, or a wrong location with respect to a post. Hence, a mechanism for filtering and identifying such posts is required.

Motivation: A huge amount of information is generated on Twitter every minute from diverse geographical locations and disseminates rapidly worldwide. The number of Twitter users is increasing day by day; they get updates not only from those whom they elect to follow but also by explicitly searching for a topic to consume new information. Twitter has also added features like ‘Trending Topics’ to keep its users updated, and public social streams can also be searched through major search engines. The information so gathered provides users with personalized services such as local news, advertisements, application sharing and so on, which motivated us to carry out this work. With all this potential information and news being spread on Twitter, users need the details of profound users who are experts in certain topic(s) of their interest location-wise, for further exchange of reliable information.

Contributions: If a user needs ideas or help regarding any subject or topic, users who are proficient in the relevant topic(s) can provide ideas and suggestions. In this work, experts are profiled location-wise in their topic(s) of interest based on the content of their tweets; their details are stored in a database and are recommended to those in need of information.

The organization of the paper is as follows: Section-II gives a brief overview of the related literature. The problem is defined in Section-III, and the proposed system is discussed in detail in Section-IV. The User Profiling in Location based Social Networking (UPLSN) algorithm is presented in Section-V. Simulation and result analysis are discussed in Section-VI, and the work is summarized with conclusions in Section-VII.

II. LITERATURE SURVEY

In order to understand communications and their patterns in OSN, Zhang et al. [2] followed an approach that includes all the attributes of user interactions. By analyzing group and individual interactions, they constructed a transmission graph whose results show that the pattern of interaction and its duration clearly depend on the individuals. This result is offered in support of a truncated power law. User privacy and data privacy are major considerations in OSN because intruders can hack the data or media present in profiles and use them for malicious purposes. Moreover, data present in OSN may not be true or correct, as it may not come from a proper source. In their survey of Location Based Social Networks (LBSN), Pavlos et al. [3] describe the algorithms and discuss the strengths and weaknesses of LBSN with respect to recommendations, time awareness and user privacy, considering entities such as groups, users, activities, quality and location.

Recommendations in OSN play a vital role, since wrong recommendations may irritate users. By combining spatial and temporal perspectives based on Bayes’ rule, Yang et al. [4] presented a probability model that provides a new mechanism for recommendation systems. This method uses the location of a user in an LBSN, tries to find people in the same location and helps in friendship recommendation. Similarly, Betim et al. [5] proposed a personalized recommender system that provides recommendations based on user preferences, using the location information of users to analyze those preferences. Their experimental results on the Gowalla data set show that adding the location attribute to the recommendation system improves the recommendations and also provides a view of user behavior. Nitai et al. [6] proposed a friend recommendation system that categorizes groups of friends and provides recommendations.

Eunjoon et al. [7] analyzed human behavior with respect to social ties, considering both location based social networks and location-tagged data from cell phones. They showed that short, periodic travel is largely unrelated to social ties, whereas long-distance travel is directly influenced by and dependent on social ties in the network. Influential users are always present in a network; they influence other users directly or indirectly, and information diffusion starts from such influential users [8]. Narayanam et al. [9] proposed the ShaPley value based Influential Nodes (SPIN) algorithm for identifying such influential nodes in a network. Their experiments show that this method achieves highly accurate results.

User interest identification in OSN is a challenging task, since it requires at least minimal prior knowledge about users and their activities [10]. Souvik et al. [11] collected such information from a social network and analyzed it by applying collective and collaborative filters to the users' data to obtain user interests. They also conclude that, for analyzing user posts, the two parameters isometric projection and sparse regression are dominant features. Their recommendation system, based on collaborative and content filtering, was utilized by Wan et al. [12] in proposing a new friend recommendation system. Results of this hybrid system on the Digg data set are reported to be better than those of the available systems. In many cases user behavior changes based on the data content in the social network, which led to a study [13] of social networks and the influence of data content on others in the network. Sequential user behavior is extracted and analyzed to show that the behavior of users changes based on the content posted by influential users.

Hongzhi et al. [14] presented LA_LDA, a location-aware probabilistic generative model for recommendation systems. Their experimental results show that the method handles both the cold-start problem and top-k recommendation in user profiling effectively and efficiently. Most problems in OSN can be avoided simply by being careful before accepting friend requests from unknown users. In certain cases, however, it is difficult to judge whether to accept a request. As an aid to such decisions, Chung-Kai et al. [15] presented a distributed information sharing system that helps users in decision making. Wang et al. [16] presented a similar mechanism. Another decision support system is presented by Zhang et al. [17], which provides better analysis for deciding whether to accept a friend request. Their IntRank system provides a trustworthy and secure social networking environment that supports the selection of friends through repeated analysis of social ties, interaction patterns and ranking.

To get a query addressed by others in the network by selecting strangers as connecting points, Jalal et al. [18] developed a recommendation algorithm and presented ‘qCrowd’, a crowd-sourced prediction model. In experiments on a Twitter data set, it automatically selects the nodes from which to get a response. An encryption mechanism [19] based on a symmetric key method provides an easy and secure environment for accessing social networks.

Based on user locations, i.e., the locations users have visited, Bogdon et al. [20] developed a framework named PROFILR, which constructs profiles of users who have visited multiple discrete locations. This method is useful in analyzing the correctness of data provided by users. Rui Li et al. [21] proposed the Multiple Location Profiling (MLP) model, which profiles Twitter users using the locations of their following network and their tweets. The method identifies a user's home location effectively and discovers the user's multiple locations. A novel framework to identify overlapping and hierarchical communities in location based OSN is presented by Wang et al. [22]. It uses a co-clustering method over collected multi-attribute features. Experimental results on the FourSquare data set show that the framework works efficiently and effectively from various perspectives.

Twitter users post tweets that may include images and smileys. Each smiley represents a human emotion, so users employ them to express emotions in a simple way. In many situations the tweet text and the image may not be related at all, since the user may share an opinion or other information related to the image. On the assumption that a discussion in a Twitter group may start from or relate to the location of its initiator, Swarup et al. [23] presented a framework that estimates the city-level location of users. It follows a content-derived approach, i.e., it relies on the content of tweets and re-tweets in the discussion to estimate the location. Their experiments show that this approach is 10% more accurate than the available methods.

Brent et al. [24] proposed a similar approach to extract the location of a user from the tweets the user posts. They also show that 34% of users provided false information about their location. The approach identifies country-level and state-level information with decent accuracy. Teng Niu et al. [25] performed sentiment analysis on text and images, jointly and separately, on a new data set that they collected from Twitter and named the multi-view sentiment analysis dataset (MVSA). They achieved this through feature extraction in both single-view and multi-view combinations. User profiling based on text or location alone is challenging, since it may not yield accurate results [26]. To overcome this problem, Soha et al. [27] demonstrated how different dimensions of data, when combined with location based networks, can help in profiling users. Their model applies similarity measures and a co-occurrence method along with semantic analysis of tags. Place properties and users' annotations of places are combined to semantically analyze user interests and profile users. To help sentiment analysis work better on large and sparse data sets, Monalisa et al. [28] presented a new algorithm based on segmentation. Dynamic Association Rule Mining [29] provides better results when used with a Genetic Algorithm.

Jie Bao et al. [30] summarized the recent advances in location-based services built on geo-social media data and developed a system that provides location-based predictions and recommendations more efficiently than other state-of-the-art methods. To find the relationship between the text of a tweet and the location from which it was sent on a mobile device, Stefan et al. [31] proposed a classification approach that uses OpenStreetMap to obtain validation points. Using this classification and the geo-location points, the content of tweets is correlated using manual, unsupervised and supervised machine learning approaches. The results show that the degree of relatedness between content and location depends on the topics in the tweet content.


Fig.1. System Architecture Diagram

III. PROBLEM DEFINITION

Our intuition is that any information exchange between users is related to a set of topics such as weather, sports, research, education, event, place, time, organization and so on. Given a set of user tweets, the aim is to profile the users location-wise based on their topic(s) of interest, so that they can be recommended to those in need for further dissemination of information.

IV. PROPOSED SYSTEM

In this work, we focus on profiling users based on their activities in the Twitter social network. Assuming that users have enabled location services while tweeting, information about the users is collected from the content of their tweets. The details of the profiled profound users, who are experts in their respective topic(s) of interest, are stored in a database for recommendation to business users who need the relevant information at their location.

The components used in designing the proposed system, along with the system architecture, are described in this section.

A. Components of the System

1) Twitter Social Network

Twitter is an online social networking service that enables its users to send and read short messages of up to 140 characters called "tweets". Registered users can read and post tweets, whereas unregistered users can only read them. Tweets are publicly readable by default; however, senders can restrict message delivery to their followers only. Users may subscribe to other users' tweets, which is known as "following", and subscribers are known as "followers" or "tweeps". Users tag their posts with hashtags, i.e., words or phrases prefixed with a "#" sign. Likewise, the "@" sign followed by a username is used for replying to other users. To repost a message from another user and share it with one's own followers, a user can click the retweet button within the tweet. As a social network, Twitter revolves around the principle of followers.

In order to collect data from Twitter, a Twitter account is created. Authorizing a Twitter Application (App) / Application Program Interface (API) for the created account provides a Consumer Key (or API Key) and a Consumer Secret Key (or API Secret Key). An Access Token is generated along with an Access Token Secret Key, which set the access level and allow the Twitter App / API to read Twitter information.

2) Stop Words Elimination

During preprocessing of the collected data, stop words must be removed from the tweets so that keywords can be extracted efficiently and then classified into clusters of related keywords. Eliminating stop words makes keyword identification faster and more effective. Stop words appear frequently in tweets and do not carry any sentiment information. Examples include ‘a’, ‘I’, ‘at’, ‘the’, ‘of’, ‘she’, ‘he’, ‘and’, ‘it’ etc. These words, also called function words, are removed to increase the accuracy of keyword identification and to reduce the processing time of subsequent steps.
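As a minimal sketch of the stop-word elimination step, the snippet below filters the tokens of a tweet against a small stop-word set. The set contains the examples listed in the text plus a few extra words added only for this demonstration; the tokenizer and the function name `remove_stop_words` are illustrative assumptions and not the implementation used in this work.

```python
import re

# Examples from the text ('a', 'I', 'at', ...) plus 'is', 'in', 'on',
# which are added here only for the demonstration (assumption).
STOP_WORDS = {"a", "i", "at", "the", "of", "she", "he", "and", "it",
              "is", "in", "on"}


def remove_stop_words(tweet: str) -> list:
    """Return the lower-cased tokens of a tweet that are not stop words."""
    # Keep letters, digits, '#', '@', apostrophes and hyphens inside tokens.
    tokens = re.findall(r"[a-z0-9#@'-]+", tweet.lower())
    return [t for t in tokens if t not in STOP_WORDS]


keywords = remove_stop_words("Modi is attending a rally in Bangalore on February-14")
print(keywords)
```

A production system would normally rely on a curated stop-word list rather than a hand-written set; the small set here only keeps the example self-contained.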

3) Stemming

Stemming is one of the important preprocessing steps in keyword extraction. In general, stemming is a process that converts words into their stem, base or root form. For example, words like ‘innovation’ and ‘innovating’ are reduced to their stem ‘innovate’. Stemming reduces the time taken for keyword / feature extraction.

4) Named Entity Recognition

Named Entity Recognition (NER), also known as entity identification, is a subtask of information extraction that seeks to locate the named entities among the keywords in tweets and classify them into pre-defined categories such as Person, Location, Organization, Money, Percent, Date and Time.

For example, for a tweet such as “Modi is attending a rally in Bangalore on February-14”, NER takes the un-annotated block of text and produces an annotated block that highlights the entities: {Modi} [Person], {rally} [Event], {Bangalore} [Location] and {February-14} [Time].

5) WordNet

WordNet is a dictionary of synonyms in which words are stored as sets of synonyms called Synsets; it records numerous relations among these synonym sets and their members.

6) Geographical Information System

A Geographic Information System (GIS) is designed to capture, store, update, analyze and represent geographic data. GIS is the basis for several location-enabled services that depend on analysis.

a) Reverse Geo-coding

Reverse Geo-coding is a technique for converting a point location (latitude and longitude) into a readable address or place name that the end user can understand. With the help of GIS, it enables identification of the nearby street address, place, state and country.

B. System Architecture

The user tweets are collected and, after their locations are identified using Reverse Geo-coding, they are classified and clustered by location. The tweets are then preprocessed by eliminating stop words and by stemming. The resulting keywords are fed to a classifier that combines NER and WordNet to classify and cluster them into the respective categories of topics. Next, the Topic(s) of Interest of each user are extracted location-wise. Users are profiled with respect to the topic in which the maximum number of keywords exists location-wise and are stored in the database. The database is thus built with the details of these profiled profound users with respect to their relevant topic(s) of interest location-wise, as shown in Fig.1.

When a business user in need of information on a specific topic searches with a keyword, the details of all profiled profound users relevant to the searched topic are displayed to the business user, if found in the database. The business user may then explicitly query a profound user for help regarding the information needed.

V. ALGORITHM

The proposed UPLSN algorithm is shown in Algorithm.1. Using it, profound Twitter users are profiled with their relevant topic(s) of interest and stored in the database for further use by business users.

Algorithm 1: User Profiling in Location based Social Networking (UPLSN) Algorithm

while (True) do
  for (Every Twitter User) do
    Extract Location tagged tweets
    for (Every Tweet) do
      Identify its Location using Reverse Geo-coding
      Classify and Cluster the tweets based on location
    end for
    for (Every Cluster of Location) do
      for (Every Tweet) do
        Eliminate Stop words and perform Stemming to get Keywords
        for (Every Keyword) do
          if (Clusters already exist) then
            if (Keyword matches to existing Clusters) then
              Move Keyword to relevant Cluster
            elseif (Keyword belongs to any category of NER) then
              Form new Cluster according to NER approach and update the Keyword
            else
              Form new Cluster according to WordNet approach and update the Keyword
            endif
          elseif (Keyword belongs to any category of NER) then
            Form new Cluster according to NER approach and update the Keyword
          else
            Form new Cluster according to WordNet approach and update the Keyword
          endif
        end for
      end for
      Compute number of Keywords in the Cluster
    end for
    Compute number of Clusters (Topics of Interest) of the User Location-wise
    Profile User based on Maximum Keyword count in their Topic(s) of Interest Location-wise
  end for
  Update the Profiled Users Database with the details of the User Profiled
end while

Algorithm 2: Profound Users Geographical Information System (PUGIS) Algorithm

while (True) do
  Business User searches for a Topic of Interest
  Access the Location of the Business User
  Query the Profiled Users Database for the Topic of Interest with respect to Location
  if (Exists) then
    Display the details of Relevant Profiled Users
  else
    Display “No Profiled User exist in the Topic of Interest with respect to Location”
  end if
end while

Using the PUGIS algorithm shown in Algorithm.2., the details of the relevant profound users are retrieved and presented to the business user when he searches for a specific topic of his interest.
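The core of the UPLSN algorithm (group a user's tweets by location, turn them into keywords, count keywords per topic cluster, and profile the user by the largest cluster at each location) can be sketched as follows. This is a simplified illustration under stated assumptions: the table `TOPIC_OF` is a hypothetical stand-in for the NER and WordNet classifier, tweets are assumed to arrive already reverse geo-coded to a place name, stemming is omitted, and keywords that the stand-in cannot classify are skipped instead of forming new clusters.

```python
from collections import Counter, defaultdict

STOP_WORDS = {"a", "i", "at", "the", "of", "and", "it", "is", "in", "to", "for"}

# Hypothetical stand-in for the NER + WordNet classifier: maps a keyword
# to a topic cluster. A real system would call those tools instead.
TOPIC_OF = {
    "cricket": "Cricket", "batting": "Cricket", "wicket": "Cricket",
    "yoga": "Health", "diet": "Health",
}


def keywords(text):
    """Lower-case, split on whitespace and drop stop words (no stemming)."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def profile_user(tweets):
    """Profile one user from (location, text) pairs.

    Returns {location: (top_topic, keyword_count)}.
    """
    clusters = defaultdict(Counter)  # location -> topic -> keyword count
    for location, text in tweets:
        for word in keywords(text):
            topic = TOPIC_OF.get(word)
            if topic is not None:  # unclassified keywords are skipped here
                clusters[location][topic] += 1
    # The topic with the maximum keyword count wins at each location.
    return {loc: counts.most_common(1)[0] for loc, counts in clusters.items()}


tweets = [
    ("Bangalore", "batting and wicket in the cricket final"),
    ("Bangalore", "yoga for the team"),
    ("Chennai", "diet and yoga"),
]
profile = profile_user(tweets)
print(profile)
```

Keeping one `Counter` per location mirrors the "for every cluster of location" loop of Algorithm 1 and makes the "maximum keyword count" step a single `most_common(1)` call.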


VI. SIMULATION AND RESULT ANALYSIS

The proposed algorithms are implemented in Java, and MySQL is used for querying. A system with an Intel Pentium i7 processor and 4GB RAM running Windows-8 is used for simulation. Twitter APIs, NER Classifiers and WordNet 2.1 are integrated into the system. The data was collected from Twitter over a period of 1 month, amounting to 1500 tweets from 15 users. The preprocessed data set consists of User ID, Tweet ID, Tweet content, Location details and Time stamp.

By conducting experiments and analyzing the data set with the proposed UPLSN algorithm, profound users are profiled and stored in the database. Table.I. shows the topics of the considered users location-wise, along with the number of keywords in each topic cluster.

TABLE.I. USERS TOPICS LOCATION-WISE

User ID | Location ID | Number of Tweets | Topic ID | Topic(s) | Number of Keywords
UID1 | LID5 | 150 | Tid4 | Stress Management, Health | 100, 38
UID2 | LID1 | 110, 80 | Tid6, Tid1 | Demonetization, Cricket | 89, 76
UID3 | LID4 | 120 | Tid2, Tid3, Tid5 | Shopping Malls, Cooking Tips, Health | 56, 34, 65
UID4 | LID6, LID1 | 64, 26 | Tid5, Tid4 | Health, Stress Management | 46, 56
UID5 | LID2 | 43 | Tid2, Tid5 | Shopping Malls, Health | 95, 36
UID6 | LID7, LID5 | 45, 73 | Tid5, Tid3 | Health, Cooking Tips | 40, 39
UID7 | LID1 | 135 | Tid6 | Demonetization, Person | 98, 35
UID8 | LID5 | 90 | Tid9, Tid8, Tid4 | Yoga, Confidence Building, Stress Management | 54, 30, 13
UID9 | LID4, LID3 | 40, 35 | Tid5, Tid2 | Health, Shopping | 78, 20
UID10 | LID6, LID1 | 79, 13 | Tid3, Tid10 | Cooking Tips, Utensils | 45, 6
UID11 | LID7 | 67 | Tid2, Tid1 | Shopping Malls, Dress | 64, 9
UID12 | LID2 | 157 | Tid3, Tid1 | Cooking Tips, Cricket | 89, 68
UID13 | LID4 | 74 | Tid5, Tid11 | Health, Food Grains | 56, 18
UID14 | LID5 | 83 | Tid1 | Cricket | 34
UID15 | LID8, LID1 | 151, 13 | Tid4, Tid1 | Stress Management, Cricket | 67, 29

Fig.2. Location wise number of Tweets of Users

Fig.2. shows the number of tweets posted, whereas Fig.3. depicts the number of topic clusters of the considered users location-wise.

The topics searched by business users, along with the IDs of the profiled profound users in the respective locations retrieved using the PUGIS algorithm in our experiments, are tabulated in Table.II.

TABLE.II. PROFILED USERS WITH LOCATION UPON RESPECTIVE TOPICS

Business User Search Topics | Topic IDs | Location | Location IDs | Profiled Profound Users
Cricket | tid1 | Bangalore | lid5 | UID2, UID14
Shopping malls | tid2 | Chennai | lid4 | UID3
Cooking tips | tid3 | Kolkata | lid2 | UID12
Stress management | tid4 | Bangalore | lid5 | UID1, UID8, UID15
Health | tid5 | Chennai | lid4 | UID9, UID13
Demonetization | tid6 | Delhi | lid1 | UID2, UID7
Colleges | tid7 | Bangalore | lid5 | UID4, UID10
Restaurants | tid8 | Hyderabad | lid6 | UID5, UID11, UID12
Food streets | tid9 | Mumbai | lid3 | UID12
Air show | tid10 | Bangalore | lid5 | UID6
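The lookup behind Table.II can be illustrated with an in-memory stand-in for the Profiled Users Database. In this sketch the dictionary `PROFILED_USERS` holds three rows copied from Table.II; the function name `pugis_search` and the dictionary-based storage are assumptions made for illustration, whereas the experiments reported here query a MySQL database.

```python
# Three rows copied from Table.II: (topic, location) -> profiled profound users.
PROFILED_USERS = {
    ("cricket", "bangalore"): ["UID2", "UID14"],
    ("health", "chennai"): ["UID9", "UID13"],
    ("demonetization", "delhi"): ["UID2", "UID7"],
}

NOT_FOUND = "No Profiled User exist in the Topic of Interest with respect to Location"


def pugis_search(topic, location):
    """Return the matching user IDs, or the not-found message of Algorithm 2."""
    key = (topic.strip().lower(), location.strip().lower())
    users = PROFILED_USERS.get(key)
    return users if users else NOT_FOUND


print(pugis_search("Cricket", "Bangalore"))
print(pugis_search("Cricket", "Mumbai"))
```

Keying the store on the (topic, location) pair reflects the "Topic of Interest with respect to Location" query of the PUGIS algorithm; normalizing case and whitespace is an added convenience, not something the paper specifies.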


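Fig.3. Location wise number of Topic Clusters of Users

Fig.4. Location wise Topics of Profiled Users

Precision is the fraction of retrieved Profound Users (PU) that are relevant to the query topic.

$$\text{Precision} = \frac{|\{\text{relevant PU}\} \cap \{\text{retrieved PU}\}|}{|\{\text{retrieved PU}\}|} \qquad (1)$$

Recall is the fraction of Profound Users relevant to the query topic that are successfully retrieved.

$$\text{Recall} = \frac{|\{\text{relevant PU}\} \cap \{\text{retrieved PU}\}|}{|\{\text{relevant PU}\}|} \qquad (2)$$

F-measure is the weighted harmonic mean of precision and recall; with equal weights it is the F1 measure.

$$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (3)$$

TABLE.III. EVALUATION RESULTS

Metrics | Value Obtained (in percentage)
Precision | 98.97
Recall | 97.00
F1 measure | 97.97

The results are validated with these evaluation metrics and are shown in Table.III. Our algorithm attained high precision and recall rates during the experiments conducted, demonstrating its adequacy.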
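A short sketch of how precision, recall and the F1 measure can be computed from sets of user IDs is given below. The sets in the example are made-up illustrative values, not the experimental data of this work, and the function name `precision_recall_f1` is an assumption. The last lines check that the F1 value reported in Table.III is consistent with the reported precision and recall.

```python
def precision_recall_f1(retrieved, relevant):
    """Compute precision, recall and F1 for two sets of user IDs."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical query result (illustrative IDs only).
retrieved = {"UID1", "UID8", "UID15", "UID4"}
relevant = {"UID1", "UID8", "UID15"}
p, r, f1 = precision_recall_f1(retrieved, relevant)
print(p, r, f1)

# Consistency check on Table.III: harmonic mean of 98.97 and 97.00.
f1_table = 2 * 98.97 * 97.00 / (98.97 + 97.00)
print(f1_table)
```

The harmonic mean of the two reported percentages evaluates to roughly 97.975, which agrees with the tabulated 97.97 to within rounding.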

The profiled profound users, location-wise with respect to the relevant topic(s) of their interest, are displayed to business users when they search for information on a specific topic; the results of our experiments on the various topics searched are shown in Fig.4. The proposed algorithms are evaluated through several experiments conducted with different search topic queries from business users, retrieving the relevant profiled profound user details from the database for recommendation. The experimental results so obtained are tabulated in Table.II. We used the precision, recall and F1 measures given in equations (1), (2) and (3) to evaluate the experimental results.

VII. CONCLUSIONS

Our proposed UPLSN and PUGIS algorithms provide an efficient way to reach and connect with the right user for further exchange of reliable information in social networks. The algorithms presented in this paper profile profound users based on content analysis of the tweets they post at various locations. Using the NER and WordNet approach, the extracted keywords are classified and clustered to form clusters of topics location-wise. Users are profiled based on the maximum number of keywords in a topic cluster location-wise and are stored in the database. The details of these profiled profound users are recommended to business users with respect to the query requested, for further exchange of reliable information. The results of the experiments conducted demonstrate the adequacy of the proposed algorithms.

Topic relevance and truthfulness with respect to location are the subject of future work.



AUTHORS PROFILE

Vasanthakumar G U obtained his Bachelor of Engineering in Electronics and Communication Engineering in 2001 and his Master of Engineering in Information Technology in 2006. He is currently a full-time Research Scholar pursuing a Ph.D. in the Department of Computer Science and Engineering at University Visvesvaraya College of Engineering, Bangalore University, Bangalore, India. He is an IEEE student member, and his research interests focus on Information Extraction and Knowledge Discovery, Social Network Analysis, Text Mining, Content Analysis, Pattern Recognition, Information Retrieval and Computer Vision.


P Deepa Shenoy is currently working as Professor in the Department of Computer Science and Engineering, University Visvesvaraya College of Engineering, Bangalore University, Bangalore, India. She received her doctorate in the area of Data Mining from Bangalore University in 2005. Her areas of research include Data Mining, Soft Computing, Biometrics and Social Media Analysis. She has published more than 150 papers in refereed International Conferences and Journals.

Ashwini G R obtained her Bachelor of Engineering in Computer Science and Engineering in 2015 from SJBIT, Bangalore. She is currently pursuing her Master of Engineering in Web Technologies at University Visvesvaraya College of Engineering, Bangalore University, Bangalore. Her research interests include Data Mining and Social Media Analysis.

Srilekha K N finished her Bachelor of Engineering in Information Science and Engineering at University Visvesvaraya College of Engineering, Bangalore University, Bangalore. Her research interests include Data Mining and Social Media Analysis.

K R Venugopal is an IEEE Fellow, an ACM Distinguished Educator and currently the Principal of University Visvesvaraya College of Engineering, Bangalore University, Bangalore. He obtained his Bachelor of Engineering from University Visvesvaraya College of Engineering. He received his Master's degree in Computer Science and Automation from Indian Institute of Science, Bangalore. He was awarded a Ph.D. in Economics from Bangalore University and a Ph.D. in Computer Science from Indian Institute of Technology, Madras. He has authored and edited 73 books on Computer Science and Economics and has published over 640 research papers. His research interests include Computer Networks, Wireless Sensor Networks, Parallel and Distributed Systems, Digital Signal Processing and Data Mining.

Swathi S finished her Bachelor of Engineering in Information Science and Engineering at University Visvesvaraya College of Engineering, Bangalore University, Bangalore. Her research interests include Data Mining and Social Media Analysis.
