Efficient Web Usage Mining with Clustering - CiteSeerXwww.researchgate.net › publication › fulltext › 26818474

203

Efficient Web Usage Mining with Clustering K.Poongothai1 1

M.Parimala2 and Dr. S.Sathiyabama3

Asst Prof, Department Of Information Technology, Selvam College of Technology, Namakkal, Tamilnadu, India. 2

Lecturer, Department Of MCA, M.Kumarasamy College of Engg, Karur, Tamilnadu, India. 3

Assistant Professor of Computer Science, Thiruvalluvar Govt Arts and science college, Rasipuram, Tamilnadu, India.

Abstract Web usage mining attempts to discover useful knowledge from the secondary data obtained from the interactions of the users with the Web. Web usage mining has become very critical for effective Web site management, creating adaptive Web sites, business and support services, personalization, network traffic flow analysis etc., Web site under study is part of a nonprofit organization that does not sell any products. It was crucial to understand who the users were, what they looked at, and how their interests changed with time. To achieve this, one of the promising approaches is web usage mining, which mines web logs for user models and recommendations. Web usage mining algorithms have been widely utilized for modeling user web navigation behavior. In this study we advance a model for mining of user’s navigation pattern. The proposal of our work proceeds in the direction of building a robust web usage knowledge discovery system, which extracts the web user profiles at the web server, application server and core application level. The proposal optimizes the usage mining framework with fuzzy C means clustering algorithm (to discover web data clusters) and compare with Expected Maximization cluster system to analyze the Web site visitor trends. The evolutionary clustering algorithm is proposed to optimally segregate similar user interests. The clustered data is then used to analyze the trends using inference system. By linking the Web logs with cookies and forms, it is further possible to analyze the visitor behavior and profiles which could help an e-commerce site to address several business questions. Experimentation conducted with CFuzzy means and Expected Maximization clusters in Syskill Webert data set from UCI, shows that EM

shows 5% to 8% better performance than CFuzzy means in terms of cluster number.

1. Introduction Web mining is the application of data mining techniques to extract knowledge from web data, including web documents, hyperlinks between documents, us-age logs of web sites, etc. Web content mining is the process of extracting useful information from the contents of web documents. Content data is the collection of facts a web page is designed to contain. It may consist of text, images, audio, video, or structured records such as lists and tables. Application of text mining to web content has been the most widely researched. Issues addressed in text mining include topic discovery and tracking, extracting association patterns, clustering of web documents and classification of web pages. Research activities on this topic have drawn heavily on techniques developed in other disciplines such as Information Retrieval (IR) and Natural Language Processing (NLP). While there exists a significant body of work in extracting knowledge from images in the fields of image processing and computer vision, the application of these techniques to web content mining has been limited. The structure of a typical web graph consists of web pages as nodes, and hyper-links as edges connecting related pages. Web structure mining is the process of discovering structure information from the web. This can be further divided into two kinds based on the kind of structure information used. A hyperlink is a structural unit that connects a location in a web page to a different location, either within the same web page or on a different web page. A hyperlink that connects to a different part of the same page is called an intradocument hyperlink, and a hyperlink that connects

IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 6, No 3, November 2011 ISSN (Online): 1694-0814 www.IJCSI.org

two different pages is called an inter-document hyperlink. In addition, the content within a Web page can also be organized in a tree-structured format, based on the various HTML and XML tags within the page. Mining efforts here have focused on automatically extracting document object model (DOM) structures out of documents Web usage mining is the application of data mining techniques to discover interesting usage patterns from web usage data, in order to understand and better serve the needs of web-based applications. Usage data captures the identity or origin of web users along with their browsing behavior at a web site. Web usage mining itself can be classified further depending on the kind of usage data considered i.e web server data, application server data and application level data. In Web Server Data, user logs are collected by the web server and typically include IP address, page reference and access time. In Application Server Data, commercial application servers such as Web logic, Story Server, have significant featu

Efficient Web Usage Mining with Clustering - CiteSeerXwww.researchgate.net › publication › fulltext › 26818474

Efficient Web Usage Mining with Clustering - CiteSeerXwww.researchgate.net › publication › fulltext › 26818474

Suggest Documents

Agglomerative Clustering in Web Usage Mining: A

efficient web data extraction using clustering approach in web usage ...

Web Usage Mining - CiteSeerX

web usage mining

Web Usage Mining - arXiv

Web Usage Mining with Inductive Logic Programming

Clustering Techniques Utilized in Web Usage Mining - CiteSeerX

Web usage mining with intentional browsing data

Automatic Clustering for the Web Usage Mining Presented ... - CiteSeerX

Web Usage Mining Using Artificial Ant Colony Clustering and ... - arXiv

Web Usage Mining Using Artificial Ant Colony Clustering and Genetic ...

Clustering Techniques Utilized in Web Usage Mining - CiteSeerX

web usage mining using rough agglomerative clustering - Google Sites

Clustering Dynamic Web Usage Data

An Efficient Web Usage Mining Approach Using Chaos Optimization ...

An efficient algorithm for Web usage mining - CiteSeerX

An efficient algorithm for Web usage mining - Semantic Scholar

An efficient algorithm for Web usage mining - Semantic Scholar

Web Usage Mining Project

Mining Unexpected Web Usage Behaviors

Web Usage Mining - IEEE Xplore

Web Usage Mining - Semantic Scholar

AntClust: Ant Clustering and 0eb Usage Mining

A Survey on Web Mining and Web Usage Mining