Ethical aspects of web log data mining David L ... - Semantic Scholar

Int. J. Information Technology and Management, Vol. X, No. Y, XXXX

1

Ethical aspects of web log data mining David L. Olson Department of Management, University of Nebraska, Lincoln, NE 68588-0491, USA E-mail: [email protected] Abstract: Web logs (blogs) are growing in use, expanding beyond a personal means to exchange and record ideas to tools that are used by organisations as forums, mechanisms for knowledge management, and for purposes of persuasion (such as advertising) or influence (such as dissemination of political viewpoints). Because ideas are expressed as electronically recorded words, blogs can be data mined (or web mined). This paper reviews the tools available, their uses in knowledge management and ethical aspects of blogs. Keywords: blogs; data mining; web mining; knowledge management; ethics. Reference to this paper should be made as follows: Olson, D.L. (XXXX) ‘Ethical aspects of web log data mining’, Int. J. Information Technology and Management, Vol. X, No. Y, pp.XXX–XXX. Biographical notes: David L. Olson is the James & H.K. Stuart Professor in MIS, Chancellor’s Professor at the University of Nebraska. He has published research in over 90 refereed journal papers, primarily on the topic of multiple objective decision-making. He teaches in the management information systems, management science and operations management areas. He has authored or co-authored 17 books, including Decision Aids for Selection Problems, Managerial Issues of Enterprise Resource Planning Systems and Introduction to Business Data Mining. He is a Fellow of the Decision Sciences Institute.

1

Introduction

Web logs (blogs) have been defined in a variety of ways as described by Dearstyne (2005). Microsoft calls them updated personal web journals, which have knowledge management potential in communicating product messages, increasing people’s ability to share ideas and information on a worldwide scale. Accenture calls them interactive websites allowing users to publish ideas and information, spanning time zones and continents. Technorati, which markets a blog search engine, calls them personal journals with the power to allow millions of people to publish their ideas and millions of others to comment on them. Harvard Law School is more formal, defining blogs as a hierarchy of text, images, media objects and data, arranged chronologically, that can be viewed in an HTML browser. MacDougall (2007) referred to literally millions of blogs on the web, with a new one created every 7 sec. Because they store their content electronically in a highly accessible form, blogs are ripe for web mining. Chua et al. (2007) searched over 400 documents, to include blogs and internet resources such as Wikipedia, for emergency response after Copyright © XXXX Inderscience Enterprises Ltd.

2

D.L. Olson

Hurricane Katrina. They were able to extract a great deal of information concerning how knowledge was created, transferred and subsequently used to gain understanding in emergency management. Blogging has become very popular, not only by individuals, but also by research institutions and corporations. Gordon (2006) cited their use as platforms for sharing and trading data, information, knowledge and opinion among customers, prospects, employees, partners and the media. While this provides great communications benefit, it opens issues related to company secrets, personal abuse, libel and privacy matters. Section 2 will review the mining of text data, focusing on web mining, especially as it relates to blogs. Section 3 will consider issues in web data mining. It will describe some of the tools available in dealing with this special form of data mining. Section 4 will discuss methods of control available. Section 5 will conclude with a discussion of ethical issues involved in web and blog mining.

2

Blog mining

Data mining operates on both numerical data and symbolic data, such as text. Text mining provides a means to wade through the massive glut of prose that is generated by our computer-driven society. Text mining can involve analysis to distill meaning, summarise content through text processing, explore texts through analysis of key words, cluster themes and documents and retrieve information through semantically important words and relevant sentences. The most developed form of text mining is web mining. Web mining obviously implies use of the Worldwide Web, and can apply to blogs as well as any other content. Semantic networks connect the most important concepts from a set of text and establishes the relationships between concepts. Cayzer (2004) addressed the use of Semantic Web tools applied to blogs for knowledge management. Hewlett-Packard Laboratories in Bristol, UK utilised blogs for exchange of information nuggets among those working on organisational problems. Blogs provided a useful means to store, annotate, and share information more reliably than e-mail. A system was developed to capture nuggets of knowledge at a useful level of detail with minimum disruption to user activities. These nuggets could be found throughout the organisation, but the Semantic network enabled integration and global search. Consumers of this knowledge were able to add value by comments on particular nuggets of knowledge. The Semantic blog enabled easy capture of particular bits of knowledge for e-mail, insertion into documents and posting to web pages. Data could be added with time in the appropriate location, and inheritance inferencing for objects was available.

2.1 Web mining There were claimed to be over 110 million World Wide Web servers in 2001, with over 89 million home users in the USA and 33 million Europeans (Pabarskaite, 2003). These numbers are unimportant with regard to their specificity, but e-commerce and e-business have grown rapidly.

Ethical aspects of web log data mining

3

The exponential growth of the World Wide Web has swamped us with material. We are currently drowning in information, but starving for knowledge (Pal et al., 2002). Web mining can aid in location of documents and services on the web. Search engines are fundamental to this type of operation, and provide the initial act needed to conduct more complex forms of web mining. Web mining can also involve information extraction, to include numerical (or text) data found during search activities. Web mining can be applied to servers, to clients (through cookies) and to databases. The data obtained can include numbers, text, image, audio and any other medium in which data is stored. Web mining can involve the focused activity of monitoring user behaviour on a website, with the intent of detecting user behaviour patterns that can be used to not only predict potential customer behaviour, but also to trigger action to improve the prospects of that potential customer buying from the site.

2.2 Web mining taxonomy A common taxonomy for web mining research includes three primary focuses: web content mining, web structure mining and web usage (web log) mining (Pabarskaite and Long, 2003). Web content mining concerns discovery of useful information (and accessing information) from web sources. Within a web business operation, a great deal of customer information is available, as has been demonstrated by Amazon.com and its competitors. It has been useful to keep track of what each customer purchases, and performing cluster analysis on this very large database to enable prediction of other items that type of customer might want. In the case of books, the customer is more likely to purchase a book they have never purchased before, requiring imputation of what a particular customer with a given profile might be interested in Iacobucci et al. (2000). Dealing with such data is challenging, due to the massive size of a data matrix including customers and books. Web content can come in many forms, to include text, image, audio, video and other media. The web itself can be queried. Web-based data collection has been found to be unobtrusive, experimentally-based, externally valid and easily capable of obtaining large sample sizes. Web structure mining seeks to discover the underlying link structures of the web or the relationships between web pages. Web links between sites are identified, and their strength measured, allowing clustering and classification of web pages. Chau and Xu (2007) reported an application identifying hate groups and racists through network analysis and visualisation. Measures can be obtained concerning how easy it is to find related documents on different websites. Data sources include HTML documents and link information. Web usage mining analyses secondary data resulting from human use of the web. Nagivational patterns are usually obtained from web logs, to include navigational patterns and visitor profiles. Information obtained can include what pages were accessed and in what sequence. Web log mining falls within this category, along with website visitor profile analysis. Yen (2003) presented an approach of pattern analysis of blog linkages. Web surfing on blogs was dealt with through an agent-based system by Liu et al. (2004). Furukawa et al. (2006) presented a method to analyse blog network usage by a machine-learning algorithm which enabled them to predict the destination of comments or trackback actions.

4

D.L. Olson

Web usage mining in marketing consists of three broad types of application (Spiliopoulou, 2000): 1

data acquisition

2

cost and quality measures

3

assessment of user/owner satisfaction.

Data acquisition can be invasive, actively contacting the user through questionnaires or other methods. Data acquisition can be non-invasive through recording user behaviour without their interaction. Web server logs are the primary means of obtaining this information, which is often stored in web data warehouses for deep analysis. Once data on customer web behaviour is gathered, it can be used to estimate customer satisfaction in various ways. This can lead to very high numbers of identified relationships. Huang (2007) considered interest measures as a means to help cope with the massive number of rules generated through web log mining. Web content mining would be the most common type of data mining applicable to blogs, but the other types of web mining have been utilised as well. Web structure mining might be of interest to those who are interested in participating in a blog. Web usage mining is usually associated with e-commerce retail, but could have conceivable potential in those participating in blog discussions. Many papers present new methods of analysing web logs with the intent of learning things about the linkages among blog participants (Bodon et al., 2005; Garatti et al., 2004; Huang et al., 2004; Spangler et al., 2006).

3

Blog mining issues

The masses of data available on the web make some form of data mining highly attractive to access information critical to one’s domain of decision making. Web access has been used to better control operations and to monitor customers. The web makes data mining easier through access to large volumes of data, either from external sources or through internal sources such as enterprise resource planning systems. There also is an opportunity to data mine web log activity. The web itself is a tool that provides many benefits to business. It reaches much broader markets much faster. It allows business organisations to communicate with each other for closer coordination and cooperation. It allows businesses to exchange information within the organisation much faster and more efficiently as well. Consumers benefit by being able to compare and shop at their convenience. Like almost every other good thing, this opportunity comes with potential problems. The first is the issue of privacy. The open nature of the web makes almost everything we place on the web available to the world, including competitors, adversaries or governmental agencies. Technology makes it easy to monitor web usage of others. This technology has created an industry selling data, much of it obtained through web mining, usually without subject permission. Web mining also can lead to new forms of discrimination.


5

3.1 The problem Web-based tracking has a number of positive features. User profiling is intended to be helpful to customers (in the sense that businesses try to sell things that customers need). The same is true for market research, with profile data used to design products that theoretically have higher value for consumers. Portal support to ease transactions is also positive, making it more convenient to do business. Web access for businesses to reach customers and to cooperate with other businesses in various chains has proven invaluable. There is, however, potential for web linkage to cause damage through privacy invasion and discrimination. Web monitoring can extend itself to the Big Brother presence that Aldous Huxley and George Orwell warned us about. More threatening at the moment is theft of identity, a growing criminal activity. Table 1 compares potential benefits of web mining with downside issues. Table 1

Web mining benefits and negative aspects

Benefits

Benefit elements

Negative features

Instances of negative impact

Business freedom

Make more profit

Discrimination

Use of web to profile customers illegally

Make more profit

CRM

Privacy abuse

Identity theft

Telemarketing

Nuisance

Statistics: insurance, banking ERP: use of generated data Consumer gets better deals

Intelligent marketing COULD lower prices Consumers see more opportunities

3.2 Web ethics There are a number of ethical issues involved in web mining, to include blogs. Privacy is an obvious issue. But there are other issues, to include price discrimination inherent in yield management types of marketing. The utilitarian view (greatest good for the greatest number – Mill, 1987) is the basis for most logical positivist philosophies shared by most business decision makers. Web mining practitioners realise that some damage to particular individuals may be possible, but their primary focus is on getting a good job done for their organisation. The Rawlsian view places much more emphasis on individual protection (Rawls, 1999, 2001). This view is more conservative (in the sense of protecting individual interests). This approach is held by many, although it can lead to the extreme of paralysing business. The pragmatic view (decisions are best that lead to the best outcomes) provides a compromise (James, 1991). There are an endless number of other philosophical refinements we are not in a position to relate. Web mining is a technology with a potential to do a lot of good. It also can lead to negative impact on individuals. MacDougall (2005) argued that news blogs are becoming the default political news source for a growing proportion of the population. Blogs can become a new form of open source, open access partisan press. They also can be used in their worst sense as a

6

D.L. Olson

mass-mediated spectacle that can lead to insulated enclaves of intolerance-based on illusion, rumour and political innuendo. Use of blogs for public relations involves a number of ethical and legal considerations. Legal issues cited by Lazar (2005) were: •

disclosing information risking a firm’s competitive position or violating laws and regulations

•

viewing blogs as advertising vehicles

•

paying for blog content, and thus becoming paid endorsers rather than independent commentators

•

controlling content, and thus opening an organisation up to liability for content.

3.3 Privacy issues Mason (1986) was a very early predictor of the danger of computer infringement on personal privacy. He warned that the twin drivers of ability and value would make it likely that someone would take advantage of access to personal systems. Data mining is a very valuable tool, but requires detailed data about customers. The more details that can be gathered, the more understanding of consumer preferences is gained. Maury and Kleiner (2002) identified that a major revenue stream for web-based businesses is to sell such information. Sama and Shoaf (2002) also discussed the privacy dilemma of e-business. Pragmatic accommodation of the needs of both consumers and e-business data miners is possible, but if care is not taken, we can expect improper invasion of the privacy of e-business customers. Levine (2003) reviewed a number of privacy factors, shown in Table 2. Table 2

Web privacy factors

Factor

Function

Utilitarian

Rawlsian

Freedom

Protection from social ostracism, punishment, discrimination and criticism

More focus on society than individual

Individual is the focus

Informed consent

Permission required

Not central

Strong importance

Personality development

Private reflection and experimentation for growth

Not central

Individual focus

Avoidance of discrimination

Laws insufficient for personal protection

Hinders society

Individual protection

Separation

Market, family, religion, politics, other relationships separated

Complete data access helps planning

Individual protection over full disclosure

It can be important that personal information be protected through aggregation. Even then, it is dangerous to make it easy to identify the specific profile of each individual (which after all is a necessary condition for effective use of data mining results). There are a number of technical tools developed to support web mining (Calkins, 2002). Portals regulate the distribution of information given to users, and can generate actions to enhance sales volume. Site trackers collect information, often without


7

the knowledge of the customer. Profilers are sets of data about individual web surfing and buying habits. This data is obtained through trackers, which can be placed on a web page to monitor visitor time and keywords. The existence of profilers is given by an icon appearing on the user’s screen. Search bots gather information on internet users and take actions such as sending advertisements. In addition to having agent capabilities, search bots are covert, with no introduction given on the user’s screen. There also is the ability to establish hyperlinks to customer websites, to include deep-linking (which allow website providers to link to other sites, without crediting those other sites for their content), meta-tagging (posting keywords on their website to falsely lure search engines), framing and in-line linking (Maury and Kleiner, 2002). Cookies provide website owners the ability to gather detailed data, and enter into data mining, which can be viewed as a service to the customer, although cynics might be more impressed by the service to the web retailer’s bottom line. The Rawlsian philosophy of distributive justice would argue for giving the customer a fair opportunity to veto the presence of the cookie. A more pragmatic view would allow web retailers to do what their competitors do (or what the legal system allows them to do). The extent of internet privacy invasion through portals is impressive. Peace et al. (2002) reported a survey of the 100 websites with the most volume. Of these 100 sites, 86 used cookie technology, 35 of them allowing advertisers to set cookies. Of these, 18 did not provide a privacy policy.

3.4 Discrimination Discrimination can occur through use of consumer profiles that exclude classes of consumers from market participation or limit access to essential information (Danna and Gandy, 2002). A key concept in customer segmentation, a fundamental data mining practice, is to separate customers into groups, treating each group differently. After identifying price sensitivity by group through data mining, the seller may offer special prices to specific customer groups. Implementation of price discrimination is often applied by banks, which charge extra for services that they would prefer customers not to use (such as encouraging the use of ATMs rather than more expensive human tellers). A variant form of price discrimination would be product versioning, implemented by airlines that offer different priced tickets depending upon the degree of restrictions. In this case, customers can self-select, by paying additional for removal of various restrictions, such as weekend travel or the ability to change itineraries. The film industry applies windowing, another form of price discrimination. Through windowing, the same film can be sold in many different channels, such as theatres, DVDs, cable TV, network TV or syndication. Usually versioning is implemented by staggering release times of the film in various media to maximise revenue. Since consumers can select the form of good they want (at different prices), these forms of discrimination are deemed acceptable by most. Yet another form of technologically-enabled discrimination has been labelled web-lining by Danna and Gandy. One example cited was Kozmo.com, an internet-based home delivery service. Kozmo.com defined its service area on the basis of internet usage. This was considered a form of web-lining, because classes of customers are excluded based on characteristics of neighbourhoods rather than on individual characteristics.

8

D.L. Olson

Kozmo thus did not make its services available to predominately African-American neighbourhoods. Kozmo was forced out of business in litigation-based on identification of patterns of discrimination.

4

Methods of control

Alternative philosophies would suggest different means to control web mining (and blog mining). The Worldwide Web was designed as an uncontrolled entity, encouraging exchange of information. It does that fairly well, and it would be a shame to lose the many benefits of being able to communicate freely throughout the world. These benefits include the positive outcomes of web mining, to include broader exposure to products, the ability to tailor products for individual tastes, and the potential for lower marketing costs (which in theory could be passed on to consumers). However, totally unrestricted blog mining can infringe upon the rights of others. Telemarketing is a benefit that many of us would rather forego. There also is great potential for abuse of privacy. The more serious forms of privacy invasion include malicious identity theft. There are also some aspects of our lives (medical histories, tax records, court dockets) that we may not want shared with the rest of the world in an uncontrolled manner. The final issue involves the potential for harmful discrimination. Web mining offers its users the opportunity to select their customers. The primary idea behind bank loan analysis (and insurance) through data mining is to avoid giving loans (or policies) to those less likely to pay (or less likely to make claims). There is a fine grey line between identification of bad credit risks and fair treatment. This line has been defined by litigation over many years. More rigid control of blog mining is risky. Firstly, there is the problem of who has authority over the internet. The web’s international nature probably provides it some assurance of freedom, but with each country imposing some control over internal traffic (with varying degrees of thoroughness). Secondly, granting the ability to any agency to more rigidly control the web would impose the type of access that we wanted to avoid in the first place (who polices the police). The way in which control is currently imposed is systemic and market-driven. Publicity is the most effective way to modify behaviour that steps over the line (Garfinkel, 2000; Schneier, 2000). Publicity usually works, as businesses operate in a climate where there is perceived value in avoiding unhappy customers. Pressure from watchdog organisations modified the behaviour of DoubleClick cited earlier. If publicity does not work, litigation is a more severe (and expensive) resort. Litigation worked to modify the behaviour of Kozmo.com, cited earlier. Publicity is the market-driven element, and litigation the systemic corrective for more involved cases. In a pragmatic sense, these control devices seem to work.

5

Conclusions

Web log mining usually involves mining of text. Text mining is much more challenging than quantitative data mining. Numbers are much easier to process by computer than text. However, many useful tools have been developed to find patterns in text that lead to


9

human knowledge. This can include learning about the intentions of competitors, as well as study of general customer behaviour. Web mining has grown rapidly. It is characterised by at least three broad categories of application. Firstly, the web is a source of quantitative data that can be used along with other sources of data in data mining. Secondly, the behaviour of web users at e-commerce or e-business websites can be monitored with the intent of improving the profitability of such websites. Thirdly, the web includes massive quantities of text, which can be used in text mining. Tools to search the web were among the first applications of technology to web analysis. There have been many artificial intelligence tools that have provided support to web mining. In the future, these tools may lead to mining of multimedia data as well as text to include that found on blogs. Data mining is a potentially useful tool, capable of doing a lot of good not only to business, but also to the medical field and to government. It does bring with it dangers. How can we best protect ourselves, especially in the area of business data mining? There are a number of options. Strict control of data usage through governmental regulation was proposed by Garfinkel (2000). Many large database projects were ultimately stopped. Those involving government agencies were successfully stopped by public exposure, with negative outcry leading to cancellation of the National Data Centre and the Social Security Administration projects. A system with closely held information by credit bureaus in the 1960s was only stopped after governmental intervention, to include passage of new laws. Times have changed, seeing a more responsive attitude toward consumers on the part of business. Innovative data mining efforts by Lotus/Equifax and by Lexis-Nexis were quickly stopped by public pressure alone. Public pressure seems to be quite effective in providing some control over potential data mining abuses. If that fails, litigation is available (although slow in effect). It is necessary for us to realise what businesses can do with data. There never will be a perfect system to protect us, and we need to be vigilant. However, too much control can be dangerous too, inhibiting the ability of business to provide better products and services through data mining. Garfinkel prefers more governmental intervention. We would prefer less governmental intervention, but reliance upon publicity, and if necessary, the legal system. Web mining is a natural extension of the use of technology. The vast majority of website content is predominately not of interest to any one individual. Search engines display a great deal of intelligence and cunning in their ability to identify content related to specific keys. But we all have used key words yielding thousands of contacts that we did not want, and sometimes have difficulty finding useful content. Web mining provides more efficient ability to identify relevant content. Along with this technological ability, comes the risk of abuse. Control would be best accomplished if it were naturally encouraged by systemic relationships. The first systemic means of control is publicity. Should those adopting questionable practices persist, litigation is slow, costly, but ultimately effective means of system correction. However, before taking drastic action, if the system works, it would be best not to fix it. The best measure electronic retailers can take is probably the pragmatic Calpurnian solution, do not do anything that will cause customers to suspect that their rights are being violated.

10

D.L. Olson

References Bodon, F., Kouris, I.N., Makris, C.H. and Tsakalidis, A.K. (2005) ‘Automatic discovery of locally frequent itemsets in the presence of highly frequent itemsets’, Intelligent Data Analysis, Vol. 9, pp.83–104. Calkins, M. (2002) ‘Rippers, portal users, and profilers: three web-based issues for business ethicists’, Business and Society Review, Vol. 107, No. 1, pp.61–75. Cayzer, S. (2004) ‘Semantic blogging and decentralized knowledge management’, Communications of the ACM, Vol. 47, No. 12, pp.47–52. Chau, M. and Xu, J. (2007) ‘Mining communities and their relationships in blogs: a study of online hate groups’, International Journal of Human-Computer Studies, Vol. 65, pp.57–70. Chua, A.Y.K., Kaynak, S. and Schubert, S.B.F. (2007) ‘An analysis of the delayed response to Hurricane Katrina through the lens of knowledge management’, Journal of the American Society for Information Science and Technology, Vol. 58, No. 3, pp.391–403. Danna, A. and Gandy Jr., O.H. (2002) ‘All that glitters is not gold: digging beneath the surface of data mining’, Journal of Business Ethics, Vol. 40, pp.373–386. Dearstyne, B.W. (2005) ‘Blogs: the new information revolution?’ The Information Management Journal, Vol. 39, No. 5, pp.38–44. Furukawa, T., Matsuzawa, T., Matsuo, Y., Uchiyama, K. and Takeda, M. (2006) ‘Analysis of user relations and reading activity in Weblogs’, Electronics and Communications in Japan Part 1, Vol. 89, No. 12, pp.1258–1266. Garatti, S., Savaresi, S.M., Bittanti, S. and La Brocca, L. (2004) ‘On the relationships between user profiles and navigation sessions in virtual communities: a data-mining approach’, Intelligent Data Analysis, Vol. 8, pp.579–600. Garfinkel, S. (2000) Database Nation: The Death of Privacy in the 21st Century, Cambridge, MA: O’Reilly & Associates. Gordon, S. (2006) ‘Rise of the blog’, IIE Review, Vol. 52, No. 3, pp.32–35. Huang, X. (2007) ‘Comparison of interestingness measures for web usage mining: an empirical study’, International Journal of Information Technology and Decision Making, Vol. 6, No. 1, pp.15–41. Huang, X., Peng, F., An, A. and Schuurmans, D. (2004) ‘Dynamic web log session identification with statistical language models’, Journal of the American Society for Information Science and Technology, Vol. 55, No. 14, pp.1290–1303. Iacobucci, D., Arabie, P. and Bodapati, A. (2000) ‘Recommendation agents on the internet’, Journal of Interactive Marketing, Vol. 14, No. 3, pp.2–11. James, W. (1991) Pragmatism, New York: Prometheus Books (original 1907). Lazar, B.A. (2005) ‘Marketing blogs present new legal issues’, Marketing News, Vol. 39, No. 7, p.6. Levine, P. (2003) ‘Information technology and the social construction of information privacy: comment’, Journal of Accounting and Public Policy, Vol. 22, pp.281–285. Liu, J., Zhang, S. and Yang, J. (2004) ‘Characterizing web usage regularities with information foraging agents’, IEEE Transactions on Knowledge and Data Engineering, Vol. 16, No. 5, pp.566–584. MacDougall, R. (2005) ‘Identity, electronic ethos, and blogs: a technological analysis of symbolic exchange on the new news medium’, American Behavioral Scientist, Vol. 49, No. 4, pp.575–599. Mason, R. (1986) ‘Four ethical issues of the information age’, MIS Quarterly, Vol. 10, No. 1, pp.5–12. Maury, M.D. and Kleiner, D.S. (2002) ‘E-commerce, ethical commerce?’ Journal of Business Ethics, Vol. 36, Nos. 1–2, pp.21–31. Mill, J.S. (1987) Utilitarianism, New York: Prometheus Books (original 1863).


11

Pabarskaite, Z. (2003) ‘Decision trees for web log mining’, Intelligent Data Mining, Vol. 7, pp.141–154. Pabarskaite, Z. and Long, J.A. (2003) ‘Decision trees for pernicious pages detection’, International Journal on Artificial Intelligence Tools, Vol. 12, No. 4, pp.527–537. Pal, S.K., Talwar, V. and Mitra, P. (2002) ‘Web mining in soft computing framework: relevance, state of the art and future directions’, IEEE Transactions on Neural Networks, Vol. 13, No. 5, pp.1163–1177. Peace, A.G., Weber, J., Hartzel, S. and Nightingale, J. (2002) ‘Ethical issues in ebusiness: a proposal for creating the ebusiness principles’, Business and Society Review, Vol. 107, No. 1, pp.41–60. Rawls, J. (1999) A Theory of Justice, Revised Edition, Cambridge, MA: Harvard University Press (original 1971). Rawls, J. (2001) Justice As Fairness: A Restatement, Cambridge, MA: The Belknap Press. Sama, L.M. and Shoaf, V. (2002) ‘Ethics on the web: applying moral decision-making to the new media’, Journal of Business Ethics, Vol. 36, Nos. 1–2, pp.93–103. Schneier, B. (2000) Secrets and Lies: Digital Security in a Networked World, New York: Wiley and Sons. Spangler, W.S., Kreulen, J.T. and Newswanger, J.F. (2006) ‘Machines in the conversation: detecting themes and trends in informal communication streams’, IBM Systems Journal, Vol. 45, No. 4, pp.785–799. Spiliopoulou, M. (2000) ‘Web usage mining for web site evaluation’, Communications of the ACM, Vol. 43, No. 8, pp.127–134. Yen, S-J. (2003) ‘An efficient approach for analyzing user behaviors in a web-based training environment’, International Journal of Distance Education Technologies, Vol. 4, pp.55–71.

Ethical aspects of web log data mining David L ... - Semantic Scholar

Ethical aspects of web log data mining David L ... - Semantic Scholar

Suggest Documents

Ethical aspects of web log data mining David L. Olson - CiteSeerX

Different Aspects of Web Log Mining - Semantic Scholar

Different Aspects of Web Log Mining

Different Aspects of Web Log Mining - CiteSeerX

log data preparation for mining web usage patterns - Semantic Scholar

Frequent Pattern Mining in Web Log Data

Efficient Web Log Mining for Product Development - David Woon

Data Mining using Web Spiders - Semantic Scholar

Web Log Mining in Search Engines - Semantic Scholar

Constraint-based Web Log Mining for Analyzing ... - Semantic Scholar

Exploiting Web Log Mining for Web Cache

analysing server log file using web log expert in web data mining

Data Mining in Semantic Web Data - ijcte

1 On Analyzing Web Log Data: A Parallel ... - Semantic Scholar

The Laborious Way From Data Mining to Web Log ... - Semantic Scholar

Finding Generalized Path Patterns for Web Log Data Mining *

Finding Generalized Path Patterns for Web Log Data Mining - DElab

Efficient Frequent Pattern Mining on Web Log Data - CiteSeerX

CiteSeerX â LODAP: A LOg DAta Preprocessor for mining Web ...

Finding Generalized Path Patterns for Web Log Data Mining - CiteSeerX

sequential pattern mining from web log data - IJESAT

application of decisional dna in web data mining - Semantic Scholar

Data Mining Technology for the Evaluation of Web ... - Semantic Scholar

Web Usage Mining - Semantic Scholar

Ethical aspects of web log data mining David L ... - Semantic Scholar