Finding rules in data

Recommend Documents

Finding Association Rules From Quantitative Data Using Data ...

Keywords: data mining, booleanization, association rules, dependency rules. 1. ... and Motwani broadened the definition of association rules calling these rules.

Finding rules for audit opinions prediction through data mining methods

from data mining methods. The system receives the data from financial reports and identifies the type of audit opinions. This approach combines support.

Finding Groups in Data

of data (including interval-scaled and binary variables, as well as similarity data) and explains ...... (of course, the program may also be put in drive B or on a hard disk C). The user must ...... An interesting feature of Mac-. Queen's method is .

Finding Unusual Review Patterns Using Unexpected Rules

problem as finding unexpected rules. The technique is domain independent. Using the technique, we analyzed an Amazon.com review dataset and found many ...

Finding Optimal Apertures in Kepler Data - IOPscience

Oct 20, 2016 - and the Kepler Combined Differential Photometric Precision (CDPP) ... The method further allows us to identify errors in the Kepler and K2 input.

Finding Structure in Big Data - People.csail.mit.edu

Finding Structure in Big Data. Ankur Moitra, IAS. May 4, 2012. Ankur Moitra (IAS). Recommendations. May 4, 2012 ...

Finding Data Races in C++ - Google Code

If you've got some multi-threaded code, you may have data races in it. Data races ... Valgrind is a binary translation f

Finding developmental groups in acquisition data - CiteSeerX

The most famous example of this approach to stages is Brown' (1973) ... MLU stages according to Brown (1973) ...... Review of Roger Brown, A first language.

Finding Trading Patterns in Stock Market Data

For example, daily charts might show price bars that record the opening, closing, maximum, and minimum price for a period of trading. A simple graph of closing ...

Finding Trading Patterns in Stock Market Data

strategies for finding rules or patterns in data. These .... its high. Market opens and closes near its high but trades lower during the period. ..... a probability of 0.5.

Finding Consistent Clusters in Data Partitions - CiteSeerX

Best fit to some criteria, no single algorithm can ade- quately handle all sorts of cluster shapes and structures; when considering hy- brid structure data sets, ...

Min-Apriori: An Algorithm for Finding Association Rules in ... - CiteSeerX

Dec 31, 1997 - Additional support was pro- vided by the IBM ... Let C be a subset of I, then we define support of C with respect to T to be: support(C) = â iâT.

Outliers and Data Mining: Finding Exceptions in Data - Computer ...

(especially in an interactive data mining application), there may be cases where we are ... mining applications involving large or dynamic datasets, is the stability ...

Data Sharing And Rules Intersection In Cooperative

The network restriction is reducing to make the data sharing to ... Heterogeneous network, Rules intersection, firewall, ... network devices computer software used ... multiple computers communicate varied type conversations have course example file

Association Rules and Data Mining in Hospital

Mar 3, 1998 - hospital infection control and public health surveillance. Next, they define ... injury records that originate in clinics and hospitals.''2. It is this ...

Effictiveness of data correction rules in process-produced data - iab

One of the most striking features of recent decades has been the rising ... neither how effective these correction rules are nor whether any of them have ...... Bernhard, Sarah, Christian Dressel, Bernd Firzenberger, Daniel Schnitzlein, and .... und

SBE Rules - NCSBE Public Data

May 26, 2017 - Sending the protest and attachments, by e-mail, on the same day it was ..... process is not automatic), i

Finding Interesting Rules from Large Sets of Discovered ... - CiteSeerX

discovering association rules from large collections of data. The number of discovered rules can, however, be so large that browsing the rule set and nding ...

Download ebook The Golden Rules: Finding World ... - Google Sites

Online PDF The Golden Rules: Finding World-Class Excellence in Your Life and Work, Read PDF The Golden Rules: Finding Wo

Finding Valuein Big Data - Aaltodoc - Aalto-yliopisto

Finding Value in Big Data. Statistical Analysis of Large Data Sets with. Applications in Electric Power Systems. Matti Koivisto. A doctoral dissertation completed ...

Finding and Analyzing Data on FEC.gov

you have any comments or suggestions for this guide, please email me: .... should make sure your story and any related g

Finding Decision Rules with Genetic Algorithms - Semantic Scholar

Sep 12, 2005 - goals uniformly, it seems a promising start. Thanks to Steve Kimbrough, Scott Moore, and Jerry Lohse for the discussions on this topic.

Finding Reusable Data Structures - Semantic Scholar

scale, object-oriented applications is the frequent creation of data structures (by the ... as Java) that does not suppo

A GUIDE TO FINDING EXISTING DATA

Find out more at croplife.org/data-transparency. Helping Farmers Grow. Most health and environmental safety information

Finding rules in data

Download PDF

1 downloads 0 Views 201KB Size Report

Comment

(1 + log(fdj (ti))) Ã log M. fD (ti) if ti â dj, minthâdj whj Ã log(M/fD (ti)). 1+log(M/fD (ti)) if ti â U(R,dj) \ dj. 0 if ti â U(R,dj). The vector length normalization is then ...

Finding rules in data Tu-Bao Ho1,2 1

2

Institute of Information Technology, Vietnamese Academy of Science and Technology, 18 Hoang Quoc Viet, Hanoi, Vietnam Japan Advanced Institute of Science and Technology (JAIST) 1-1 Asahidai, Nomi, Ishikawa, Japan [email protected]

Abstract. In the first year of my preparation for doctor thesis at INRIA in the group of Edwin, I worked on the construction of an inference engine and a knowledge base, by consulting various group members, for building an expert system guiding the data analysis package SICLA of the group. One day, Edwin asked me whether one can automatically generate rules for expert systems from data, and I started my new research direction. Since that time, my main work has been machine learning, especially finding rules in data. This paper briefly presents some learning methods we have developed.

1

Introduction

Twenty years ago, machine learning was in its infancy with few work and applications. The wave of artificial intelligence (AI) in early of years 1980s has fostered the development of machine learning. From the joined work on conceptual clustering with Michalski, Edwin found his interest in this young field of machine learning (Michalski et al., 1983). As a doctor candidate in his group at that time, he suggested if I can work on finding new ways to generate rules for expert systems from data, instead of working as knowledge engineers who try to acquire knowledge from human experts. There have been a great progress in the field of machine learning. It has become an established area with sound foundation, rich techniques and various applications. Machine learning becomes one of the most active areas in computer science. This paper briefly presents some of our main work in machine learning since those days in the group of Edwin, from supervised learning (Ho et al., 1988), (Nguyen and Ho, 1999), (Ho and Nguyen, 2003) to unsupervised learning (Ho, 1997), (Ho and Luong, 1997), and some recent work on text clustering (Ho and Nguyen, 2002), (Ho et al., 2002), (Le and Ho, 2005), bioinformatics (Pham and Ho, 2007), kernel methods (Nguyen and Ho, 2007).

388

2

Tu-Bao Ho

Rule induction from supervised data

In this section we briefly present three supervised learning methods of CABRO1 (Ho et al., 1988), CABRO2 (Nguyen and Ho, 1999), and LUPC (Ho and Nguyen, 2003). 2.1

CABRO1

CABRO1 (Construction Automatique a Base de Regles a partir d’Observation) is a method of rule induction from supervised data. Let D1 , D2 , ..., Dp be p finite domains and D1 × D2 × ... × Dp . Elements of D are called objects and denoted by ω = (d1 , d2 , ..., dp ) where dj ∈ Dj for j ∈ J = {1, 2, ..., p}. A variable-value pair (Xj , dj ) defines an elementary assertion AXj =dj that determines the set ωXj =dj of objects of D which have the value dj ∈ {dj1 , dj2 , ..., djq } for the variable Xj : AXj =dj : D −→ {true, f alse}, ω = (d1 , ..., dp ) 7→ AXj =dj (ω) = true, if Xj (ω) = dj and ω = (d1 , ..., dp ) 7→ AXj =dj (ω) = f alse, if Xj (ω) 6= dj . V We consider an0 assertion as a conjunctionVof elementary assertions: A = (Xj , dj ), j ∈ J ⊆ J and dj ∈ Dj , where denotes the logical conjunction. An assertion A is a Boolean function from D −→ {true, f alse}, and it is also the identification function for the set: ωA = {ω ∈ D | A(ω) = true}. Variables correspond to j ∈ J 0 are said to be tied to the assertion. Variables correspond to j ∈ J \ J 0 are said to be free from the assertion. Number of tied variables is called length of the assertion. One says also that assertion A covers the set ωA . Assertion A is said to be more general than assertion B iff ωB ⊆ ωA . Assertion A is said to be better than assertion B iff card(ωA ) > card(ωB ). A is a representative assertion generated from an object ω ∈ E if A is one of the best assertions formed by elementary assertions generated from ω. Denote < =