Petra Perner (Ed.)
Machine Learning and Data Mining in Pattern Recognition 5th International Conference, MLDM 2007 Leipzig, Germany, July 18-20, 2007 Proceedings
Springer
Table of Contents
Invited Talk
Data Clustering: User's Dilemma (Abstract) Anil K. Jain
1
Classification
On Concentration of Discrete Distributions with Applications to Supervised Learning of Classifiers Magnus Ekdahl and Timo Koski
2
Comparison of a Novel Combined ECOC Strategy with Different Multiclass Algorithms Together with Parameter Optimization Methods Marco Hülsmann and Christoph M. Friedrich
17
Multi-source Data Modelling: Integrating Related Data to Improve Model Performance Paul R. Trundle, Daniel C Neagu, and Qasim Chaudhry
32
An Empirical Comparison of Ideal and Empirical ROC-Based Reject Rules Claudio Marrocco, Mario Molinara, and Francesco Tortorella
47
Outlier Detection with Kernel Density Functions Longin Jan Latecki, Aleksandar Lazarevic, and Dragoljub Pokrajac
61
Generic Probability Density Function Reconstruction for Randomization in Privacy-Preserving Data Mining Vincent Yan Fu Tan and See-Kiong Ng
76
An Incremental Fuzzy Decision Tree Classification Method for Mining Data Streams Tao Wang, Zhoujun Li, Yuejin Yan, and Huowang Chen
91
On the Combination of Locally Optimal Pairwise Classifiers Gero Szepannek, Bernd Bischl, and Claus Weihs
104
Feature Selection, Extraction and Dimensionality Reduction
An Agent-Based Approach to the Multiple-Objective Selection of Reference Vectors Ireneusz Czarnowski and Piotr Jędrzejowicz
117
On Applying Dimension Reduction for Multi-labeled Problems Moonhwi Lee and Cheong Hee Park
131
Nonlinear Feature Selection by Relevance Feature Vector Machine Haibin Cheng, Haifeng Chen, Guofei Jiang, and Kenji Yoshihira
144
Affine Feature Extraction: A Generalization of the Fukunaga-Koontz Transformation Wenbo Cao and Robert Haralick
160
Clustering
A Bounded Index for Cluster Validity Sandro Saitta, Benny Raphael, and Ian F. C. Smith
174
Varying Density Spatial Clustering Based on a Hierarchical Tree Xuegang Hu, Dongbo Wang, and Xindong Wu
188
Kernel MDL to Determine the Number of Clusters Ivan O. Kyrgyzov, Olexiy O. Kyrgyzov, Henri Maître, and Marine Campedel
203
Critical Scale for Unsupervised Cluster Discovery Tomoya Sakai, Atsushi Imiya, Takuto Komazaki, and Shiomu Hama
218
Minimum Information Loss Cluster Analysis for Categorical Data Jiří Grim and Jan Hora
233
A Clustering Algorithm Based on Generalized Stars Airel Pérez Suárez and José E. Medina Pagola
248
Support Vector Machine
Evolving Committees of Support Vector Machines D. Valincius, A. Verikas, M. Bacauskiene, and A. Gelzinis
263
Choosing the Kernel Parameters for the Directed Acyclic Graph Support Vector Machines Kuo-Ping Wu and Sheng-De Wang
276
Data Selection Using SASH Trees for Support Vector Machines Chaofan Sun and Ricardo Vilalta
286
Dynamic Distance-Based Active Learning with SVM Jun Jiang and Horace H.S. Ip
296
Transductive Inference
Off-Line Learning with Transductive Confidence Machines: An Empirical Evaluation Stijn Vanderlooy, Laurens van der Maaten, and Ida Sprinkhuizen-Kuyper
310
Transductive Learning from Relational Data Michelangelo Ceci, Annalisa Appice, Nicola Barile, and Donato Malerba
324
Association Rule Mining
A Novel Rule Ordering Approach in Classification Association Rule Mining Yanbo J. Wang, Qin Xin, and Frans Coenen
339
Distributed and Shared Memory Algorithm for Parallel Mining of Association Rules J. Hernández Palancar, O. Fraxedas Tormo, J. Feston Cárdenas, and R. Hernández Leon
349
Mining Spam, Newsgroups, Blogs
Analyzing the Performance of Spam Filtering Methods When Dimensionality of Input Vector Changes J.R. Mendez, B. Corzo, D. Glez-Pena, F. Fdez-Riverola, and F. Diaz
364
Blog Mining for the Fortune 500 James Geller, Sapankumar Parikh, and Sriram Krishnan
379
A Link-Based Rank of Postings in Newsgroup Hongbo Liu, Jiahai Yang, Jiaxin Wang, and Yu Zhang
392
Intrusion Detection and Networks
A Comparative Study of Unsupervised Machine Learning and Data Mining Techniques for Intrusion Detection Reza Sadoddin and Ali A. Ghorbani
404
Long Tail Attributes of Knowledge Worker Intranet Interactions Peter Geczy, Noriaki Izumi, Shotaro Akaho, and Kôiti Hasida
419
A Case-Based Approach to Anomaly Intrusion Detection Alessandro Micarelli and Giuseppe Sansonetti
434
Sensing Attacks in Computers Networks with Hidden Markov Models Davide Ariu, Giorgio Giacinto, and Roberto Perdisci
449
Frequent and Common Item Set Mining
FIDS: Monitoring Frequent Items over Distributed Data Streams Robert Füller and Mehmed Kantardzic
464
Mining Maximal Frequent Itemsets in Data Streams Based on FP-Tree Fujiang Ao, Yuejin Yan, Jian Huang, and Kedi Huang
479
CCIC: Consistent Common Itemsets Classifier Yohji Shidara, Atsuyoshi Nakamura, and Mineichi Kudo
490
Mining Marketing Data
Development of an Agreement Metric Based Upon the RAND Index for the Evaluation of Dimensionality Reduction Techniques, with Applications to Mapping Customer Data Stephen France and Douglas Carroll
499
A Sequential Hybrid Forecasting System for Demand Prediction Luis Aburto and Richard Weber
518
A Unified View of Objective Interestingness Measures Celine Hebert and Bruno Cremilleux
533
Comparing State-of-the-Art Collaborative Filtering Systems Laurent Candillier, Frank Meyer, and Marc Boulle
548
Structural Data Mining
Reducing the Dimensionality of Vector Space Embeddings of Graphs Kaspar Riesen, Vivian Kilchherr, and Horst Bunke
563
PE-PUC: A Graph Based PU-Learning Approach for Text Classification Shuang Yu and Chunping Li
574
Efficient Subsequence Matching Using the Longest Common Subsequence with a Dual Match Index Tae Sik Han, Seung-Kyu Ko, and Jaewoo Kang
585
A Direct Measure for the Efficacy of Bayesian Network Structures Learned from Data Gary F. Holness
601
Image Mining
A New Combined Fractal Scale Descriptor for Gait Sequence Li Cui and Hua Li
616
Palmprint Recognition by Applying Wavelet Subband Representation and Kernel PCA Murat Ekinci and Murat Aykut
628
A Filter-Refinement Scheme for 3D Model Retrieval Based on Sorted Extended Gaussian Image Histogram Zhiwen Yu, Shaohong Zhang, Hau-San Wong, and Jiqi Zhang
643
Fast-Maneuvering Target Seeking Based on Double-Action Q-Learning Daniel C.K. Ngai and Nelson H.C. Yung
653
Mining Frequent Trajectories of Moving Objects for Location Prediction Mikolaj Morzy
667
Categorizing Evolved CoreWar Warriors Using EM and Attribute Evaluation Doni Pracner, Nenad Tomasev, Milos Radovanovic, and Mirjana Ivanovic
681
Restricted Sequential Floating Search Applied to Object Selection J. Arturo Olvera-Lopez, J. Francisco Martinez-Trinidad, and J. Ariel Carrasco-Ochoa
694
Color Reduction Using the Combination of the Kohonen Self-Organized Feature Map and the Gustafson-Kessel Fuzzy Algorithm Konstantinos Zagoris, Nikos Papamarkos, and Ioannis Koustoudis
703
A Hybrid Algorithm Based on Evolution Strategies and Instance-Based Learning, Used in Two-Dimensional Fitting of Brightness Profiles in Galaxy Images Juan Carlos Gomez and Olac Fuentes
716
Gait Recognition by Applying Multiple Projections and Kernel PCA Murat Ekinci, Murat Aykut, and Eyup Gedikli
727
Medical, Biological, and Environmental Data Mining
A Machine Learning Approach to Test Data Generation: A Case Study in Evaluation of Gene Finders Henning Christiansen and Christina Mackeprang Dahmcke
742
Discovering Plausible Explanations of Carcinogenecity in Chemical Compounds Eva Armengol
756
One Lead ECG Based Personal Identification with Feature Subspace Ensembles Hugo Silva, Hugo Gamboa, and Ana Fred
770
Classification of Breast Masses in Mammogram Images Using Ripley's K Function and Support Vector Machine Leonardo de Oliveira Martins, Erick Correa da Silva, Aristófanes Correa Silva, Anselmo Cardoso de Paiva, and Marcelo Gattass
784
Selection of Experts for the Design of Multiple Biometric Systems Roberto Tronci, Giorgio Giacinto, and Fabio Roli
795
Multi-agent System Approach to React to Sudden Environmental Changes Sarunas Raudys and Antanas Mitasiunas
810
Equivalence Learning in Protein Classification Attila Kertesz-Farkas, András Kocsor, and Sándor Pongor
824
Text and Document Mining
Statistical Identification of Key Phrases for Text Classification Frans Coenen, Paul Leng, Robert Sanderson, and Yanbo J. Wang
838
Probabilistic Model for Structured Document Mapping Guillaume Wisniewski, Francis Maes, Ludovic Denoyer, and Patrick Gallinari
854
Application of Fractal Theory for On-Line and Off-Line Farsi Digit Recognition Saeed Mozaffari, Karim Faez, and Volker Märgner
868
Hybrid Learning of Ontology Classes Jens Lehmann
883
Discovering Relations Among Entities from XML Documents Yangyang Wu, Qing Lei, Wei Luo, and Harou Yokota
Author Index
911