the european conference on machine learning ... - ECML PKDD 2017

THE EUROPEAN CONFERENCE ON MACHINE LEARNING & PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES SEPTEMBER 18 - 22, 2017 SKOPJE, MACEDONIA

2

ECML PKDD 2017 SKOPJE MACEDONIA

ДОБРЕДОЈДОВТЕ Dear Colleagues, Welcome to Macedonia, welcome to Skopje, and welcome to ECML PKDD 2017! This is the 10th edition of ECML PKDD as a single conference. While ECML and PKDD have been organized jointly since 2001, they only officially merged in 2008. Following the growth of the field and the community, the conference has diversified and grown over the last decade in terms of content, form and attendance. ECML PKDD attracted nearly 600 participants this year, keeping stable the record number of participants from last year. We are proud to present a rich scientific program, including high-profile keynotes and many technical presentations in different tracks (research, journal, applied data science, nectar and demo), fora (EU projects, PhD), workshops, tutorials and discovery challenges. We are also happy to offer an extensive social program, including a reception, conference banquet and catered poster sessions. We hope that this will provide ample opportunities for exciting exchanges of ideas, pleasurable networking and (simply) fun. Many people have put in countless hours of work to make this event happen: To them we express our heartfelt thanks: This includes the organization team listed at the end of the brochure, i.e., the program chairs of the different tracks and fora, workshops and tutorials, and discovery challenges, as well as the awards committee, production and public relations chairs, local organizers, sponsorship chairs and proceedings chairs. In addition, we’d like to thank the program committees of the different conference tracks, the organizers of the workshops and their respective committees, the Cankarjev Dom congress agency, and the student volunteers. Furthermore, many thanks to our sponsors for their generous financial support. Finally, thanks to all authors that submitted their work for presentation at ECML PKDD 2017. Last, but certainly not least, we would like to thank you for coming to the conference and helping us make it a memorable event. Enjoy the conference and your stay in Skopje! Sašo Džeroski and Michelangelo Ceci, general chairs On behalf of the ECML PKDD 2017 organizing team


3

200 m

4


Aleksandar Palace

Main Conference Venue Address Bul. 8th of September no.15, 1000 Skopje, Macedonia [ Bul. 8-mi Septemvri br.15, 1000 Skopje, Makedonija ] 42.010818, 21.405563

Registration Desk

Sunday, September 17 from 17:00 to 20:00 Monday, September 18 - Friday, September 22 from 08:00 to 17:40

Monday, September 18 & Friday, September 22 from 09:00 to 17:40

Tuesday, September 19 - Thursday, September 21 from 09:00 to 10:00




Tuesday, September 19 & Wednesday, September 20 from 14:00 to 15:40

Tuesday, September 19 & Wednesday, September 20 from 16:00 to 17:40

Wednesday, September 22 from 17:00 to 18:00

Wednesday, September 22 from 20:00 to 23:00

Workshops & Tutorials Invited Talks

Best Paper Award Presentations Conference & Journal Track Applied Data Science Track Nectar Track

Discovery Challenges Community Meeting

Conference Banquet

Macedonian National Theatre Address 11th of March st., 1000 Skopje, Macedonia [ ul. 11ti Mart, 1000 Skopje, Makedonija ] 41.998533, 21.432317

Macedonian Opera & Ballet Address Dimitar Vlahov walk, 1000 Skopje, Macedonia [ Kej Dimitar Vlahov, 1000 Skopje, Makedonija ] 41.997486, 21.436974

Opening Ceremony

Monday, September 18 from 19:00 to 19:30



Invited Talk

Welcome Reception

Invited Talks

Tuesday, September 19 & Thursday, September 21 from 19:00 to 20:00



Demo Spotlights

Demo & Poster Sessions


5

SCHEDULE AT A GLANCE MONDAY REGISTRATION

08:00 - 09:00

09:00 - 10:40

10:40 - 11:00

11:00 - 12:40

TUESDAY

Tutorials Workshops

Combined Tutorials with Workshops

Tutorials Workshops

with

Workshops

15:40 - 16:00

16:00 - 17:40

EU

Projects Forum

Tutorials Workshops


EU

Projects Forum

Tutorials Workshops

with

Workshops

Invited Talk

Invited Talk

ALEX GRAVES

INDERJIT DHILLON

BEST STUDENT ML PAPER

TEST-OF-TIME AWARD

Conference Applied & Data Journal Science Track Track

COFFEE & TEA BREAK Conference Applied & Data Journal Science Track Track

LUNCH BREAK

EU

Projects Forum

COFFEE & TEA BREAK Combined Tutorials

REGISTRATION

COFFEE & TEA BREAK

LUNCH BREAK

12:40 - 14:00

14:00 - 15:40

REGISTRATION

COFFEE & TEA BREAK Combined Tutorials

WEDNESDAY


LUNCH BREAK

Nectar Track

Conference & Journal Track

COFFEE & TEA BREAK EU

Projects Forum

Conference & Discovery Journal Challenge Track

Applied Data Science Track

Nectar Track

Discovery Challenge

COFFEE & TEA BREAK Conference & Journal Track

COMMUNITY MEETING 19:00 - 19:30 19:30 - 20:00

OPENING CEREMONY Invited Talk

FRANK HUTTER 20:00 - 21:00

JOHN QUACKENBUSH WELCOME RECEPTION

21:00 - 23:00

6

Invited Talk


DEMO & POSTER SESSION 1

CONFERENCE BANQUET

COMBINED TUTORIALS WITH WORKSOPS MONDAY

THURSDAY

FRIDAY

REGISTRATION

BEST STUDENT KDD PAPER COFFEE & TEA BREAK Conference Applied & Data Journal Science Track Track

ICT

Innovations 2017

Combined Tutorials

IoT Large Scale Learning from Data Streams

with Workshops

PhD Forum



ICT Innovations 2017

LUNCH BREAK

Tutorials Workshops


Invited Talk

CORDELIA SCHMID DEMO & POSTER SESSION 2

COFFEE & TEA BREAK

Tutorials Workshops

Automatic Selection, Configuration, and Composition of Machine Learning Algorithms

PhD Forum

09:00 - 17:40

MLSA 2017

Machine Learning and Data Mining for Sports Analytics

09:00 - 17:40

PAP 2017

1st International Workshop of Personal Analytics and Privacy

09:00 - 17:40

DyNO 2017+ TD-LSG 2017 SoGood 2017 SURL 2017 MIDAS 2017

Joint Workshop on Large-Scale Evolving Networks & Graphs

09:00 - 17:40

NFMCP 2017

The 6th International Workshop on New Frontiers in Mining Complex Patterns

09:00 - 17:40

LIDTA 2017

Learning with Imbalanced Domains: Theory and Applications

09:00 - 17:40

Deep Learning for Precision Medicine

09:00 - 12:40

DLPM 2017 DMNLP 2017

09:00 - 12:40

DMSC 2017

Data Mining with Secure Computation

14:00 - 19:00

KnowMe

1st International Workshop on Knowledge Discovery from Mobility and Transportation Systems

14:00 - 17:40

DARE 2017

5th International Workshop on Data Analytics for Renewable Energy Integration

09:00 - 12:40 14:00 - 17:40


Data Science for Social Good Scaling-Up Reinforcement Learning 2nd Workshop on MIning DAta for financial applicationS

FRIDAY

with

Workshops

COFFEE & TEA BREAK Conference & Journal Track

Combined Tutorials

AutoML

MONDAY

09:00 - 12:40 LUNCH BREAK

Interactive Adaptive Learning

WORKSHOPS

COFFEE & TEA BREAK

Tutorials Workshops

09:00 - 17:40

09:00 - 19:00

Invited Talk Tutorials Workshops

IAL 2017

FRIDAY

REGISTRATION

PIERRE-PHILIPPE MATHIEU

09:00 - 17:40

4th Workshop on Interactions between Data Mining and Natural Language Processing

TUTORIALS MONDAY 14:00 - 17:40

Core Decomposition of Networks: Concepts, Algorithms, and Applications

FRIDAY 09:00 - 12:40

Deep Learning for Computer Vision Applications: Robotics and Driving

14:00 - 17:40

Machine learning with fossil data: Analyzing Environmental and Climate Change ECML PKDD 2017 SKOPJE MACEDONIA

7

GENERAL INFORMATION EACH PARTICIPANT WILL RECEIVE A NAME TAG AT REGISTRATION. PLEASE BRING IT WITH YOU AT ALL CONFERENCE EVENTS. Additional tickets for accompanying persons to attend social events can be purchased at the registration desk. COFFEE AND LUNCH BREAKS Food and drinks served in the Breaks Area are included in the registration fee. Wi-Fi Wi-Fi will be available at the Conference venue during the whole conference. Please connect to the wireless networks ECMLPKDD2017 or ECMLPKDD2017-5 and use the password “SK2017ECMLPKDD”. CONFERENCE APP We created an event for the conference on EventBase with all details about the program. You can browse it by downloading the EventBase app from the app store and searching for “ECML PKDD 2017”. QR CODES The QR codes throughout the brochure, contain links to the camera ready versions of the accepted papers. MOBILE PHONES Please make sure to switch off your mobile phones during the sessions. Note that roaming charges may apply in case of making phone calls, sending SMS and making data transfer. EMERGENCY CONTACTS Emergencies related to the conference should be directed to: +389 77 527 604 (Ivica Dimitrovski, Local Organization Chair), +389 78 749 804 (Dragi Kocev, Production & PR Chair), or +389 78 749 839 (Registration Desk). These three numbers are not toll free. The Emergency Call Center in Macedonia uses the number 112. TRANSPORT For the three events (on Monday, Tuesday and Thursday) held outside the Main Conference Venue, there will be organized bus transport. The buses will depart at 18.00 from the parking lot in front of Main Conference Venue. The buses will start their trip back to Main Conference Venue at 22.15, the last one departing at 22.45. The buses on their trip back will make a stop at Hotel Karpos and proceed to the Main Conference Venue. CURRENCY The official currency of Macedonia is Denar (MKD). It has a stable exchange rate with the Euro: 1 EUR = 61.5 MKD. International credit cards are accepted for payments in most hotels, restaurants and shops. ATM machines are available throughout the city. INSURANCE The organizers of ECML PKDD 2017 cannot be held liable for any injury, loss or damage occurring during the conference. ELECTRICITY Macedonia uses a 230 volt 50 Hz system with European Type F plugs.

8


INSTRUCTIONS FOR PRESENTERS AND CHAIRS FOR ALL GENERAL ENQUIRERS REGARDING PROJECTOR/COMPUTER/INTERNET ISSUES PLEASE CONTACT THE REGISTRATION DESK. INSTRUCTIONS FOR SPEAKERS »» Each of the conference rooms will have a projector and a laptop. The laptop will have Powerpoint and .pdf viewing software pre-installed. »» You should be in the conference room 10 minutes before your session and report to the Session Chair. In addition, you will be asked to copy your presentation to the laptop and check if everything works fine. All of the presentations will be deleted after the end of the session. »» A volunteer will be available in every main session room to provide assistance. »» The time allocated to each speaker, including time to set up and time for questions, is : • Conference, Journal, Applied Data Science and Nectar Tracks: 20 minutes. • Demo track elevator pitches: 4 minutes. • Workshops, Forums and Discovery Challenges: per workshop/forum/challenge instructions. INSTRUCTIONS FOR SESSION CHAIRS »» Please make sure to show up 10 minutes before the session starts to check the equipment works. »» Please stick to the schedule. If a speaker fails to show up, please announce a short break. »» Please moderate questions. »» Please do not start sessions or talks early as attendees may need some time to move between rooms. INSTRUCTIONS FOR POSTER PRESENTERS »» The two poster sessions are scheduled: • Tuesday, September 19, from 20:20 to 22:30 (for papers presented orally on Tuesday and Wednesday before the lunch break i.e.12:40) • Thursday, September 21, from 20:20 to 22:30 (for papers presented orally on Wednesday after the lunch break i.e. 12:40) »» Both poster sessions will take place in the Macedonian Opera and Ballet and will be catered events. »» The poster size should be A0 (841 x 1189 mm) in portrait orientation. »» Presenters should set up their on the day of the poster session, before the last keynote lecture for the day. The presenters should remove their posters after the end of the poster session. »» Volunteers will be available to assist you. INSTRUCTIONS FOR SOFTWARE DEMOS »» Demo sessions will be held together with the poster sessions in the Macedonian Opera and Ballet. »» The demos are divided into two sessions. The elevator pitches are scheduled immediately after the two afternoon invited talks. The two demo sessions are scheduled: • Tuesday, from 20:20 to 22:30 • Thursday, from 20:20 to 22:30 »» Demonstration desks will be set up next to the poster area. »» Presenters should show up in advance to set up their demos and remove them at the end of the session. »» Each demo will be provided with a desk, two chairs, a monitor and a (4 or 5 way) power socket. Internet connection will be available.


9

MAIN CONFERENCE VENUE FLOOR PLANS

10



11

12


KEYNOTE SPEAKERS ECML PKDD 2017 SKOPJE MACEDONIA

13

Frank Hutter University of Freiburg, Germany Monday, September 18 2017 @ 19:30 Macedonian National Theatre

TOWARDS END-TO-END LEARNING & OPTIMIZATION Deep learning has recently helped AI systems to achieve human-level performance in several domains, including speech recognition, object classification, and playing several types of games. The major benefit of deep learning is that it enables end-to-end learning of representations of the data on several levels of abstraction. However, the overall network architecture and the learning algorithms’ sensitive hyperparameters still need to be set manually by human experts. In this talk, I will discuss extensions of Bayesian optimization for handling this problem effectively, thereby paving the way to fully automated end-to-end learning. I will focus on speeding up Bayesian optimization by reasoning over data subsets and initial learning curves, sometimes resulting in 100-fold speedups in finding good hyperparameter settings. I will also show competition-winning practical systems for automated machine learning (AutoML) and briefly show related applications to the end-to-end optimization of algorithms for solving hard combinatorial problems.

Chair: Sašo Džeroski

About the speaker : Frank Hutter is an Emmy Noether Research Group Lead (eq. Asst. Prof.) at the Computer Science Department of the University of Freiburg (Germany). He received his PhD from the University of British Columbia (2009). Frank’s main research interests span artificial intelligence, machine learning, combinatorial optimization, and automated algorithm design. He received a doctoral dissertation award from the Canadian Artificial Intelligence Association and, with his coauthors, several best paper awards (including from JAIR and IJCAI) and prizes in international competitions on machine learning, SAT solving, and AI planning. In 2016 he received an ERC Starting Grant for a project on automating deep learning based on Bayesian optimization, Bayesian neural networks, and deep reinforcement learning.

Sponsored by

14


Alex Graves Google DeepMind, UK Tuesday, September 19 2017 @ 09:00 CONGRESS HALL 1, Aleksandar Palace

FRONTIERS IN RECURRENT NEURAL NETWORK RESEARCH In the last few years, recurrent neural networks (RNNs) have become the Swiss army knife of sequence processing for machine learning. Problems involving long and complex data streams, such as speech recognition, machine translation and reinforcement learning from raw video are now routinely tackled with RNNs. However, significant limitations still exist for such systems, such as their ability to retain large amounts of information in memory, and the challenges of gradient-based training on very long sequences. My talk will review some of the new architectures and training strategies currently being developed to extend the frontiers of this exciting field.

Chair: Jaakko Hollmén

About the speaker : Alex Graves completed a BSc in Theoretical Physics at the University of Edinburgh, Part III Maths at the University of Cambridge, a PhD in artificial intelligence at IDSIA with Jürgen Schmidhuber, followed by postdocs at the Technical University of Munich and with Geoff Hinton at the University of Toronto. He is now a research scientist at DeepMind. His contributions include the Connectionist Temporal Classification algorithm for sequence labelling (now widely used for commercial speech and handwriting recognition), stochastic gradient variational inference, and the Neural Turing Machine / Differentiable Neural Computer architectures.


15

John Quackenbush Dana-Farber Cancer Institute, USA Harvard TH Chan School of Public Health, USA Tuesday, September 19 2017 @ 19:00 Macedonian Opera & Ballet

USING NETWORKS TO LINK GENOTYPE TO PHENOTYPE We know that genotype influences phenotype, but aside from a few highly penetrant Mendelian disorders, the link between genotype and phenotype is not well understood. We have used gene expression and genetic data to explore gene regulatory networks, to study phenotypic state transitions, and to analyze the connections between genotype and phenotype. I will describe how networks and their structure provide unique insight into how small effect variants influence phenotype. Chair: Celine Vens

About the speaker : John Quackenbush received his PhD in theoretical particle physics from UCLA in 1990. Following a postdoctoral fellowship in experimental high-energy physics, he received an NIH research award to work on the Human Genome Project and helped map chromosome 11 and sequence chromosomes 21 and 4. After four years of working in genomics and computational biology at the Salk Institute and then at Stanford University, John joined The Institute for Genomic Research (TIGR), pioneering microarray expression technologies and analytical methods. In 2005, he joined the Dana-Farber Cancer Institute and the Harvard T.H. Chan School of Public Health, where he uses computational and systems biology methods to explore the complexities of human disease, including cancer. In 2011, he cofounded Genospace, a precision-medicine software company acquired by Hospital Corporation of America in 2017. John’s many awards include recognition as a White House Champion of Change for making genomic data useful and widely accessible.

16


Inderjit Dhillon University of Texas at Austin, USA Wednesday, September 20 2017 @ 09:00 CONGRESS HALL 1, Aleksandar Palace

MULTI-TARGET PREDICTION VIA LOW-RANK EMBEDDINGS Linear prediction methods, such as linear regression and classification, form the bread-and-butter of modern machine learning. The classical scenario is the presence of data with multiple features and a single target variable. However, there are many recent scenarios where there are multiple target variables. For example, recommender systems, predicting bid words for a web page (where each bid word acts as a target variable), or predicting diseases linked to a gene. In many of these scenarios, the target variables might themselves be associated with features. In these scenarios, bilinear and nonlinear prediction via low-rank embeddings have been shown to be extremely powerful. The low-rank embeddings serve a dual purpose: (i) they enable tractable computation even in the face of millions of data points as well as target variables, and (ii) they exploit correlations among the target variables, even when there are many missing observations. We illustrate our methodology on various modern machine learning problems: recommender systems, multi-label learning and inductive matrix completion, and present results on some standard benchmarks as well as an application that involves prediction of gene-disease associations.

Chair: Michelangelo Ceci

About the speaker : Inderjit Dhillon is the Gottesman Family Centennial Professor of Computer Science and Mathematics at UT Austin, where he is also the Director of the ICES Center for Big Data Analytics. Currently he is on leave from UT Austin and works as Amazon Fellow at A9/Amazon, where he is developing and deploying state-of-the-art machine learning methods for Amazon search. His main research interests are in big data, machine learning, network analysis, linear algebra and optimization. He received his B.Tech. degree from IIT Bombay, and Ph.D. from UC Berkeley. Inderjit has received several awards, including the ICES Distinguished Research Award, the SIAM Outstanding Paper Prize, the Moncrief Grand Challenge Award, the SIAM Linear Algebra Prize, the University Research Excellence Award, and the NSF Career Award. He has published over 175 journal and conference papers, and has served on the Editorial Board of the Journal of Machine Learning Research, the IEEE Transactions of Pattern Analysis and Machine Intelligence, Foundations and Trends in Machine Learning and the SIAM Journal for Matrix Analysis and Applications. Inderjit is an ACM Fellow, an IEEE Fellow, a SIAM Fellow and an AAAS Fellow.


17

Pierre-Philippe Mathieu ESA/ESRIN, EO Science, Applications and New Technologies, Italy Thursday, September 21 2017 @ 09:00 CONGRESS HALL 1, Aleksandar Palace

ENABLING A SMARTER PLANET WITH EARTH OBSERVATION Nowadays, teams of researchers around the world can easily access a wide range of open data across disciplines and remotely process them on the Cloud, combining them with their own data to generate knowledge, develop information products for societal applications, and tackle complex integrative complex problems that could not be addressed a few years ago. Such rapid exchange of digital data is fostering a new world of data-intensive research, characterized by openness, transparency, and scrutiny and traceability of results, access to large volume of complex data, availability of community open tools, unprecedented level of computing power, and new collaboration among researchers and new actors such as citizen scientists. The EO scientific community is now facing the challenge of responding to this new paradigm in science 2.0 in order to make the most of the large volume of complex and diverse data delivered by the new generation of EO missions, and in particular the Sentinels. In this context, ESA is supporting a variety of activities in partnership with research communities to ease the transition and make the most of the data. These include the generation of new open tools and exploitation platforms, exploring new ways to disseminate data, building new partnership with citizen scientists, and training the new generation of data scientists. The talk will give a brief overview of some of ESA activities aiming to facilitate the exploitation of large amounts of data from EO missions in a collaborative, cross-disciplinary, and open way, for uses ranging from science to applications and education.

Chair: Ljupčo Todorovski About the speaker : Pierre-Philippe Mathieu is an Earth Observation Data Scientist at the European Space Agency in ESRIN (Frascati, Italy). He has spent 20+ years working in the field of environmental and ocean modelling, weather risk management and remote sensing. He has a degree in mechanical engineering and an M.Sc from the University of Liege (Belgium), a Ph.D. in oceanography from the University of Louvain (Belgium), and a Management degree from the University of Reading Business School (UK).

Sponsored by

18


Cordelia Schmid INRIA, France Thursday, September 21 2017 @ 19:00 Macedonian Opera & Ballet

AUTOMATIC UNDERSTANDING OF THE VISUAL WORLD One of the central problems of artificial intelligence is machine perception, i.e., the ability to understand the visual world based on input from sensors, such as cameras. Computer vision is the area which analyzes visual input. In this talk, I will present recent progress in visual understanding. It is for the most part due to the design of robust visual representations and learned models capturing the variability of the visual world based on state-of-the-art machine learning techniques, including convolutional neural networks. Progress has resulted in technology for a variety of applications. I will present in particular results for human action recognition.

Chair: Ivica Dimitrovski

About the speaker : Cordelia Schmid holds an M.S. degree in Computer Science from the University of Karlsruhe and a Doctorate, also in Computer Science, from the Institut National Polytechnique de Grenoble (INPG). Her doctoral thesis received the best thesis award from INPG in 1996. Dr. Schmid was a postdoctoral research assistant in the Robotics Research Group of Oxford University in 1996--1997. Since 1997 she has held a permanent research position at INRIA Grenoble Rhone-Alpes, where she is a research director and directs an INRIA team. Dr. Schmid has been an Associate Editor for IEEE PAMI (2001--2005) and for IJCV (2004--2012), editor-in-chief for IJCV (2013---), a program chair of IEEE CVPR 2005 and ECCV 2012 as well as a general chair of IEEE CVPR 2015 and ECCV 2020. In 2006, 2014 and 2016, she was awarded the Longuet-Higgins prize for fundamental contributions in computer vision that have withstood the test of time. She is a fellow of IEEE. She was awarded an ERC advanced grant in 2013, the Humboldt research award in 2015 and the Inria & French Academy of Science Grand Prix in 2016. She was elected to the German National Academy of Sciences, Leopoldina, in 2017.


19

20


PROGRAM ECML PKDD 2017 SKOPJE MACEDONIA

MONDAY

SEPTEMBER 18

MONDAY 22


CONGRESS HALL

CONGRESS HALL

CONGRESS HALL

1

2

3

4

Tutorial

Tutorial

Workshop

Workshop

IAL 2017

IoT

MLSA 2017

SURL 2017


Large Scale Learning from Data Streams

Machine Learning & Data Mining for Sports Analytics

10:40 - 11:00

MEZZANINE HALL

2

3

4

Workshop

Workshop

Joint Workshop

SoGood 2017

Data Science for Social Good

PAP 2017 Evolving Personal Analytics & Large-Scale Networks & Graphs Privacy

IoT


Workshop

Workshop

MLSA 2017


EU Projects Forum

Workshop

SURL 2017

Scaling-Up Reinforcement Learning

SoGood 2017

Data Science for Social Good

Workshop

Joint Workshop


IAL 2017


Workshop IoT


Workshop MLSA 2017


EU Projects Forum

Tutorial

Workshop

Core Decomposition of Networks:

MIDAS 2017

MIning DAta for Concepts, Algorithms financial applicationS & Applications

Workshop

Joint Workshop

PAP 2017

Large-Scale Evolving

Workshop

Joint Workshop

Personal Analytics & Networks & Graphs Privacy

COFFEE & TEA BREAK

15:40 - 16:00

Workshop IAL 2017


Workshop IoT


Workshop MLSA 2017


EU Projects Forum

Tutorial

Workshop

Core Decomposition of Networks:

MIDAS 2017

MIning DAta for Concepts, Algorithms financial applicationS and Applications


19:00 - 19:30

OPENING CEREMONY

19:30 - 20:30

Invited Talk FRANK HUTTER TOWARDS END-TO-END LEARNING & OPTIMIZATION

20:30 - 23:00

WELCOME RECEPTION

FRIDAY

MACEDONIAN NATIONAL THEATRE

WEDNESDAY

LUNCH BREAK

Workshop

16:00 - 17:40

MEZZANINE HALL

TUESDAY

IAL 2017


Tutorial

12:40 - 14:00

14:00 - 15:40

Scaling-Up Reinforcement Learning

MEZZANINE HALL

COFFEE & TEA BREAK

Tutorial 11:00 - 12:40

EU Projects Forum

BANQUET HALL

THURSDAY

09:00 - 10:40

CONGRESS HALL

MONDAY

PROGRAM AT A GLANCE


23

MONDAY

SEPTEMBER 18

EU Projects Forum

CONGRESS HALL 4 09:00 - 17:40

EU Projects Forum Chairs :

Petra Kralj Novak, Jožef Stefan Institute, Slovenia Nada Lavrač, Jožef Stefan Institute, Slovenia

The EU Projects Forum at ECML PKDD 2017 is a novel initiative that encourages the dissemination of EU projects and their results to the targeted scientific audience of the conference participants. It aims at becoming an annual venue for the EU funded projects to present their vision and work to the conference audience, and an opportunity for the ECML PKDD audience to learn about the European scientific success stories in their research field. The inaugural EU Projects Forum (scheduled as a full-day satellite event of ECML PKDD) encompasses a talk by the EU project officer Salvatore Spinello on the main features and evaluation process of FET Open, a tutorial on how to write successful project proposals by Richard Wheeler, as well as the presentations of several ERC grants and EU projects.

Collection of abstracts: http://ecmlpkdd2017.ijs.si/EUPFabstracts.pdf Invited talk

FET OPEN: MAIN FEATURES AND EVALUATION PROCESS Salvatore Spinello Research Programme Officer at REA, Belgium FET Open is one of the most attractive research programme under Horizon 2020. FET-Open supports the early-stage, highrisk research around new ideas towards radically new future technologies. It explores an open range of new and disruptive technological possibilities in all areas of Science & Technology, inspired by cutting edge science, unconventional collaborations and pioneering new ways to create the optimum conditions for serendipity to occur. In these first 3 years of Horizon 2020, a total of 2.648 proposals were submitted to the FET-Open programme and covered a wide range of disciplines: from Physics to Life Sciences, from Information Sciences and Engineering to Chemistry. Most proposals show indeed high degree of interdisciplinarity. During my presentation I will focus on Research and Innovation-Actions (RIA) and the so-called “gatekeepers” that every excellent proposal should address. I will then present the evaluation process that allows the selection of the best proposals, resulting in a continuously growing portfolio of high quality interdisciplinary projects. I will conclude providing some statistics in terms of country and organization participations, scientific fields covered and interdisciplinarity.

Invited talk

HINTS ON HOW TO WRITE A SUCCESSFUL PROJECT PROPOSAL Richard Wheeler Edinburgh Scientific, UK EU Funding has never been more important, nor harder to get. In this very practical talk, Richard Wheeler will provide an insider’s view to securing European Union funding, including tips and tricks on good proposal writing, what really happens in EU review meetings, why most proposals fail, good project management methods, and more. Attendees will have the opportunity to ask questions and discuss their ideas in an informal workshop environment.

24


EU Projects Forum Program at a glance 09:00 - 09:10

Opening Nada Lavrač, Petra Kralj Novak

ERC GRANT

10:30 - 11:00 11:00 - 11:20 11:20 - 11:40 11:40 - 12:00 12:00 - 12:20

FET Open: Main Features and Evaluation Process Salvatore Spinello

MONDAY

10:00 - 10:30

INVITED TALK

FORSIED: Formalizing Subjective Interestingness in Exploratory Data Mining Tijl De Bie, Jefrey Lijffijt

Coffee & Tea Break H2020 PROJECT

Fosca Giannotti

SoBigData Research Infrastructure

H2020 PROJECT

Hussain Kazmi

H2020 PROJECT

Martin Žnidaršič

H2020 PROJECT

AURORA: Advanced User-Centric Efficiency Metrics for Air Traffic Performance Analytics

REnnovates: Towards Net-Zero Energy Communities and Beyond

TUESDAY

09:10 - 10:00

CF-Web: ClowdFlows Data and Text Analytics Marketplace on the Web

12:20 - 12:40

FP7 FET OPEN

WEDNESDAY

Brian Mac Namee

MAESTRA: Learning from Massive, Incompletely Annotated, and Structured Data Sašo Džeroski

12:40 - 14:00 14:00 - 14:30

Lunch Break ERC GRANT

CAUSALPATH: Next Generation Causal Analysis Inspired By the Induction of Biological Pathways from Cytometry Data

14:30 - 15:40

INVITED TALK

THURSDAY

Vincenzo Lagani, Ioannis Tsamardinos

Hints on How to Write a Successful Project Proposals Part 1 Richard Wheeler

15:40 - 16:00 16:00 - 16:30

Coffee & Tea Break INVITED TALK

Hints on How to Write a Successful Project Proposals Part 2

17:00 - 17:30

Late Breaking Project Presentations

17:30 - 17:40

Closing Remarks

FRIDAY

Richard Wheeler


25

MONDAY

SEPTEMBER 18

Combined Tutorial with Workshop


IAL 2017 Interactive Adaptive Learning Organizers :

Georg Krempl, University Magdeburg, Germany Vincent Lemaire, Orange Labs, France Robi Polikar, Rowan University, USA Bernhard Sick, University of Kassel, Germany Daniel Kottke, University of Kassel, Germany Adrian Calma, University of Kassel, Germany

Webpage : http://www.uni-kassel.de/go/ial2017 This workshop on interactive adaptive learning aims at discussing techniques and approaches for optimizing the whole learning process, including the interaction with human supervisors, processing systems, and includes adaptive, active, semi-supervised, and transfer learning techniques, and combinations thereof in interactive and adaptive machine learning systems. Our objective is to bridge the communities researching and developing these techniques and systems in machine learning and data mining. Therefore we welcome contributions that present a novel problem setting, propose a novel approach, or report experience with the practical deployment of such a system and raise unsolved questions to the research community.

Tutorial

Introduction to Stream Mining This part starts with the classic stream mining paradigm. In its context, we discuss the challenges posed by non-stationarity and limitations in processing, storage, and supervision capacities. We briefly summarize related techniques, e.g. for incremental processing, forgetting, and change detection. This part concludes by an overview on further challenges that are investigated in the state-of-the-art research. Active Learning In this part of the tutorial, we focus on techniques for optimizing the interaction of a machine learning system with an oracle such as a human supervisor. We review active machine learning techniques, with focus on adaptive active learning for evolving and streaming data. We discuss recent advances and conclude with an overview on open research questions in adaptive active machine learning. Semi-Supervised and Transfer Learning This part of the tutorial addresses the problem of learning with incomplete or delayed supervision. We focus on the problem of learning with verification latency, and review techniques from change mining, semi-supervised and (unsupervised) transfer learning in non-stationary environments. We conclude with an overview on open challenges. Evaluation, Applications and Emerging Trends This last part of the tutorial takes an integrative view on the previous parts, with focus on industrial applications and open challenges of adaptive interactive mining systems as a whole. We briefly discuss the related issues of evaluation and deployment, applications, reported challenges and solutions, and highlight potential directions for future research.

Invited talk

ENSEMBLE LEARNING FROM DATA STREAMS WITH ACTIVE AND SEMI-SUPERVISED APPROACHES

Bartosz Krawczyk

Virginia Commonwealth University, USA

26


IAL 2017 Program at a glance 09:00 - 09:10

Welcome

09:10 - 09:50

Introduction to Stream Mining Georg Krempl

Active Learning

MONDAY

09:50 - 10:40

Daniel Kottke

10:40 - 11:00 11:00 - 11:40

Coffee & Tea Break Semi-Supervised and Transfer Learning Georg Krempl

Evaluation, Applications and Emerging Trends

TUESDAY

11:40 - 12:30

Vincent Lemaire

12:30 - 12:40

Spotlights on Poster Session Probabilistic Expert Knowledge Elicitation of Feature Relevances in Sparse Linear Regression Pedram Daee, Tomi Peltola, Marta Soare, Samuel Kaski

WEDNESDAY

Users Behavioural Inference with Markovian Decision Process and Active Learning Firas Jarboui, Vincent Rocchisani, Wilfried Kirchenmann

Multi-Arm Active Transfer Learning for Telugu Sentiment Analysis

12:40 - 14:00

Lunch Break + Poster Session

14:00 - 14:20

Probabilistic Active Learning with Structure-Sensitive Kernels

THURSDAY

Subba Reddy Oota, Vijayasaradhi Indurthi, Mounika Reddy Marreddy, Sandeep Sricharan Mukku, Radhika Mamidi

Dominik Lang, Daniel Kottke, Georg Krempl, Bernhard Sick

14:20 - 14:40

Transfer Learning for Time Series Anomaly Detection Vincent Vercruyssen, Wannes Meert, Jesse Davis

14:40 - 15:40

Ensemble Learning from Data Streams with Active & Semi-Supervised Approaches

15:40 - 16:00

Coffee & Tea Break + Poster Session

16:00 - 16:20

Simulation of Annotators for Active Learning: Uncertain Oracles

FRIDAY

Bartosz Krawczyk

Adrian Calma and Bernhard Sick

16:20 - 16:40

Interactive Anonymization for Privacy-Aware Machine Learning Bernd Malle, Peter Kieseberg, Andreas Holzinger

16:40 - 17:40

Panel Discussion George Kachergis, Bartosz Krawczyk, Myra Spiliopoulou, Jerzy Stefanowski


27

MONDAY

SEPTEMBER 18



IoT Large Scale Learning from Data Streams Organizers :

Moamar Sayed-Mouchaweh, High Engineering School of Mines, Douai, France Albert Bifet, Telecom-ParisTech, France Hamid Bouchachia, University of Bournemouth, UK João Gama, University of Porto, Portugal Rita Ribeiro, University of Porto, Portugal

Webpage https://abifet.wixsite.com/iotstreaming2017 The volume of data is rapidly increasing due to the development of the technology of information and communication. This data comes mostly in the form of streams. Learning from this ever-growing amount of data requires flexible learning models that self-adapt over time. In addition, these models must take into account many constraints: (pseudo) real-time processing, high-velocity, and dynamic multi-form change such as concept drift and novelty. This workshop welcomes novel research about learning from data streams in evolving environments. It will provide the researchers and participants with a forum for exchanging ideas, presenting recent advances and discussing challenges related to data streams processing. It solicits original work, already completed or in progress. Position papers are also considered. This workshop is combined with a tutorial treating the same topic and will be presented in the same day.

Tutorial Gianmarco De Francisci Morales, Albert Bifet, Latifur Khan, Moamar Sayed-Mouchaweh, Joao Gama, Wei Fan The challenge of deriving insights from the Internet of Things (IoT) has been recognized as one of the most exciting and key opportunities for both academia and industry. Advanced analysis of big data streams from sensors and devices is bound to become a key area of data mining research as the number of applications requiring such processing increases. Dealing with the evolution over time of such data streams, i.e., with concepts that drift or change completely, is one of the core issues in IoT stream mining. This tutorial is a gentle introduction to mining IoT big data streams. The first part introduces data stream learners for classification, regression, clustering, and frequent pattern mining. The second part deals with scalability issues inherent in IoT applications, and discusses how to mine data streams on distributed engines such as Spark, Flink, Storm, and Samza

Invited talk

LEARNING FROM NON-STATIONARY DISTRIBUTIONS Geoff Webb Monash University, Australia

28


IoT Large Scale Learning from Data Streams Program at a glance

09:00 - 10:40

10:40 - 11:00 11:00 - 12:40

12:40 - 14:00

Welcome Tutorial 1: IoT Fundamentals and Stream Mining Algorithms Gianmarco De Francisci Morales, Albert Bifet, Latifur Khan, Moamar Sayed-Mouchaweh, Joao Gama, Wei Fan

MONDAY

09:00

Coffee & Tea Break Tutorial: 2. IoT Distributed Big Data Stream Mining and Applications Gianmarco De Francisci Morales, Albert Bifet, Latifur Khan, Moamar Sayed-Mouchaweh, Joao Gama, Wei Fan

Lunch Break

14:00 - 14:45

TUESDAY

SESSION 1 - Chair : Albert Bifet Learning from Non-Stationary Distributions Geoff Webb 15:00 - 15:15

A Sliding Window Filter for Time Series Streams Gordon Lesti and Stephan Spiegel

Evolutive Deep Models for Online Learning on Data Streams with No Storage

WEDNESDAY

15:20 - 15:35

Andrey Besedin, Pierre Blanchart, Michel Crucianu, Marin Ferecatu

15:40 - 16:00

Coffee & Tea Break SESSION 2 - Chair : Moamar Sayed-Mouchaweh

16:00 - 16:15

Hybrid Self Adaptive Learning Scheme for Simple and Multiple Drift-Like Fault Diagnosis in Wind Turbine Pitch Sensors

16:20 - 16:35

THURSDAY

Houari Toubakh, Moamar Sayed-Mouchaweh

Aggregation Algorithm Vs. Average For Time Series Prediction Waqas Jamil, Yuri Kaliniskan, Abdelhamid Bouchachia

16:40 - 16:55

Self-Adaptive Ensemble Classifier for Handling Complex Concept Drift Imen Khamassi, Moamar Sayed-Mouchaweh

Summary Extraction on Data Streams in Embedded Systems Sebastian Buschjäger, Katharina Morik

FRIDAY

17:00 - 17:15


29

MONDAY

SEPTEMBER 18

CONGRESS HALL 3

Workshop

09:30 - 17:40

MLSA 2017 Machine Learning and Data Mining for Sports Analytics Organizers :

Jesse Davis, KU Leuven, Belgium Mehdi Kaytoue, INSA de Lyon, France Albrecht Zimmermann, Université Caen, France

Webpage https://dtai.cs.kuleuven.be/events/MLSA17/ Sports Analytics has been a steadily growing and rapidly evolving area over the last decade, both in US professional sports leagues and in European football leagues. The majority of techniques used in the field so far are statistical. However, there has been growing interest in the Machine Learning and Data Mining community about this topic as this setting is interesting, challenging and offers new sources of data. The workshop concerns all aspects of applying machine learning and data mining techniques for sports problems such as match strategy, tactics, and analysis; player acquisition, player valuation, and team spending; injury prediction and prevention; match outcome and league table prediction; and tournament design and scheduling among others.

Invited talk

DECISION-MAKING, ANTICIPATION AND GAZE CONTROL IN SPORTS Matt Dicks University of Portsmouth, UK

Invited talk

SCISPORTS: ENABLING DATA-DRIVEN DECISION MAKING IN PROFESSIONAL FOOTBALL

Jan Van Haaren

KU Leuven, Belgium

30


MLSA 2017 Program at a glance 09:30 - 09:40

Opening Remarks

09:40 - 10:40

Decision-Making, Anticipation and Gaze Control in Sports

10:40 - 11:00

Coffee & Tea Break

11:00 - 11:25

Predicting the Potential of Professional Soccer Players

MONDAY

Matt Dicks

Ruben Vroonen, Tom Decroos, Jan Van Haaren, Jesse Davis

11:25 - 11:50

STARSS: A Spatio-Temporal Action Rating System for Soccer Tom Decroos, Jan Van Haaren, Vladimir Dzyuba, Jesse Davis

Who Is Going to Get Hurt? Predicting Injuries in Professional Soccer

TUESDAY

11:50 - 12:15

Alessio Rossi, Luca Pappalardo, Paolo Cintia, Javier Fernandez, F. Marcello Iaia, Daniel Medina

12:15 - 12:40

Enabling Training Personalization By Predicting the Session Rate of Perceived Exertion

12:40 - 14:10

Lunch Break

14:10 - 15:10

SciSports: Enabling Data-Driven Decision Making in Professional Football

WEDNESDAY

Gilles Vandewiele, Youri Geurkink, Maarten Lievens, Femke Ongenae, Filip De Turck, Jan Boone

Jan Van Haaren 15:10 - 15:35

An Artificial Neural Network-Based Prediction Model for Underdog Teams in NBA Matches

15:35 - 16:00

Coffee & Tea Break

16:00 - 16:25

Dynamic Winner Prediction in Twenty20 Cricket: Based on Relative Team Strengths

THURSDAY

Paolo Giuliodori

Sasank Viswanadha, Kaustubh Sivalenka, Madan Gopal Jhawar, Vikram Pudi

16:25 - 16:50

Linking Event Mentions From Cricket Match Reports to Commentaries

16:50 - 17:15

FRIDAY

Manish Gupta

Honest Mirror: Quantitative Assessment of Player Performance in an ODI Cricket Match Madan Gopal Jhanwar, Vikram Pudi

17:15 - 17:40

Wrap-Up


31

MONDAY

SEPTEMBER 18

BANQUET HALL

Workshop

08:55 - 12:45

SURL 2017 Scaling-Up Reinforcement Learning Organizers :

Felipe Leno da Silva, University of São Paulo, Brazil Ruben Glatt, University of São Paulo, Brazil

Webpage http://surl.tirl.info/ Reinforcement Learning (RL) has achieved many successes over the years in training autonomous agents to perform simple tasks. However, one of the major remaining challenges in RL is scaling it to high-dimensional, real-world applications. Although many works have already focused on strategies to scale-up RL techniques and to find solutions for more complex problems with reasonable successes, many issues still exist. This workshop encourages to discuss diverse approaches to accelerate and generalize RL, such as the use of approximations, abstractions, hierarchical approaches, and Transfer Learning. Scaling-up RL methods has major implications on the research and practice of complex learning problems and will eventually lead to successful implementations in real-world applications. This workshop intends to bridge the gap between conventional and scalable RL approaches. We aim to bring together researchers working on different approaches to scale-up RL with the goal to solve more complex or larger scale problems. We intend to make this an exciting event for researchers worldwide, not only for the presentation of top quality papers, but also to spark the discussion of opportunities and challenges for future research directions.

Invited talk

SCALING UP RL WITH OFFLINE TASK HIERARCHIES

Devin Schwab

Carnegie Mellon University, USA

Invited talk

SCALING UP POLICY SEARCH METHODS FOR ROBOTICS Herke van Hoof McGill University, Canada

32


SURL 2017 Program at a glance 08:55 - 09:00

Opening SESSION 1

09:00 - 09:20

Case-Based Policy Inference for Transfer in Reinforcement Learning

09:20 - 09:40

MONDAY

R. Glatt, F. L. Silva, A. H. R. Costa

Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning J. Foerster, N. Nardelli, G. Farquhar, T. Afouras, P. Torr, P. Kohli, S. Whiteson

09:40 - 10:00

Constrained Bayesian Reinforcement Learning Via Approximate Linear Programming

10:00 - 10:40

TUESDAY

J. Lee, Y. Jang, P. Poupart, K. Kim

Scaling Up RL with Offline Task Hierarchies Devin Schwab

10:40 - 11:00

Coffee & Tea Break SESSION 2 Automatic Object-Oriented Curriculum Generation for Reinforcement Learning

WEDNESDAY

11:00 - 11:20

F. L. Silva and A. H. R. Costa

11:20 - 11:40

Count-Based Exploration in Feature Space for Reinforcement Learning J. Martin, S. Narayanan S., T. Everitt , M. Hutter

11:40 - 12:00

Learning Multimodal Transition Dynamics for Model-Based Reinforcement Learning

12:00 - 12:40

THURSDAY

T. M. Moerland, J. Broekens, C. M. Jonker

Scaling Up Policy Search Methods for Robotics Herke Van Hoof

Closing / Community Meeting / Integration

FRIDAY

12:40 - 12:45


33

MONDAY

SEPTEMBER 18

MEZZANINE HALL 2

Workshop

09:00 - 12:45

SoGood 2017 Data Science for Social Good Organizers :

Ricard Gavaldà, UPC BarcelonaTech, Spain Irena Koprinska, University of Sidney, Australia Stefan Kramer, JGU Mainz, Germany

Webpage https://sites.google.com/site/ecmlpkddsogood2017/ This workshops aims to attract papers presenting applications of Data Science to Social Good, or else that take into account social aspects of Data Science methods and techniques. Application domains should be as varied as possible. The novelty of the application and its social impact will be major selection criteria.

Invited talk

USING SOCIAL MEDIA TO UNDERSTAND COLLECTIVE AND PERSONAL EVENTS: CHALLENGES AND APPLICATIONS

Alexandra Olteanu

IBM T.J. Watson Research Center, USA

SoGood 2017 Program at a glance

09:00 - 09:10

Welcome SESSION 1

09:10 - 09:30

Some Like It Hoax: Automated Fake News Detection in Social Networks Eugenio Tacchini, Gabriele Ballarin, Marco Della Vedova, Stefano Moret, Luca De Alfaro

09:30 - 10:30

Using Social Media to Understand Collective and Personal Events: Challenges and Applications Alexandra Olteanu

10:30 - 10:40

Discussion

11:40 - 12:00

Coffee & Tea Break SESSION 2

11:00 - 11:20

Learning from Administrative Health Registries Jonathan Rebane, Isak Karlsson, Lars Asker, Henrik Boström, Panagiotis Papapetrou

11:20 - 11:40

Online Topic Modeling: Keeping Track of News Topics for Social Good Zahra Ahmadi, Sophie Burkhardt, Stefan Kramer

11:40 - 12:00

A Bayesian Reputation Framework for Citizen Science Joan Garriga, Jaume Piera, Frederic Bartumeus

12:00 - 12:40

34


Panel & Discussion

MEZZANINE HALL 2

Workshop

14:00 - 17:40

MIDAS 2017 MIning DAta for financial applicationS Ilaria Bordino, UniCredit, R&D Department, Italy Ilaria Bordino, IMT School for Advanced Studies Lucca, Italy Fabio Fumarola, UniCredit, R&D Department, Italy Francesco Gullo, UniCredit, R&D Department, Italy Tiziano Squartini, IMT School for Advanced Studies Lucca, Italy

MONDAY

Organizers :

Webpage

WEDNESDAY

Like the famous King Midas, popularly remembered in Greek mythology for his ability to turn everything he touched with his hand into gold, we believe that the wealth of data generated by modern technologies, with widespread presence of computers, users and media connected by Internet, is a goldmine for tackling a variety of problems in the financial domain. The MIDAS workshop is aimed at discussing challenges, potentialities, and applications of leveraging data-mining tasks to tackle problems in the financial domain. The workshop provides a premier forum for sharing findings, knowledge, insights, experience and lessons learned from mining data generated in various application domains.

Invited talk

EVOLVING DATA, EVOLVING MODELS IN ECONOMY AND FINANCE

Joao Gama

University of Porto, Portugal

Opening

THURSDAY

14:00 - 14:10

SESSION 1 14:10 - 15:10

Evolving Data, Evolving Models in Economy and Finance Joao Gama

15:10 - 15:35

Influence Analysis in Business Social Media Flora Amato, Vincenzo Moscato, Antonio Picariello, Giovanni Ponti, Giancarlo Sperlì

15:40 - 16:00

Coffee & Tea Break

FRIDAY

MIDAS 2017 Program at a glance

TUESDAY

http://networks.imtlucca.it/conferences/midas2017

SESSION 2 16:00 - 16:25

Event Recognition Strategies Applied in the Mercurio Project Davide Azzalini, Fabio Azzalini, Davide Greco, Mirjana Mazuran, Letizia Tanca

16:25 - 16:50

BoostEMM - Transparent Boosting Using Exceptional Model Mining Simon Van Der Zon, Oren Zeev Ben Mordehai, Tom Vrijdag, Werner Van Ipenburg, Jan Veldsink, Wouter Duivesteijn, Mykola Pechenizkiy

16:50 - 17:15

Profiling Smart Contracts Interactions Tensor Decomposition and Graph Mining Jérémy Charlier, Sofiane Lagraa, Radu State, Jérôme François

17:15 - 17:40

Machine Learning for Multi-Step Ahead Forecasting of Volatility Proxies Jacopo De Stefani, Olivier Caelen, Dalila Hattab and Gianluca Bontempi

17:40

Closing ECML PKDD 2017 SKOPJE MACEDONIA

35

MONDAY

SEPTEMBER 18

MEZZANINE HALL 3

Workshop

09:00 - 17:40

PAP 2017 Personal Analytics & Privacy Organizers :

Serge Abiteboul, Inria, ENS Paris, France Riccardo Guidotti, KDDLab, ISTI-CNR Pisa, Italy Anna Monreale, University of Pisa, Italy Dino Pedreschi, University of Pisa, Italy

Webpage http://kdd.isti.cnr.it/pap2017 Personal data analytics and individual privacy protection are the key elements to leverage nowadays services to a new type of systems. The availability of personal analytics tools able to extract hidden knowledge from individual data while protecting the privacy right can help the society to move from organization-centric systems to user-centric systems, where the user is the owner of her personal data and is able to manage, understand, exploit, control and share her own data and the knowledge deliverable from them in a completely safe way. The purpose of PAP is to encourage principled research that will lead to the advancement of personal data analytics, personal services development, privacy, data protection and privacy risk assessment. The workshop will seek top-quality submissions addressing important issues related to personal analytics, personal data mining and privacy in the context where real individual data (spatio-temporal data, call details records, tweets, mobility data, transactional data, social networking data, etc.) are used for developing a data-driven service, for realizing a social study aimed at understanding nowadays society, and for publication purposes.

Invited talk

THE RISE OF DECENTRALIZED PERSONAL DATA MARKETS

Bruno Lepri

Fondazione Bruno Kessler, Italy

Invited talk

PERSONAL KNOWLEDGE MANAGEMENT SYSTEMS Serge Abiteboul Inria and ENS, France

36


PAP 2017 Program at a glance 09:00 - 09:15

Welcome & Workshop Overview

09:15 - 10:00

The Rise of Decentralized Personal Data Markets Bruno Lepri

Your Privacy, My Privacy? On Leakage Risk Assessment in Online Social Networks

MONDAY

10:00 - 10:30

Ruggero G. Pensa, Livio Bioglio

10:30 - 11:00

Coffee & Tea Break

11:00 - 11:30

Assessing Privacy Risk in Retail Data Roberto Pellungrini, Francesca Pratesi, Luca Pappalardo

Differential Privacy and Neural Networks

TUESDAY

11:30 - 12:00

Giuseppe Manco, Giuseppe Pirrò

12:00 - 12:30

Co-Clustering for Differentially Private Synthetic Data Generation Tarek Benkhelif, Françoise Fessant, Fabrice Clérot, Guillaume Raschia

14:00 - 14:40

Lunch Break Personal Knowledge Management Systems

WEDNESDAY

12:30 - 14:00

Serge Abiteboul

14:40 - 15:10

From Self-Data to Self-Preferences: Towards Preference Elicitation in Personal Information Management Systems Tristan Allard, Tassadit Bouadi, Joris Duguépéroux, Virginie Sans

Evaluating the Impact of Friends in Predicting User'S Availability in Online Social Networks

THURSDAY

15:10 - 15:40

Andrea De Salve, Paolo Mori, Laura Ricci

15:40 - 16:00 16:00 - 16:30

Coffee & Tea Break Movement Behaviour Recognition for Water Activities Mirco Nanni, Roberto Trasarti

16:30 - 17:00

Guess the Movie - Linking Facebook Pages to IMDb Movies

17:00 - 17:30

FRIDAY

Paolo Fornacciari, Barbara Guidi, Monica Mordonini, Jacopo Orlandini, Laura Sani, Michele Tomaiuolo

Research on Online Digital Cultures - Community Extraction and Analysis By Markov and K-Means Clustering Tobias Blanke, Giles Greenway, Mark Coté, Jennifer Pybus

17:30 - 17:40

Conclusive Remarks


37

MONDAY

SEPTEMBER 18

Joint Workshop

MEZZANINE HALL 4 09:00 - 17:00

DyNO 2017 + TD-LSG 2107 Large-Scale Evolving Networks & Graphs Organizers :

Giulio Rossetti, ISTI-CNR Pisa, Italy Rémy Cazabet, LIP6, UPMC/CNRS, Sorbonne Universités, France Letizia Milli, University of Pisa, Italy Sabeur Aridhi, University of Lorraine, France José Fernandes de Macedo, Universidade Federale do Ceara, Brazil Engelbert Mephu Nguifo, LIMOS, Blaise Pascal University, France Karine Zeitouni, DAVID, Université de Versailles Saint-Quentin, France

Webpage http://kdd.isti.cnr.it/dyno/ & http://tdlsg-ecmlpkdd17.isima.fr/ The aim of this joint workshop on Large-scale Evolving Networks and Graphs is to bring together active scholars and practitioners of dynamic graphs, complex and temporal networks. In this workshop, we aim to discuss the problem of mining large-scale time-dependent graphs, since there are many real world applications deal with a large volumes of spatio-temporal data (e.g. moving objects’ trajectories). Managing and analyzing large-scale time-dependent graphs is very challenging since this requires sophisticated methods and techniques for creating, storing, accessing and processing such graphs in a distributed environment, because centralized approaches do not scale in a Big Data scenario. Contributions will clearly point out answers to one of these challenges focusing on large-scale graphs. In the last years, we also witnessed a shift from static network analysis to dynamic ones, i.e., the study of networks whose structure changes over time. As time goes by, all the perturbations which occur in the network topology due to the rise and fall of nodes and edges have repercussions on the network phenomena we are used to observing. Nowadays, one of the most fascinating challenges is to analyze the structural dynamics of real world networks and how they impact on the processes which occur on them, i.e. the spreading of social influence and diffusion of innovations. Results in this field will enable a better understanding of important aspects of human behaviors as well as to a more detailed characterization of the complex interconnected society we inhabit.The purpose of this workshop is also to encourage research that will lead to the advancement of the social science in time-evolving networks.

Invited talk

DEALING WITH BETWEENNESS IN EVOLVING GRAPHS AND IMPOSED SYSTEM WORKLOAD IMBALANCE

Nicolas Kourtellis Telefonica I+D, Spain

Invited talk

LEARNING FROM HIDDEN TIME-DEPENDENT GRAPHS

Jan Ramon

INRIA, France

38


09:00

Welcome

Large-Scale Evolving Networks & Graphs Program at a glance

MORNING SESSION 1 - SOCIAL NETWORKS 09:10 - 09:45

Dynamic Community Detection : State of the Art and First Empirical Comparisons Cazabet Rémy

09:45 - 10:10

Churn Prediction Using Dynamic RFM-Augmented Node2vec

10:10 - 10:25

MONDAY

Sandra Mitrovic, Bart Baesens, Wilfried Lemahieu, Jochen De Weerdt

Influence Maximization-Based Event Organization on Social Networks Cheng-Te Li

10:30 - 11:00

Coffee & Tea Break SESSION 2 - EVOLVING (SOCIAL) GRAPHS

11:00 - 11:45

Dealing with Betweenness in Evolving Graphs and Imposed System Workload Imbalance

11:45 - 12:10

TUESDAY

Nicolas Kourtellis

Multi-Scale Community Detection in Temporal Networks Using Spectral Graph Wavelets Zhana Kuncheva, Giovanni Montana

12:10 - 12:35

Finding Simple Temporal Cycles in an Interaction Network Rohit Kumar, Toon Calders

Lunch Break

WEDNESDAY

12:40 - 14:00

AFTERNOON SESSION 1 -TRANSPORTATION NETWORKS 14:00 - 14:45

Learning from Hidden Time-Dependent Graphs Jan Ramon

14:45 - 15:00

Discovering Connections in Heterogeneous Transportation Datasets with Link++

15:00 - 15:15

THURSDAY

Ali Masri, Karine Zeitouni, Zoubida Kedad

Finding the Nearest Service Provider on Time-Dependent Road Networks Lívia Almada Cruz, Francesco Lettich, Leopoldo Soares Júnior, Regis Pires Magalhães, José Antônio, Fernandes De Macedo

15:15 - 15:30

Design and Implementation Issues of a Time-Dependent Shortest Path Algorithm for Multimodal Transportation Network Abdelfattah Idri, Mariyem Oukarfi, Azedine Boulmakoul and Karine Zeitouni

15:40 - 16:00

Coffee & Tea Break

16:00 - 16:25

FRIDAY

AFTERNOON SESSION 2 - PERFORMANCE ISSUES OF LARGE-SCALE EVOLVING GRAPHS Synthetic Graph Generation from Finely-Tuned Temporal Constraints Karim Alami, Radu Ciucanu and Engelbert Mephu Nguifo

16:25 - 16:40

A Distributed Framework for Large-Scale Time-Dependent Graph Analysis Wissem Inoubli, Livia Almada, Ticiana Linhares, Gustavo Coutinho, Lucas Peres, Regis Pires Magalhães, Jose Antonio F. De Macedo, Sabeur Aridhi, Engelbert Mephu Nguifo

16:40 - 17:00

Panel Discussion


39

MONDAY

SEPTEMBER 18

BANQUET HALL

Tutorial

14:00 - 17:40

Core Decomposition of Networks: Concepts, Algorithms and Applications Organizers :

Fragkiskos D. Malliaros, UC San Diego, USA Apostolos N. Papadopoulos , Aristotle University of Thessaloniki, Greece Michalis Vazirgiannis , École Polytechnique, France

Webpage http://fragkiskos.me/projects/core_tutorial/ Graph mining is an important research area with a plethora of practical applications. Core decomposition of networks is a fundamental operation strongly related to more complex mining tasks such as community detection, dense subgraph discovery, identification of influential nodes, network visualization, text mining, just to name a few. In this tutorial, we will present in detail the concept and properties of core decomposition in graphs, the associated algorithms for its efficient computation and important cross-disciplinary applications that benefit from it.

40

14:00 - 15:40

PART 1 : Definitions, Fundamental Concepts and Basic Algorithms

15:40 - 16:00

Coffee & Tea Break

16:00 - 17:40

PART 2 : Algorithms (continued), Applications, Open Problems


Macedonian National Thetre

Invited Talk

20:00 - 21:00

University of Freiburg, Germany Deep learning has recently helped AI systems to achieve human-level performance in several domains, including speech recognition, object classification, and playing several types of games. The major benefit of deep learning is that it enables end-to-end learning of representations of the data on several levels of abstraction. However, the overall network architecture and the learning algorithms’ sensitive hyperparameters still need to be set manually by human experts. In this talk, I will discuss extensions of Bayesian optimization for handling this problem effectively, thereby paving the way to fully automated end-to-end learning. I will focus on speeding up Bayesian optimization by reasoning over data subsets and initial learning curves, sometimes resulting in 100-fold speedups in finding good hyperparameter settings. I will also show competition-winning practical systems for automated machine learning (AutoML) and briefly show related applications to the end-to-end optimization of algorithms for solving hard combinatorial problems.

TUESDAY

Frank Hutter

MONDAY

TOWARDS END-TO-END LEARNING & OPTIMIZATION

WEDNESDAY

Chair: Sašo Džeroski

FRIDAY

THURSDAY

Sponsored by


41

TUESDAY 42


PROGRAM AT A GLANCE

10:00 - 10:30

BEST STUDENT ML PAPER PRESENTATION

10:30 - 11:00

COFFEE & TEA BREAK

CONGRESS HALL

CONGRESS HALL

CONGRESS HALL

MEZZANINE HALL

1

2

3

4

2+3

Neural Networks & Deep Learning 1

Time Series & Streams 1

Ensembles & Meta Learning

Applied Data Science 1

LUNCH BREAK

12:40 - 14:00

14:00 - 15:40

Probabilistic Models & Methods 1

Networks & Graphs 1

15:40 - 16:00


Nectar 1

Privacy & Security

Discovery Challenge 1

COFFEE & TEA BREAK


Reinforcement Learning

Regression

FRIDAY

16:00 - 17:40

Pattern & Sequence Mining

THURSDAY

11:00 - 12:40

CONGRESS HALL

TUESDAY

Invited Talk ALEX GRAVES FRONTIERS IN RECURRENT NEURAL NETWORK RESEARCH

WEDNESDAY

09:00 - 10:00

MONDAY

CONGRESS HALL 1

MACEDONIAN OPERA & BALLET 19:00 - 20:00

Invited Talk JOHN QUACKENBUSH USING NETWORKS TO LINK GENOTYPE TO PHENOTYPE Demo Spotlights

20:00 - 23:00

____________ DEM0 1 & POSTER SESSION 1


43

TUESDAY

SEPTEMBER 19

Invited Talk


Frontiers in Recurrent Neural Network Research Alex Graves Google DeepMind, UK In the last few years, recurrent neural networks (RNNs) have become the Swiss army knife of sequence processing for machine learning. Problems involving long and complex data streams, such as speech recognition, machine translation and reinforcement learning from raw video are now routinely tackled with RNNs. However, significant limitations still exist for such systems, such as their ability to retain large amounts of information in memory, and the challenges of gradient-based training on very long sequences. My talk will review some of the new architectures and training strategies currently being developed to extend the frontiers of this exciting field.


Invited Talk


Using Networks to Link Genotype to Phenotype John Quackenbush Dana-Farber Cancer Institute, USA Harvard TH Chan School of Public Health, USA We know that genotype influences phenotype, but aside from a few highly penetrant Mendelian disorders, the link between genotype and phenotype is not well understood. We have used gene expression and genetic data to explore gene regulatory networks, to study phenotypic state transitions, and to analyze the connections between genotype and phenotype. I will describe how networks and their structure provide unique insight into how small effect variants influence phenotype.

Chair: Celine Veins

44


CONGRESS HALL 1

Best Student ML Paper Award

10:00 - 10:30

This paper proposes an ensemble method for time series forecasting tasks. Combining different forecasting models is a common approach to tackle these tasks. State-of-the-art methods track the loss of the available models and adapt their weights accordingly. Metalearning strategies such as stacking are also used in these tasks. We propose a metalearning approach for adaptively combining forecasting models that specializes them across the time series. Our assumption is that different forecasting models have different areas of expertise and a varying relative performance. Moreover, many time series show recurring structures due to factors such as seasonality. Therefore, the ability of a method to deal with changes in relative performance of models as well as recurrent changes in the data distribution can be very useful in dynamic environments. Our approach is based on an ensemble of heterogeneous forecasters, arbitrated by a metalearning model. This strategy is designed to cope with the different dynamics of time series and quickly adapt the ensemble to regime changes. We validate our proposal using time series from several real world domains. Empirical results show the competitiveness of the method in comparison to state-of-the-art approaches for combining forecasters.

WEDNESDAY


TUESDAY

Vitor Cerqueira, Luis Torgo, Fábio Pinto & Carlos Soares

MONDAY

Arbitrated Ensemble for Time Series Forecasting

FRIDAY

THURSDAY

Sponsored by


45

TUESDAY

SEPTEMBER 19

SESSIONS AT A GLANCE

Neural Networks and Deep Learning 1 Chair: Jesse Read

11:00 - 11:20 11:20 - 11:40 11:40 - 12:00 12:00 - 12:20 12:20 - 12:40

Harry Pratt, Bryan Williams, Frans Coenen, Yalin Zheng

Multimodal Classification for Analysing Social Media - NO SHOW Chi Thang Duong, Remi Lebret, Karl Aberer

Sequence Generation with Target Attention Yingce Xia, Fei Tian, Tao Qin, Nenghai Yu, Tie-Yan Liu

Weightless Neural Networks for Open Set Recognition Douglas O. Cardoso, João Gama, Felipe M. G. França

A Network Architecture for Multi-Multi Instance Learning Alessandro Tibo, Paolo Frasconi, Manfred Jaeger

Chair: Joao Gama

11:20 - 11:40 11:40 - 12:00 12:00 - 12:20 12:20 - 12:40

46

11:00-12:40

FCNNs: Fourier Convolutional Neural Networks

Time Series and Streams 1

11:00 - 11:20

CONGRESS HALL 1

Cost-Sensitive Perceptron Decision Trees for Imbalanced Drifting Data Streams Bartosz Krawczyk, Przemyslaw Skryjomski

Learning TSK Fuzzy Rules from Data Streams Ammar Shaker, Waleri Heldt, Eyke Huellermeier

Non-Parametric Online AUC Maximization Balazs Szorenyi, Snir Cohen, Shie Mannor

On-Line Dynamic Time Warping for Streaming Time Series Izaskun Oregi, Aritz Perez, Javier Del Ser, Jose Lozano

PowerCast: Mining and Forecasting Power Grid Sequences Hyun Ah Song, Bryan Hooi, Marko Jereminov, Amritanshu Pandey, Larry Pileggi, Christos Faloutsos


CONGRESS HALL 2 11:00-12:40

CONGRESS HALL 3

Ensembles and Meta Learning

11:00-12:40

Chair: Albrecht Zimmermann Adaptive Random Forests for Evolving Data Stream Classification

12:20 - 12:40

MONDAY

12:00 - 12:20

Dynamic Ensemble Selection with Probabilistic Classifier Chains Anil Narassiguin, Haytham Elghazel, Alexandre Aussem

Ensemble-Compression: A New Method for Parallel Training of Deep Neural Networks Shizhao Sun, Wei Chen, Jiang Bian, Xiaoguang Liu, Tie-Yan Liu

Fast and Accurate Density Estimation with Extremely Randomized Cutset Networks Nicola Di Mauro, Antonio Vergari, Teresa M.A. Basile, Floriana Esposito

CONGRESS HALL 4


11:00-12:40

Chair: Yasemin Altun

11:20 - 11:40 11:40 - 12:00 12:00 - 12:20 12:20 - 12:40

Probabilistic Inference of Twitter Users' Age Based on What They Follow Benjamin Chamberlain, Marc Deisenroth, Clive Humby

WEDNESDAY

11:00 - 11:20

Unsupervised Signature Extraction from Forensic Logs Stefan Thaler, Vlado Menkovski, Milan Petkovic

Event Detection and Summarization Using Phrase Networks: PhraseNet Sara Melvin, Wenchao Yu, Peng Ju, Sean Young, Wei Wang

Stance Classification of Tweets Using Skip Char NGrams Yaakov HaCohen-Kerner, Ziv Ido, Ronen Ya'Akobov

Optimal Client Recommendation for Market Makers in Illiquid Financial Products Dieter Hendricks, Stephen Roberts

CONGRESS HALL 1

Probabilistic Models and Methods 1

14:00-15:40

Chair: Giuseppe Manco

14:00 - 14:20 14:20 - 14:40

14:40 - 15:00

Bayesian Heatmaps: Probabilistic Classification with Multiple Unreliable Information Sources Edwin Simpson, Steven Reece, Stephen Roberts

Bayesian Inference for Least Squares Temporal Difference Regularization Nikolaos Tziortziotis, Christos Dimitrakakis

Discovery of Causal Models That Contain Latent Variables Through Bayesian Scoring of Independence Constraints Fattaneh Jabbari, Gregory Cooper, Joseph Ramsey, Peter Spirtes

15:00 - 15:20 15:20 - 15:40

TUESDAY

11:40 - 12:00

Classification of High-Dimensional Evolving Data Streams Via a Resource-Efficient Online Ensemble Tingting Zhai, Yang Gao, Hao Wang, Longbing Cao

THURSDAY

11:20 - 11:40

Heitor M. Gomes, Albert Bifet, Jesse Read, Jean Paul Barddal, Fabrício Enembreck, Bernhard Pfharinger, Geoff Holmes, Talel Abdessalem

FRIDAY

11:00 - 11:20

Efficient Parameter Learning of Bayesian Network Classifiers Nayyar A. Zaidi, Geoffrey I. Webb, Mark J. Carman, François Petitjean, Wray Buntine, Mike Hynes, Hans De Sterck

MixedTrails: Bayesian Hypothesis Comparison on Heterogeneous Sequential Data Martin Becker, Florian Lemmerich, Philipp Singer, Markus Strohmaier, Andreas Hotho


47

TUESDAY

SEPTEMBER 19

Networks and Graphs 1


Chair: Marc Plantevit

14:00 - 14:20 14:20 - 14:40 14:40 - 15:00 15:00 - 15:20 15:20 - 15:40

Attributed Graph Clustering with Unimodal Normalized Cut Wei Ye, Linfei Zhou, Xin Sun, Claudia Plant, Christian Böhm

Ensemble-Based Community Detection in Multilayer Networks Andrea Tagarelli, Alessia Amelio, Francesco Gullo

K-Clique-Graphs for Dense Subgraph Discovery Giannis Nikolentzos, Polykarpos Meladianos, Yannis Stavrakas, Michalis Vazirgiannis

Local Community Detection in Multilayer Networks Roberto Interdonato, Andrea Tagarelli, Dino Ienco, Arnaud Sallaberry, Pascal Poncelet

Local Lanczos Spectral Approximation for Membership Identification Pan Shi, Kun He, David Bindel, John Hopcroft

Pattern and Sequence Mining


Chair: Stefan Kramer

14:00 - 14:20 14:20 - 14:40 14:40 - 15:00 15:00 - 15:20 15:20 - 15:40

BeatLex: Summarizing and Forecasting Time Series with Patterns Bryan Hooi, Shenghua Liu, Asim Smailagic, Christos Faloutsos

Behavioral Constraint Template-Based Sequence Classification Johannes De Smedt, Galina Deeva, Jochen De Weerdt

Efficient Sequence Regression By Learning Linear Models in All-Subsequence Space Severin Gsponer, Barry Smyth, Georgiana Ifrim

Flexible Constrained Sampling with Guarantees for Pattern Mining Vladimir Dzyuba, Matthijs Van Leeuwen, Luc De Raedt

Subjectively Interesting Connecting Trees Florian Adriaens, Jefrey Lijffijt, Tijl De Bie


CONGRESS HALL 4

Chair: Yasemin Altun

14:00 - 14:20 14:20 - 14:40 14:40 - 15:00

14:00-15:40

Automatic Detection and Recognition of Individuals in Patterned Species Gullal Singh Cheema, Saket Anand

Sequential Keystroke Behavioral Biometrics for User Identification Via Multi-View Deep Learning Lichao Sun, Yuqi Wang, Bokai Cao, Philip Yu, Witawas Srisa-An, Alex Leow

Modeling the Temporal Nature of Human Behavior for Demographics Prediction Bjarke Felbo, Pål Sundsøy, Alex 'Sandy' Pentland, Sune Lehmann, Yves-Alexandre De Montjoye

RSSI Based Supervised Learning for Uncooperative Direction-Finding 15:00 - 15:20

15:20 - 15:40

48

Tathagata Mukherjee, Michael Duckett, Piyush Kumar, Jared Paquet, Daniel Rodriguez, Mallory Haulcomb, Kevin George, Eduardo Pasiliao

Urban Water Flow and Water Level Prediction Based on Deep Learning Haytham Assem, Salem Ghariba, Gabor Makrai, Paul Johnston, Laurence Gill, Francesco Pilla


MEZZANINE HALL 2+3

Nectar 1

14:00-15:40

Chairs: Donato Malerba & Jerzy Stefanowski Comparing Hypotheses on Sequential Behavior: A Bayesian Approach and Its Applications

15:00 - 15:20 15:20 - 15:40

Phenotype Inference from Text and Genomic Data

MONDAY

14:40 - 15:00

Music Generation Using Bayesian Networks Tetsuro Kitahara Maria Brbic, Matija Piskorec, Vedrana Vidulin, Anita Krisko, Tomislav Smuc, Fran Supek

Data-Driven Approaches for Smart Parking Fabian Bock, Sergio Di Martino, Monika Sester

Process-Based Modeling and Design of Dynamical Systems Jovan Tanevski, Nikola Simidjievski, Ljupco Todorovski, Sašo Džeroski

CONGRESS HALL 1

Probabilistic Models and Methods 2

16:00-17:40


16:20 - 16:40 16:40 - 17:00 17:00 - 17:20 17:20 - 17:40

Gaussian Conditional Random Fields Extended for Directed Graphs Tijana Vujicic, Jesse Glass, Fang Zhou, Zoran Obradovic

WEDNESDAY

16:00 - 16:20

Multi-View Generative Adversarial Networks Mickael Chen, Ludovic Denoyer

PAC-Bayesian Analysis for a Two-Step Hierarchical Mutliview Learning Approach Anil Goyal, Emilie Morvant, Pascal Germain, Massih-Reza Amini

Robust Multi-View Topic Modeling By Incorporating Detecting Anomalies Guoxi Zhang, Tomoharu Iwata, Hisashi Kashima

Semi-Supervised Bayesian Deep Multi-Modal Emotion Recognition - NO SHOW Changde Du, Changying Du, Jinpeng Li, Wei-long Zheng, Baoliang Lv, Huiguang He

CONGRESS HALL 2

Reinforcement Learning

16:00-17:40

Chair: Kurt Driessens

16:20 - 16:40 16:40 - 17:00 17:00 - 17:20 17:20 - 17:40

Generalized Exploration in Policy Search

FRIDAY

16:00 - 16:20

TUESDAY

14:20 - 14:40

Florian Lemmerich, Philipp Singer, Martin Becker, Lisette Espin-Noboa, Dimitar Dimitrov, Denis Helic, Andreas Hotho, Markus Strohmaier

THURSDAY

14:00 - 14:20

Herke Van Hoof, Daniel Tanneberg, Jan Peters

Generalized Inverse Reinforcement Learning on Linearly Solvable MDP Masahiro Kohjima, Tatsushi Matsubayashi, Hiroshi Sawada

Max K-Armed Bandit: On the ExtremeHunter Algorithm and Beyond Mastane Achab, Stephan Clémençon, Aurélien Garivier, Anne Sabourin, Claire Vernade

Offline Reinforcement Learning With Task Hierarchies Devin Schwab, Soumya Ray

Variational Thompson Sampling for Relational Recurrent Bandits Sylvain Lamprier, Thibault Gisselbrecht, Patrick Gallinari


49

TUESDAY

SEPTEMBER 19 CONGRESS HALL 3

Regression

16:00-17:40

Chair: Nikola Simidjievski

16:00 - 16:20 16:20 - 16:40 16:40 - 17:00 17:00 - 17:20 17:20 - 17:40

Adaptive Skip-Train Structured Regression for Temporal Networks Martin Pavlovski, Fang Zhou, Ivan Stojkovic, Ljupco Kocarev, Zoran Obradovic

ALADIN: A New Approach for Drug--Target Interaction Prediction Krisztian Buza, Ladislav Peska

Co-Regularised Support Vector Regression Katrin Ullrich, Michael Kamp, Thomas Gärtner, Martin Vogt, Stefan Wrobel

Online Regression with Controlled Label Noise Rate Edward Moroshko, Koby Crammer

Robust Regression Using Biased Objectives Matthew J. Holland, Kazushi Ikeda

Privacy and Security


Chair: Tobias Scheffer

16:00 - 16:20 16:20 - 16:40 16:40 - 17:00 17:00 - 17:20 17:20 - 17:40

Differentially Private Nearest Neighbor Classification Mehmet Emre Gursoy, Ali Inan, Mehmet Ercan Nergiz, Yucel Saygin

Malware Detection By Analysing Encrypted Network Traffic with Neural Networks Paul Prasse, Lukas Machlika, Tomas Pevny, Jiri Havelka, Tobias Scheffer

PEM: Practical Differentially Private System for Large-Scale Cross-Institutional Data Mining Li Yi, Yitao Duan, Wei Xu

Preserving Differential Privacy in Convolutional Deep Belief Networks NhatHai Phan, Xintao Wu, Dejing Dou

The Best Privacy Defense Is a Good Privacy Offense: Obfuscating a Search Engine User'S Profile Jörg Wicker, Stefan Kramer

MACEDONIAN OPERA & BALLET

Demo 1 Chairs: Jesse Read & Marinka Žitnik

WHODID: Web-Based Interface for Human-Assisted Factory Operations in Fault Detection Gouy-Pailler Cedric, Pierre Blanchart

TrAnET: Tracking and Analyzing the Evolution of Topics in Information Networks Livio Bioglio, Ruggero Pensa, Valentina Rho

Framework for Exploring and Understanding Multivariate Correlations Louis Kirsch, Niklas Riekenbrauck, Daniel Thevessen, Marcus Pappik, Axel Stebner, Julius Kunze, Alexander Meissner, Arvind Kumar Shekar, Emmanuel Müller

Tetrahedron: Barycentric Measure Visualizer Dariusz Brzezinski, Jerzy Stefanowski, Izabela Szczęch, Robert Susmaga

Delve: A Data Set Retrieval and Document Analysis System Uchenna Akujuobi, Xiangliang Zhang

50


20:00

MEZZANINE HALL 2+3

Discovery Challenge 1

16:00-17:40

Chair: Dino Ienco

Organizers : Dino Ienco, UMR TETIS - IRSTEA, Montpellier, France Raffaele Gaetano, UMR TETIS - CIRAD, Montpellier, France https://sites.google.com/site/dinoienco/tiselc

16:15 - 16:35

Introduction Dino Ienco, Raffaele Gaetano

WEDNESDAY

16:00 - 16:15

End-to-End Learning of Deep Spatio-Temporal Representations for Hyperspectral Image Time Series Classification - 1st Place Nicola Di Mauro, Antonio Vergari, Teresa M.A. Basile, Fabrizio G. Ventola and Floriana Esposito

17:15 - 17:35

Sergey Ryabukhin

TiSeLaC: A Distance Metric Learning Approach Bac Nguyen Cong, Christina Papagiannopoulou

THURSDAY

16:55 - 17:15

Time Series Land Cover Classification Challenge: Solution Report - 2nd Place

The IRISA/LETG Ensemble Classifier for the TiSeLaC Challenge Romain Tavenard, Simon Malinowski, Nicolas Courty

FRIDAY

16:35 - 16:55

TUESDAY

Nowadays, modern earth observation programs produce huge volumes of satellite images time series (SITS) that can be useful to monitor geographical areas through time. How to efficiently analyze such kind of information is still an open question in the remote sensing field. In the context of land cover classification, exploiting time series of satellite images, instead that one single image, can be fruitful to distinguish among classes based on the fact they have different temporal profiles. The objective of this challenge is to bring closer the Machine Learning and Remote Sensing communities to work on such kind of data. The Machine Learning community has the opportunity to validate and test their approaches on real world data in an application context that is getting more and more attention due to the increasing availability of SITS data while, this challenge offers to the Remote Sensing experts a way to discover and evaluate new data mining and machine learning methods to deal with SITS data. The challenge involves a multi-class single label classification problem where the examples to classify are pixels described by the time series of satellite images and the prediction is related to the land cover of associated to each pixel.

MONDAY

TISELAC Challenge : Time Series Land Cover Classification


51

TUESDAY

SEPTEMBER 19

SESSIONS WITH ABSTRACTS

Neural Networks & Deep Learning 1 11:00 - 11:20

CONGRESS HALL 1

FCNNs: Fourier Convolutional Neural Networks Harry Pratt, Bryan Williams, Frans Coenen, Yalin Zheng

The Fourier domain is used in computer vision and machine learning as image analysis tasks in the Fourier domain are analogous to spatial domain methods but are achieved using different operations. Convolutional Neural Networks (CNNs) use machine learning to achieve state-of-the-art results with respect to many computer vision tasks. One of the main limiting aspects of CNNs is the computational cost of updating a large number of convolution parameters. Further, in the spatial domain, larger images take exponentially longer than smaller image to train on CNNs due to the operations involved in convolution methods. Consequently, CNNs are often not a viable solution for large image computer vision tasks. In this paper a Fourier Convolution Neural Network (FCNN) is proposed whereby training is conducted entirely within the Fourier domain. The advantage offered is that there is a significant speed up in training time without loss of effectiveness. Using the proposed approach larger images can therefore be processed within viable computation time. The FCNN is fully described and evaluated. The evaluation was conducted using the benchmark Cifar10 and MNIST datasets, and a bespoke fundus retina image dataset. The results demonstrate that convolution in the Fourier domain gives a significant speed up without adversely affecting accuracy. For simplicity the proposed FCNN concept is presented in the context of a basic CNN architecture, however, the FCNN concept has the potential to improve the speed of any neural network system involving convolution. 11:20 - 11:40

Multimodal Classification for Analysing Social Media Chi Thang Duong, Remi Lebret, Karl Aberer

NO SHOW

This paper was accepted for presentation. However, it was not presented at the conference and is thus not published in the conference proceedings.

11:40 - 12:00

Sequence Generation with Target Attention Yingce Xia, Fei Tian, Tao Qin, Nenghai Yu, Tie-Yan Liu

Source-target attention mechanism (briefly, source attention) has become one of the key components in a wide range of sequence generation tasks, such as neural machine translation, image caption, and open-domain dialogue generation. In these tasks, the attention mechanism, typically in control of information flow from the encoder to the decoder, enables to generate every component in the target sequence relying on different source components. While source attention mechanism has attracted many research interests, few of them turn eyes to if the generation of target sequence can additionally benefit from attending back to itself, which however is intuitively motivated by the nature of attention. To investigate the question, in this paper, we propose a new target-target attention mechanism (briefly, target attention). Along the progress of generating target sequence, target attention mechanism takes into account the relationship between the component to generate and its preceding context within the target sequence, such that it can better keep the coherent consistency and improve the readability of the generated sequence. Furthermore, it complements the information from source attention so as to further enhance semantic adequacy. After designing an effective approach to incorporate target attention in encoder-decoder framework, we conduct extensive experiments on both neural machine translation and image caption. Experimental results clearly demonstrate the effectiveness of our design of integrating both source and target attention for sequence generation tasks.

52


12:00 - 12:20

Weightless Neural Networks for Open Set Recognition Douglas O. Cardoso, João Gama, Felipe M. G. França

A Network Architecture for Multi-multi Instance Learning Alessandro Tibo, Paolo Frasconi, Manfred Jaeger

Time Series & Streams 1 11:00 - 11:20

WEDNESDAY

We study an extension of the multi-instance learning problem where examples are organized as nested bags of instances (e.g., a document could be represented as a bag of sentences, which in turn are bags of words). This framework can be useful in various scenarios, such as graph classification, image classification and translation-invariant pooling in convolutional neural network. In order to learn multi-multi instance data, we introduce a special neural network layer, called bag-layer, whose units aggregate sets of inputs of arbitrary size. We prove that the associated class of functions contains all Boolean functions over sets of sets of instances. We present empirical results on semi-synthetic data showing that such class of functions can be actually learned from data. We also present experiments on citation graphs datasets where our model obtains competitive results.

TUESDAY

12:20 - 12:40

MONDAY

Open set recognition is a classification-like task. It is accomplished not only by the identification of observations which belong to targeted classes (i.e., the classes among those represented in the training sample which should be later recognized) but also by the rejection of inputs from other classes in the problem domain. The need for proper handling of elements of classes beyond those of interest is frequently ignored, even in works found in the literature. This leads to the improper development of learning systems, which may obtain misleading results when evaluated in their test beds, consequently failing to keep the performance level while facing some real challenge. The adaptation of a classifier for open set recognition is not always possible: the probabilistic premises most of them are built upon are not valid in a open-set setting. Still, this paper details how this was realized for WiSARD a weightless artificial neural network model. Such achievement was based on an elaborate distancelike computation this model provides and the definition of rejection thresholds during training. The proposed methodology was tested through a collection of experiments, with distinct backgrounds and goals. The results obtained confirm the usefulness of this tool for open set recognition.

CONGRESS HALL 2

Cost-Sensitive Perceptron Decision Trees for Imbalanced Drifting Data Streams

11:20 - 11:40

Learning TSK Fuzzy Rules from Data Streams Ammar Shaker, Waleri Heldt, Eyke Huellermeier

Learning from data streams has received increasing attention in recent years, not only in the machine learning community but also in other research fields, such as computational intelligence and fuzzy systems. In particular, several rule-based methods for the incremental induction of regression models have been proposed. In this paper, we develop a method that combines the strengths of two existing approaches rooted in different learning paradigms. Our method induces a set of fuzzy rules, which, compared to conventional rules with Boolean antecedents, has the advantage of producing smooth regression functions. To do so, it makes use of an induction technique inspired by AMRules, a very efficient and effective learning algorithm that can be seen as the state of the art in machine learning. We conduct a comprehensive experimental study showing that a combination of the expressiveness of fuzzy rules with the algorithmic concepts of AMRules yields a learning system with superb performance.


53

FRIDAY

Mining streaming and drifting data is among the most popular contemporary applications of machine learning methods. Due to the potentially unbounded number of instances arriving rapidly, evolving concepts and limitations imposed on utilized computational resources, there is a need to develop efficient and adaptive algorithms that can handle such problems. These learning difficulties can be further augmented by appearance of skewed distributions during the stream progress. Class imbalance in non-stationary scenarios is highly challenging, as not only imbalance ratio may change over time, but also the class relationships. In this paper we propose an efficient and fast cost-sensitive decision tree learning scheme for handling online class imbalance. In each leaf of the tree we train a perceptron with output adaptation to compensate for skewed class distributions, while McDiarmid’s bound is used for controlling the splitting attribute selection. The cost matrix automatically adapts itself to the current imbalance ratio in the stream, allowing for a smooth compensation of evolving class relationships. Furthermore, we analyze the characteristics of minority class instances and incorporate this information during the training process. It allows our classifier to focus on most difficult instances, while a sliding window keeps track of the changes in class structures. Experimental analysis carried out on a number of binary and multi-class imbalanced data streams indicate the usefulness of the proposed approach.

THURSDAY

Bartosz Krawczyk, Przemyslaw Skryjomski

TUESDAY

SEPTEMBER 19

11:40 - 12:00

Non-Parametric Online AUC Maximization Balazs Szorenyi, Snir Cohen, Shie Mannor

We consider the problems of online and one-pass maximization of the area under the ROC curve (AUC). AUC maximization is hard even in the offline setting and thus solutions often make some compromises. Existing results for the online problem typically optimize for some proxy defined via surrogate losses instead of maximizing the real AUC. This approach is confirmed by results showing that the optimum of these proxies, over the set of all (measurable) functions, maximize the AUC. The problem is that---in order to meet the strong requirements for per round run time complexity---online methods typically work with restricted hypothesis classes and this, as we show, corrupts the above compatibility and causes the methods to converge to suboptimal solutions even in some simple stochastic cases. To remedy this, we propose a different approach and show that it leads to asymptotic optimality. Our theoretical claims and considerations are tested by experiments on real datasets, which provide empirical justification to them.

12:00 - 12:20

On-Line Dynamic Time Warping for Streaming Time Series Izaskun Oregi, Aritz Perez, Javier Del Ser, Jose Lozano

Dynamic Time Warping is a well-known measure of dissimilarity between time series. Due to its flexibility to deal with non-linear distortions along the time axis, this measure has been widely utilized in machine learning models for this particular kind of data. Nowadays, the proliferation of streaming data sources has ignited the interest and attention of the scientific community around on-line learning models. In this work, we naturally adapt Dynamic Time Warping to the on-line learning setting. Specifically, we propose a novel on-line measure of dissimilarity for streaming time series which combines a warp constraint and a weighted memory mechanism to simplify the time series alignment and adapt to non-stationary data intervals along time. Computer simulations are analyzed and discussed so as to shed light on the performance and complexity of the proposed measure. 12:20 - 12:40

PowerCast: Mining and Forecasting Power Grid Sequences Hyun Ah Song, Bryan Hooi, Marko Jereminov, Amritanshu Pandey, Larry Pileggi, Christos Faloutsos

What will be the power consumption of our institution at 8am for the upcoming days? What will happen to the power consumption of a small factory, if it wants to double (or half) its production? Technologies associated with the smart electrical grid are needed. Central to this process are algorithms that accurately model electrical load behavior, and forecast future electric power demand. However, existing power load models fail to accurately represent electrical load behavior in the grid. In this paper, we propose PowerCast, a novel domain-aware approach for forecasting the electrical power demand, by carefully incorporating domain knowledge. Our contributions are as follows: 1. Infusion of domain expert knowledge: We represent the time sequences using an equivalent circuit model, the ``BIG” model, which allows for an intuitive interpretation of the power load, as the BIG model is derived from physics-based first principles. 2. Forecasting of the power load: Our PowerCast uses the BIG model, and provides (a) accurate prediction in multi-step-ahead forecasting, and (b) extrapolations, under what-if scenarios, such as variation in the demand (say, due to increase in the count of people on campus, or a decision to half the production in our factory etc.) 3. Anomaly detection: PowerCast can spot and, even explain, anomalies in the given time sequences. The experimental results based on two real world datasets of up to three weeks duration, demonstrate that PowerCast is able to forecast several steps ahead, with 59% error reduction, compared to the competitors. Moreover, it is fast, and scales linearly with the duration of the sequences.

Ensembles & Meta Learning 11:00 - 11:20

CONGRESS HALL 3

Adaptive Random Forests for Evolving Data Stream Classification Heitor M. Gomes, Albert Bifet, Jesse Read, Jean Paul Barddal, Fabrício Enembreck, Bernhard Pfharinger, Geoff Holmes, Talel Abdessalem

Random forests is currently one of the most used machine learning algorithms in the non-streaming (batch) setting. This preference is attributable to its high learning performance and low demands with respect to input preparation and hyper-parameter tuning. However, in the challenging context of evolving data streams, there is no random forests algorithm that can be considered state-of-the-art in comparison to bagging and boosting based algorithms. In this work, we present the adaptive random forest (ARF) algorithm for classification of evolving data streams. In contrast to previous attempts of replicating random forests for data stream learning, ARF includes an effective resampling method and adaptive operators that can cope with different types of concept drifts without complex optimizations for different data sets. We present experiments with a parallel implementation of ARF which has no degradation in terms of classification performance in comparison to a serial implementation, since trees and adaptive operators are independent from one another. Finally, we compare ARF with state-of-the-art algorithms in a traditional testthen-train evaluation and a novel delayed labelling evaluation, and show that ARF is accurate and uses a feasible amount of resources.

54


11:20 - 11:40

Classification of High-dimensional Evolving Data Streams via a Resource-efficient Online Ensemble

Dynamic Ensemble Selection with Probabilistic Classifier Chains Anil Narassiguin, Haytham Elghazel, Alexandre Aussem

Dynamic ensemble selection (DES) is the problem of finding, given an input x, a subset of models among the ensemble that achieves the best possible prediction accuracy. Recent studies have reformulated the DES problem as a multi-label classification problem and promising performance gains have been reported. However, their approaches may converge to an incorrect, and hence suboptimal, solution as they don’t optimize the true - but non standard - loss function directly. In this paper, we show that the label dependencies have to be captured explicitly and propose a DES method based on Probabilistic Classifier Chains. Experimental results on 20 benchmark data sets show the effectiveness of the proposed method against competitive alternatives, including the aforementioned multi-label approaches. Keywords: Dynamic ensemble selection, Multi-label learning, Probabilistic Classifier Chains

Ensemble-Compression: A New Method for Parallel Training of Deep Neural Networks Shizhao Sun, Wei Chen, Jiang Bian, Xiaoguang Liu, Tie-Yan Liu

Parallelization framework has become a necessity to speed up the training of deep neural networks (DNN) recently. Such framework typically employs the Model Average approach, denoted as MA-DNN, in which parallel workers conduct respective training based on their own local data while the parameters of local models are periodically communicated and averaged to obtain a global model which serves as the new start of local models. However, since DNN is a highly non-convex model, averaging parameters cannot ensure that such global model can perform better than those local models. To tackle this problem, we introduce a new parallel training framework called Ensemble-Compression, denoted as EC-DNN. In this framework, we propose to aggregate the local models by ensemble, i.e., averaging the outputs of local models instead of the parameters. As most of prevalent loss functions are convex to the output of DNN, the performance of ensemble-based global model is guaranteed to be at least as good as the average performance of local models. However, a big challenge lies in the explosion of model size since each round of ensemble can give rise to multiple times size increment. Thus, we carry out model compression after each ensemble, specialized by a distillation based method in this paper, to reduce the size of the global model to be the same as the local ones. Our experimental results demonstrate the prominent advantage of EC-DNN over MA-DNN in terms of both accuracy and speedup. 12:20 - 12:40

Fast and Accurate Density Estimation with Extremely Randomized Cutset Networks

WEDNESDAY

12:00 - 12:20

THURSDAY

11:40 - 12:00

TUESDAY

A novel online ensemble strategy, ensemble BPegasos(EBPegasos), is proposed to solve the problems simultaneously caused by concept drifting and the curse of dimensionality in classifying high-dimensional evolving data streams, which has not been addressed in the literature. First, EBPegasos uses BPegasos, an online kernelized SVM-based algorithm, as the component classifier to address the scalability and sparsity of high-dimensional data. Second, EBPegasos takes full advantage of the characteristics of BPegasos to cope with various types of concept drifts. Specifically, EBPegasos constructs diverse component classifiers by controlling the budget size of BPegasos; it also equips each component with a drift detector to monitor and evaluate its performance, and modifies the ensemble structure only when large performance degradation occurs. Such conditional structural modification strategy makes EBPegasos strike a good balance between exploiting and forgetting old knowledge. Lastly, we first prove experimentally that EBPegasos is more effective and resource-efficient than the tree ensembles on high-dimensional data. Then comprehensive experiments on synthetic and real-life datasets also show that EBPegasos can cope with various types of concept drifts significantly better than the state-of-the-art ensemble frameworks when all ensembles use BPegasos as the base learner.

MONDAY

Tingting Zhai, Yang Gao, Hao Wang, Longbing Cao

Cutset Networks (CNets) are recently introduced density estimators leveraging context-specific independences to provide exact inference in polynomial time. Learning a CNet is done first by building a weighted probabilistic OR tree and then estimating tractable distributions as its leaves. In practice, selecting an optimal OR split node requires cubic time in the number of the data features, and even approximate heuristics still scale in quadratic time. We introduce Extremely Randomized Cutset Networks (XCNets), CNets whose OR tree is learned by performing random conditioning. This simple yet surprisingly effective approach reduces the complexity of OR node selection to constant time. While the likelihood of an XCNet is slightly worse than an optimally learned CNet, ensembles of XCNets outperform state-of-the-art density estimators on a series of standard benchmark datasets, yet employing only a fraction of the time needed to learn the competitors.


55

FRIDAY

Nicola Di Mauro, Antonio Vergari, Teresa M.A. Basile, Floriana Esposito

TUESDAY

SEPTEMBER 19

Applied Data Science 1 11:00 - 11:20

CONGRESS HALL 4

Probabilistic Inference of Twitter Users’ Age Based on What They Follow Benjamin Chamberlain, Marc Deisenroth, Clive Humby

Twitter provides an open and rich source of data for studying human behaviour at scale and is widely used in social and network sciences. However, a major criticism of Twitter data is that demographic information is largely absent. Enhancing Twitter data with user ages would advance our ability to study social network structures, information flows and the spread of contagions. Approaches toward age detection of Twitter users typically focus on specific properties of tweets, e.g., linguistic features, which are language dependent. In this paper, we devise a languageindependent methodology for determining the age of Twitter users from data that is native to the Twitter ecosystem. The key idea is to use a Bayesian framework to generalise ground-truth age information from a few Twitter users to the entire network based on what/whom they follow. Our approach scales to inferring the age of 700 million Twitter accounts with high accuracy. 11:20 - 11:40

Unsupervised Signature Extraction from Forensic Logs Stefan Thaler, Vlado Menkovski, Milan Petkovic

Signature extraction is a key part of forensic log analysis. It involves recognizing patterns in log lines such that log lines that originated from the same line of code are grouped together. A log signature consists of immutable parts and mutable parts. The immutable parts define the signature, and the mutable parts are typically variable parameter values. In practice, the number of log lines and signatures can be quite large, and the task of detecting and aligning immutable parts of the logs to extract the signatures becomes a significant challenge. We propose a novel method based on a neural language model that outperforms the current state-of-the-art on signature extraction. We use an RNN auto-encoder to create an embedding of the log lines. Log lines embedded in such a way can be clustered to extract the signatures in an unsupervised manner.

11:40 - 12:00

Event Detection and Summarization Using Phrase Networks: PhraseNet Sara Melvin, Wenchao Yu, Peng Ju, Sean Young, Wei Wang

Identifying events in real-time data streams such as Twitter is crucial for many occupations to make timely, actionable decisions. It is however extremely challenging because of the subtle difference between ``events’’ and trending topics, the definitive rarity of these events, and the complexity of modern Internet’s text data. Existing approaches often utilize topic modeling technique and keywords frequency to detect events on Twitter, which have three main limitations: 1) supervised and semi-supervised methods run the risk of missing important, breaking news events; 2) existing topic/event detection models are base on words, while the correlations among phrases are ignored; 3) many previous methods identify trending topics as events. To address these limitations, we propose the model, PhraseNet, an algorithm to detect and summarize events from tweets. To begin, all topics are defined as a clustering of high-frequency phrases extracted from text. All trending topics are then identified based on temporal spikes of the phrase cluster frequencies. PhraseNet thus filters out high-confidence events from other trending topics using number of peaks and variance of peak intensity. We evaluate PhraseNet on a three month duration of Twitter data and show the both the efficiency and the effectiveness of our approach. 12:00 - 12:20

Stance Classification of Tweets Using Skip Char NGrams Yaakov HaCohen-Kerner, Ziv Ido, Ronen Ya’akobov

In this research, we focus on automatic supervised stance classification of tweets. Given test datasets of tweets from five various topics, we try to classify the stance of the tweet authors as either in FAVOR of the target, AGAINST it, or NONE. We apply eight variants of seven supervised machine learning methods and three filtering methods using the WEKA platform. The macro-average results obtained by our algorithm are significantly better than the state-of-art results re-ported by the best macro-average results achieved in the SemEval 2016 Task 6-A for all the five released datasets. In contrast to the competitors of the SemEval 2016 Task 6-A, who did not use any char skip ngrams but rather used thousands of ngrams and hundreds of word embedding features, our algorithm uses a few tens of features mainly character-based features where most of them are skip char ngram features.

56


12:20 - 12:40

Optimal Client recommendation for Market Makers in Illiquid Financial Products Dieter Hendricks, Stephen Roberts

MONDAY

The process of liquidity provision in financial markets can result in prolonged exposure to illiquid instruments for market makers. In this case, where a proprietary position is not desired, pro-actively targeting the right client who is likely to be interested can be an effective means to offset this position, rather than relying on commensurate interest arising through natural demand. In this paper, we consider the inference of a client profile for the purpose of corporate bond recommendation, based on typical recorded information available to the market maker. Given a historical record of corporate bond transactions and bond meta-data, we use a topic-modelling analogy to develop a probabilistic technique for compiling a curated list of client recommendations for a particular bond that needs to be traded, ranked by probability of interest. We show that a model based on Latent Dirichlet Allocation offers promising performance to deliver relevant recommendations for sales traders.

Edwin Simpson, Steven Reece, Stephen Roberts Unstructured data from diverse sources, such as social media and aerial imagery, can provide valuable up-to-date information for intelligent situation assessment. Mining these different information sources could bring major benefits to applications such as situation awareness in disaster zones and mapping the spread of diseases. Such applications depend on classifying the situation across a region of interest, which can be depicted as a spatial â€œheatmap”. Annotating unstructured data using crowdsourcing or automated classifiers produces individual classifications at sparse locations that typically contain many errors. In this paper, we propose a novel Bayesian approach that models the relevance, error rates and bias of each information source, enabling us to learn a spatial Gaussian Process classifier by aggregating data from multiple sources with varying reliability and relevance. Our method does not require gold-labelled data and can make predictions at any location in an area of interest given only sparse observations. We show empirically that our approach can handle noisy and biased data sources, and that simultaneously inferring reliability and transferring information between neighbouring reports leads to more accurate predictions. We demonstrate our method on two real-world problems from disaster response, showing how our approach reduces the amount of crowdsourced data required and can be used to generate valuable heatmap visualisations from SMS messages and satellite images.

14:20 - 14:40

Bayesian Inference for Least Squares Temporal Difference Regularization Nikolaos Tziortziotis, Christos Dimitrakakis

14:40 - 15:00

FRIDAY

This paper proposes a fully Bayesian approach for Least-Squares Temporal Differences (LSTD), resulting in fully probabilistic inference of value functions that avoids the overfitting commonly experienced with classical LSTD when the number of features is larger than the number of samples. Sparse Bayesian learning provides an elegant solution through the introduction of a prior over value function parameters. This gives us the advantages of probabilistic predictions, a sparse model, and good generalisation capabilities, as irrelevant parameters are marginalised out. The algorithm efficiently approximates the posterior distribution through variational inference. We demonstrate the ability of the algorithm in avoiding overfitting experimentally.

Discovery of Causal Models That Contain Latent Variables through Bayesian Scoring of Independence Constraints Fattaneh Jabbari, Gregory Cooper, Joseph Ramsey, Peter Spirtes

Discovering causal structure from observational data in the presence of latent variables remains an active research area. Constraint-based causal discovery algorithms are relatively efficient at discovering such causal models from data using independence tests. Typically, however, they derive and output only one such model. In contrast, Bayesian methods can generate and probabilistically score multiple models, outputting the most probable one; however, they are often computationally infeasible to apply when modeling latent variables. We introduce a hybrid method that derives a Bayesian probability that the set of independence tests associated with a given causal model are jointly correct. Using this constraintbased scoring method, we are able to score multiple causal models, which possibly contain latent variables, and output the most probable one. The structure-discovery performance of the proposed method is compared to an existing constraint-based method (RFCI) using data generated from several previously published Bayesian networks. The structural Hamming distances of the output models improved when using the proposed method compared to RFCI, especially for small sample sizes.


TUESDAY

Bayesian Heatmaps: Probabilistic Classification with Multiple Unreliable Information Sources

WEDNESDAY

14:00 - 14:20

CONGRESS HALL 1

THURSDAY


57

TUESDAY

SEPTEMBER 19

15:00 - 15:20

Efficient Parameter Learning of Bayesian Network Classifiers Nayyar A. Zaidi, Geoffrey I. Webb, Mark J. Carman, François Petitjean, Wray Buntine, Mike Hynes, Hans De Sterck

Recent advances have demonstrated substantial benefits from learning with both generative and discriminative parameters. On the one hand, generative approaches address the estimation of the parameters of the joint distribution—P(y, x), which for most network types is very computationally efficient (a notable exception to this are Markov networks) and on the other hand, discriminative approaches address the estimation of the parameters of the posterior distribution—and, are more effective for classification, since they fit P(y|x) directly. However, discriminative approaches are less computationally efficient as the normalization factor in the conditional log-likelihood precludes the derivation of closed-form estimation of parameters. This paper introduces a new discriminative parameter learning method for Bayesian network classifiers that combines in an elegant fashion parameters learned using both generative and discriminative methods. The proposed method is discriminative in nature, but uses estimates of generative probabilities to speed-up the optimization process. A second contribution is to propose a simple framework to characterize the parameter learning task for Bayesian network classifiers. We conduct an extensive set of experiments on 72 standard datasets and demonstrate that our proposed discriminative parameterization provides an efficient alternative to other state-of-the-art parameterizations.

15:20 - 15:40

MixedTrails: Bayesian Hypothesis Comparison on Heterogeneous Sequential Data Martin Becker, Florian Lemmerich, Philipp Singer, Markus Strohmaier, Andreas Hotho

Sequential traces of user data are frequently observed online and offline, e.g., as sequences of visited websites or as sequences of locations captured by GPS. However, understanding factors explaining the production of sequence data is a challenging task, especially since the data generation is often not homogeneous. For example, navigation behavior might change in different phases of browsing a website or movement behavior may vary between groups of users. In this work, we tackle this task and propose MixedTrails, a Bayesian approach for comparing the plausibility of hypotheses regarding the generative processes of heterogeneous sequence data. Each hypothesis is derived from existing literature, theory, or intuition and represents a belief about transition probabilities between a set of states that can vary between groups of observed transitions. For example, when trying to understand human movement in a city and given some data, a hypothesis assuming tourists to be more likely to move towards points of interests than locals can be shown to be more plausible than a hypothesis assuming the opposite. Our approach incorporates such hypotheses as Bayesian priors in a generative mixed transition Markov chain model, and compares their plausibility utilizing Bayes factors. We discuss analytical and approximate inference methods for calculating the marginal likelihoods for Bayes factors, give guidance on interpreting the results, and illustrate our approach with several experiments on synthetic and empirical data from Wikipedia and Flickr. Thus, this work enables a novel kind of analysis for studying sequential data in many application areas.

Networks & Graphs 1 14:00 - 14:20

CONGRESS HALL 2

Attributed Graph Clustering with Unimodal Normalized Cut Wei Ye, Linfei Zhou, Xin Sun, Claudia Plant, Christian Böhm

Graph vertices are often associated with attributes. For example, in addition to their connection relations, people in friendship networks have personal attributes, such as interests, age, and residence. Such graphs (networks) are called attributed graphs. The detection of clusters in attributed graphs is of great practical relevance, e.g., targeting ads. Attributes and edges often provide complementary information. The effective use of both types of information promises meaningful results. In this work, we propose a method called UNCut (for Unimodal Normalized Cut) to detect cohesive clusters in attributed graphs. A cohesive cluster is a subgraph that has densely connected edges and has as many homogeneous (unimodal) attributes as possible. We adopt the normalized cut to assess the density of edges in a graph cluster. To evaluate the unimodality of attributes, we propose a measure called unimodality compactness which exploits Hartigans’ dip test. Our method UNCut integrates the normalized cut and unimodality compactness in one framework such that the detected clusters have low normalized cut and unimodality compactness values. Extensive experiments on various synthetic and real-world data verify the effectiveness and efficiency of our method UNCut compared with state-of-the-art approaches.

58


14:20 - 14:40

Ensemble-Based Community Detection in Multilayer Networks

k-clique-graphs for Dense Subgraph Discovery

Giannis Nikolentzos, Polykarpos Meladianos, Yannis Stavrakas, Michalis Vazirgiannis Finding dense subgraphs in a graph is a fundamental graph mining task, with applications in several fields. Algorithms for identifying dense subgraphs are used in biology, in finance, in spam detection, etc. Standard formulations of this problem such as the problem of finding the maximum clique of a graph are hard to solve. However, some tractable formulations of the problem have also been proposed, focusing mainly on optimizing some density function, such as the degree density and the triangle density. However, maximization of degree density usually leads to large subgraphs with small density, while maximization of triangle density does not necessarily lead to subgraphs that are close to being cliques. In this paper, we introduce the k-clique-graph densest subgraph problem, k > k >> k_n. Compared to k-means’ O(nkd) complexity, our k2-means complexity is significantly lower, at the expense of slightly increasing the memory complexity by O(nk_n+k2). In our extensive experiments k2-means is an order of magnitude faster than standard methods in computing accurate clusterings on several standard datasets and settings with hundreds of clusters and high dimensional data. Moreover, the proposed divisive initialization generally leads to clustering energies comparable to those achieved with the standard k-means++ initialization, while being significantly faster.

80


11:40 - 12:00

Learning Constraints in spreadsheets and tabular data Samuel Kolb, Sergey Paramonov, Tias Guns, Luc De Raedt

Cluster analysis over Moving Object Databases (MODs) is a challenging research topic that has attracted the attention of the mobility data mining community. In this paper, we study the temporal-constrained sub-trajectory cluster analysis problem, where the aim is to discover clusters of sub-trajectories given an ad-hoc, user-specified temporal constraint within the dataset’s lifetime. The problem is challenging because: (a) the time window is not known in advance, instead it is specified at query time, and (b) the MOD is continuously updated with new trajectories. Existing solutions first filter the trajectory database according to the temporal constraint, and then apply a clustering algorithm from scratch on the filtered data. However, this approach is extremely inefficient, when considering explorative data analysis where multiple clustering tasks need to be performed over different temporal subsets of the database, while the database is updated with new trajectories. To address this problem, we propose an incremental and scalable solution to the problem, which is built upon a novel indexing structure, called Representative Trajectory Tree (ReTraTree). ReTraTree acts as an effective spatio-temporal partitioning technique; partitions in ReTraTree correspond to groupings of sub-trajectories, which are incrementally maintained and assigned to representative (sub-)trajectories. Due to the proposed organization of sub-trajectories, the problem under study can be efficiently solved as simply as executing a query operator on ReTraTree, while insertion of new trajectories is supported. Our extensive experimental study performed on real and synthetic datasets shows that our approach outperforms a state-of-the-art in-DBMS solution supported by PostgreSQL by orders of magnitude. 12:20 - 12:40

Pivot-Based Distributed k-Nearest Neighbor Mining Caitlin Kuhlman, Yizhou Yan, Lei Cao, Elke Rundensteiner

k-nearest-neighbor (kNN) search is a fundamental data mining task critical to many data analytics methods. Yet no effective techniques to date scale kNN search to large datasets. In this work we present PkNN, an exact distributed method that by leveraging modern distributed architectures for the first time scales kNN search to billion point datasets. The key to the PkNN strategy is a multi-round kNN search that exploits pivot-based data partitioning at each stage. This includes an outlier-driven partition adjustment mechanism that effectively minimizes data duplication and achieves a balanced workload across the compute cluster. Aggressive data-driven bounds along with a tiered support assignment strategy ensure correctness while limiting computation costs. Our experimental study on multi-dimensional real-world data demonstrates that PkNN achieves significant speedup over the state-of-the-art and scales effectively in data cardinality and dimension.

11:00 - 11:20

CONGRESS HALL 3

FRIDAY

Feature Selection & Extraction

TUESDAY

Nikos Pelekis, Panagiotis Tampakis, Marios Vodas, Christos Doulkeridis, Yannis Theodoridis

WEDNESDAY

On Temporal-Constrained Sub-Trajectory Cluster Analysis

THURSDAY

12:00 - 12:20

MONDAY

Spreadsheets, comma separated value files and other tabular data representations are in wide use today. However, writing, maintaining and identifying good formulas for tabular data and spreadsheets can be time-consuming and error-prone. We investigate the automatic learning of constraints (formulas and relations) in raw tabular data in an unsupervised way. We represent common spreadsheet formulas and relations through predicates and expressions whose arguments must satisfy the inherent properties of the constraint. The challenge is to automatically infer the set of constraints present in the data, without labeled examples or user feedback. We propose a two-stage generate and test method where the first stage uses constraint solving techniques to efficiently reduce the number of candidates, based on the predicate signatures. Our approach takes inspiration from inductive logic programming, constraint learning and constraint satisfaction. We show that we are able to accurately discover constraints in spreadsheets from various sources.

Deep Discrete Hashing with Self-Supervised Labels Jingkuan Song, Tao He, Hangbo Fan, lianli Gao

Hashing methods have been widely used for applications of large-scale image retrieval and classification. Non-deep hashing methods using handcrafted features have been significantly outperformed by deep hashing methods due to their better feature representation and end-to-end learning framework. However, the most striking successes in deep hashing have mostly involved discriminative models, which require labels. In this paper, we propose a novel unsupervised deep hashing method, named Deep Discrete Hashing (DDH), for large-scale image retrieval and classification. In the proposed framework, we address two main problems: 1) how to directly learn discrete binary codes? 2) how to equip the binary representation with the ability of accurate image retrieval and classification in an unsupervised way? We resolve these problems by introducing an intermediate variable and a loss function steering the learning process, which is based on the neighborhood structure in the original space. Experiments on real datasets show that our method can significantly outperform other unsupervised methods to achieve the stateof-the-art performance for image retrieval and object recognition.


81

WEDNESDAY 11:20 - 11:40

SEPTEMBER 20 Including Multi-feature interactions and Redundancy for Feature Ranking in Mixed Datasets Arvind Kumar Shekar, Tom Bocklisch, Patricia Iglesias Sanchez, Christoph-Nikolas Straehle, Emmanuel Müller

Feature ranking is beneficial to gain knowledge and to identify the relevant features from a high-dimensional dataset. However, in several datasets, few features by itself might have small correlation with the target classes, but by combining these features with some other features, they can be strongly correlated with the target. This means that multiple features exhibit interactions among themselves. It is necessary to rank the features based on these interactions for better analysis and classifier performance. However, evaluating these interactions on large datasets is computationally challenging. Furthermore, datasets often have features with redundant information. Using such redundant features hinders both efficiency and generalization capability of the classifier. The major challenge is to efficiently rank the features based on relevance and redundance on mixed datasets. In this work, we propose a filter-based framework based on Relevance and Redundancy (RaR), RaR computes a single score that quantifies the feature relevance by considering interactions between features and redundancy. The top ranked features of RaR are characterized by maximum relevance and non-redundance. The evaluation on synthetic and real world datasets demonstrates that our approach outperforms several state-of-the-art feature selection techniques. 11:40 - 12:00

Non-Redundant Spectral Dimensionality Reduction Yochai Blau, Tomer Michaeli

Spectral dimensionality reduction algorithms are widely used in numerous domains, including for recognition, segmentation, tracking and visualization. However, despite their popularity, these algorithms suffer from a major limitation known as the ``repeated Eigen-directions’’ phenomenon. That is, many of the embedding coordinates they produce typically capture the same direction along the data manifold. This leads to redundant and inefficient representations that do not reveal the true intrinsic dimensionality of the data. In this paper, we propose a general method for avoiding redundancy in spectral algorithms. Our approach relies on replacing the orthogonality constraints underlying those methods by unpredictability constraints. Specifically, we require that each embedding coordinate be unpredictable (in the statistical sense) from all previous ones. We prove that these constraints necessarily prevent redundancy, and provide a simple technique to incorporate them into existing methods. As we illustrate on challenging high-dimensional scenarios, our approach produces significantly more informative and compact representations, which improve visualization and classification tasks.

12:00 - 12:20

Rethinking Unsupervised Feature Selection: From Pseudo Labels to Pseudo Must-Links Xiaokai Wei, Sihong Xie, Bokai Cao, Philip Yu

High-dimensional data are prevalent in various machine learning applications. Feature selection is a useful technique for alleviating the curse of dimensionality. Unsupervised feature selection problem tends to be more challenging than its supervised counterpart due to the lack of class labels. State-of-the-art approaches usually use the concept of pseudo labels to select discriminative features by their regression coefficients and the pseudo-labels derived from clustering is usually inaccurate. In this paper, we propose a new perspective for unsupervised feature selection by Discriminatively Exploiting Similarity (DES). Through forming similar and dissimilar data pairs, implicit discriminative information can be exploited. The similar/dissimilar relationship of data pairs can be used as guidance for feature selection. Based on this idea, we propose hypothesis testing based and classification based methods as instantiations of the DES framework. We evaluate the proposed approaches extensively using six real-world datasets. Experimental results demonstrate that our approaches achieve significantly outperforms the state-of-the-art unsupervised methods. More surprisingly, our unsupervised method even achieves performance comparable to a supervised feature selection method. 12:20 - 12:40

SetExpan: Corpus-Based Set Expansion via Context Feature Selection and Rank Ensemble Jiaming Shen, Zeqiu Wu, Dongming Lei, Jingbo Shang, Xiang Ren, Jiawei Han

Corpus-based set expansion (i.e., finding the “complete” set of entities belonging to the same semantic class, based on a given corpus and a tiny set of seeds) is a critical task in knowledge discovery. It may facilitate numerous downstream applications, such as information extraction, taxonomy induction, question answering, and web search. To discover new entities in an expanded set, previous approaches either make onetime entity ranking based on distributional similarity, or resort to iterative pattern-based bootstrapping. The core challenge for these methods is how to deal with noisy context features derived from free-text corpora, which may lead to entity intrusion and semantic drifting. In this study, we propose a novel framework, SetExpan, which tackles this problem, with two techniques: (1) a context feature selection method that selects clean context features for calculating entity-entity distributional similarity, and (2) a ranking-based unsupervised ensemble method for expanding entity set based on denoised context features. Experiments on three datasets show that SetExpan is robust and outperforms previous state-of-the-art methods in terms of mean average precision.

82



CONGRESS HALL 4

Koopman Spectral kernels for comparing complex dynamics: Application to multiagent sport plays

Understanding the complex dynamics in the real-world such as in multi-agent behaviors is a challenge in numerous engineering and scientific fields. Spectral analysis using Koopman operators has been attracting attention as a way of obtaining a global modal description of a nonlinear dynamical system, without requiring explicit prior knowledge. However, when applying this to the comparison or classification of complex dynamics, it is necessary to incorporate the Koopman spectra of the dynamics into an appropriate metric. One way of implementing this is to design a kernel that reflects the dynamics via the spectra. In this paper, we introduced Koopman spectral kernels to compare the complex dynamics by generalizing the Binetâ€“Cauchy kernel to nonlinear dynamical systems without specifying an underlying model. We applied this to strategic multiagent sport plays wherein the dynamics can be classified, e.g., by the success or failure of the shot. We mapped the latent dynamic characteristics of multiple attackerâ€“defender distances to the feature space using our kernels and then evaluated the scorability of the play by using the features in different classification models.

11:20 - 11:40

MONDAY

Keisuke Fujii, Yuki Inaba, Yoshinobu Kawahara

Boosting Based Multiple Kernel Learning and Transfer Regression for Electricity Load Forecasting

DC-Prophet: Predicting Catastrophic Machine Failures in DataCenters You-Luen Lee, Da-Cheng Juan, Xuan-An Tseng, Yu-Ting Chen, Shih-Chieh Chang

When will a server fail catastrophically in an industrial datacenter? Is it possible to forecast these failures so preventive actions can be taken to increase the reliability of a datacenter? To answer these questions, we have studied what are probably the largest, publicly available datacenter traces, containing more than 104 million events from 12,500 machines. Among these samples, we observe and categorize three types of machine failures, all of which are catastrophic and may lead to information loss, or even worse, reliability degradation of a datacenter. We further propose a two-stage framework “DC-Prophet” based on One-Class Support Vector Machine and Random Forest. DC-Prophet extracts surprising patterns and accurately predicts the next failure of a machine. Experimental results show that DC-Prophet achieves an AUC of 0.93 in predicting the next machine failure, and a F3-score of 0.88 (out of 1). On average, DC-Prophet outperforms other classical machine learning methods by 39.45% in F3-score. 12:00 - 12:20

WEDNESDAY

11:40 - 12:00

THURSDAY

Accurate electricity load forecasting is of crucial importance for power system operation and smart grid energy management. Different factors, such as weather conditions, lagged values, and day types may affect electricity load consumption. We propose to use multiple kernel learning (MKL) for electricity load forecasting, as it provides more flexibilities than traditional kernel methods. Computation time is an important issue for short-term load forecasting, especially for energy scheduling demand. However, conventional MKL methods usually lead to complicated optimization problems. Another practical aspect of this application is that there may be very few data available to train a reliable forecasting model for a new building, while at the same time we may have prior knowledge learned from other buildings. In this paper, we propose a boosting based framework for MKL regression to deal with the aforementioned issues for short-term load forecasting. In particular, we first adopt boosting to learn an ensemble of multiple kernel regressors, and then extend this framework to the context of transfer learning. Experimental results on residential data sets show the effectiveness of the proposed algorithms.

TUESDAY

Di Wu, Boyu Wang, Doina Precup, Benoit Boulet

Quantifying Heterogeneous Causal Treatment Effects in World Bank Development Finance Projects

The World Bank provides billions of dollars in development finance to countries across the world every year. As many projects are related to the environment, we want to understand the World Bank projects impact to forest cover. However, the global extent of these projects results in substantial heterogeneity in impacts due to geographic, cultural, and other factors. Recent research by Athey and Imbens has illustrated the potential for hybrid machine learning and causal inferential techniques which may be able to capture such heterogeneity. We apply their approach using a geolocated dataset of World Bank projects, and augment this data with satellite-retrieved characteristics of their geographic context (including temperature, precipitation, slope, distance to urban areas, and many others). We use this information in conjunction with causal tree (CT) and causal forest (CF) approaches to contrast ‘control’ and ‘treatment’ geographic locations to estimate the impact of World Bank projects on vegetative cover.


83

FRIDAY

Jianing Zhao, Daniel Runfola, Peter Kemper

WEDNESDAY 12:20 - 12:40

SEPTEMBER 20 Taking It for a Test Drive: A Hybrid Spatio-temporal Model for Wildlife Poaching Prediction Evaluated through a Controlled Field Test Shahrzad Gholami, Benjamin Ford, Fei Fang, Andrew Plumptre, Milind Tambe, Margaret Driciru, Fred Wanyama, Aggrey Rwetsiba, Mustapha Nsubaga, Joshua Mabonga

Worldwide, conservation agencies employ rangers to protect conservation areas from poachers. However, agencies lack the manpower to have rangers effectively patrol these vast areas frequently. While past work has modeled poachers’ behavior so as to aid rangers in planning future patrols, those models’ predictions were not validated by extensive field tests. In this paper, we present a hybrid spatio-temporal model that predicts poaching threat levels and results from a five-month field test of our model in Uganda’s Queen Elizabeth Protected Area (QEPA). To our knowledge, this is the first time that a predictive model has been evaluated through such an extensive field test in this domain. We present two major contributions. First, our hybrid model consists of two components: (i) an ensemble model which can work with the limited data common to this domain and (ii) a spatio-temporal model to boost the ensemble’s predictions when sufficient data are available. When evaluated on real-world historical data from QEPA, our hybrid model achieves significantly better performance than previous approaches with either temporally-aware dynamic Bayesian networks or an ensemble of spatially-aware models. Second, in collaboration with the Wildlife Conservation Society and Uganda Wildlife Authority, we present results from a five-month controlled experiment, where rangers patrolled over 450 sq km across QEPA. We demonstrate that our model successfully predicted (1) where snaring activity would occur and (2) where it would not occur; in areas where we predicted a high rate of snaring activity, rangers found more snares than in areas of lower predicted activity. These findings demonstrate that (1) our model’s predictions are selective, (2) our model’s superior laboratory performance extends to the real world, and (3) these predictive models can aid rangers in focusing their efforts to prevent wildlife poaching and save animals.

Neural Networks & Deep Learning 2 14:00 - 14:20

CONGRESS HALL 1

CON-S2V: A Generic Framework for Incorporating Extra-Sentential Context Into Sen2Vec Tanay Kumar Saha, Shafiq Joty, Mohammad Al Hasan

We present a novel approach to learning distributed representation of sentences from unlabeled data by modeling content and context of a sentence. The content model learns sentence representation by predicting its words. The context model comprises a neighbor prediction component and a regularizer to model distributional and proximity hypotheses, respectively. We propose an online algorithm to train the model components jointly. For the first time, we evaluate the models in a setup, where contextual information is available to infer the sentence vectors. The experimental results on tasks involving classifying, clustering, and ranking sentences show that our model outperforms best existing models by a wide margin across multiple datasets. 14:20 - 14:40

Deep Over-Sampling Framework for Classifying Imbalanced Data Shin Ando, Chun Yuan Huang

Class imbalance is a challenging issue in practical classification problems for deep learning models as well as traditional models. Traditionally successful countermeasures such as synthetic over-sampling have had limited success with complex, structured data handled by deep learning models. In this paper, we propose Deep Over-sampling (DOS), a framework for extending the synthetic over-sampling method to the deep feature space acquired by a convolutional neural network (CNN). Its key feature is an explicit, supervised representation learning, for which the training data presents each raw input sample with a synthetic embedding target in the deep feature space, which is sampled from the linear subspace of in-class neighbors. We implement an iterative process of training the CNN and updating the targets, which induces smaller in-class variance among the embeddings, to increase the discriminative power of the deep representation. We present an empirical study using public benchmarks, which shows that the DOS framework not only counteracts class imbalance better than the existing method, but also improves the performance of the CNN in the standard, balanced settings.

84


14:40 - 15:00

Joint User Modeling Across Aligned Heterogeneous Sites using Neural Networks

The quality of user modeling is crucial for personalized recommender systems. Traditional within-site recommender systems aim at modeling user preferences using only actions within target site, thus suffer from cold-start problem. To alleviate such problem, researchers propose crossdomain models to leverage user actions from other domains within same site. Joint user modeling is later proposed to further integrate user actions from aligned sites for data enrichment. However, there are still limitations in existing works regarding the modeling of heterogeneous actions, the requirement of full alignment and the design of preferences coupling. To tackle these, we propose JUN: a joint user modeling framework using neural network. We take advantage of neural network’s capability of capturing different data types and its ability for mining highlevel non-linear correlations. Specifically, in additional to site-specific preferences models, we further introduce an auxiliary neural network to transfer knowledge between sites by fine-tuning the user embeddings using alignment information. We adopt JUN for item-based and text-based site to demonstrate its performance. Experimental results indicate that JUN outperforms both within-site and cross-site models. Specifically, JUN achieves relative improvement of 2.96% and 2.37% for item-based and text-based sites (5.77% and 13.54% for cold-start scenarios). Besides performance gain, JUN also achieves great generality and significantly extends the use scenarios. 15:00 - 15:20

Learning Deep Kernels in the Space of Dot Product Polynomials

MONDAY

Xuezhi Cao, Yong Yu

Wikipedia Vandal Early Detection: from User Behavior to User Embedding Shuhan Yuan, Panpan Zheng, Xintao Wu, Yang Xiang

Wikipedia is the largest online encyclopedia that allows anyone to edit articles. In this paper, we propose the use of deep learning to detect vandals based on their edit history. In particular, we develop a multi-source long-short term memory network (M-LSTM) to model user behaviors by using a variety of user edit aspects as inputs, including the history of edit reversion information, edit page titles and categories. With M-LSTM, we can encode each user into a low dimensional real vector, called user embedding. Meanwhile, as a sequential model, M-LSTM updates the user embedding each time after the user commits a new edit. Thus, we can predict whether a user is benign or vandal dynamically based on the up-to-date user embedding. Furthermore, those user embeddings are crucial to discover collaborative vandals.

Kernel Methods 1 Fair Kernel Learning

FRIDAY

14:00 - 14:20

CONGRESS HALL 2

Adrian Perez-Suay, Valero Laparra, Gonzalo Mateo-García, Jordi Muñoz-Marí, Luis Gómez-Chova, Gustau Camps-Valls New social and economic activities massively exploit big data and machine learning algorithms to do inference on people’s lives. Applications include automatic curricula evaluation, wage determination, and risk assessment for credits and loans. Recently, many governments and institutions have raised concerns about the lack of fairness, equity and ethics in machine learning to treat these problems. It has been shown that not including sensitive features that bias fairness, such as gender or race, is not enough to mitigate the discrimination when other related features are included. Instead, including fairness in the objective function has been shown to be more efficient. We present novel fair regression and dimensionality reduction methods built on a previously proposed fair classification framework. Both methods rely on using the Hilbert Schmidt independence criterion as the fairness term. Unlike previous approaches, this allows us to simplify the problem and to use multiple sensitive variables simultaneously. Replacing the linear formulation by kernel functions allows the methods to deal with nonlinear problems. For both linear and nonlinear formulations the solution reduces to solving simple matrix inversions or generalized eigenvalue problems. This simplifies the evaluation of the solutions for different trade-off values between the predictive error and fairness terms. We illustrate the usefulness of the proposed methods in toy examples, and evaluate their performance on real world datasets to predict income using gender and/or race discrimination as sensitive variables, and contraceptive method prediction under demographic and socio-economic sensitive descriptors.


WEDNESDAY

15:20 - 15:40

THURSDAY

Recent literature has shown the merits of having deep representations in the context of neural networks. An emerging challenge in kernel learning is the definition of similar deep representations. In this paper, we propose a general methodology to define a hierarchy of base kernels with increasing expressiveness and combine them via multiple kernel learning (MKL) with the aim to generate overall deeper kernels. As a leading example, this methodology is applied to learning the kernel in the space of Dot-Product Polynomials (DPPs), that is a positive combination of homogeneous polynomial kernels (HPKs). We show theoretical properties about the expressiveness of HPKs that make their combination empirically very effective. This can also be seen as learning the coefficients of the Maclaurin expansion of any definite positive dot product kernel thus making our proposed method generally applicable. We empirically show the merits of our approach comparing the effectiveness of the kernel generated by our method against baseline kernels (including homogeneous and non homogeneous polynomials, RBF, etc...) and against another hierarchical approach on several benchmark datasets.

TUESDAY

Michele Donini, Fabio Aiolli

85

WEDNESDAY 14:20 - 14:40

SEPTEMBER 20 GaKCo: A Fast Gapped k-mer String Kernel using Counting Ritambhara Singh, Arshdeep Sekhon, Andrew Norton, Jack Lanchantin, Kamran Kowsari, Beilun Wang, Yanjun Qi

String Kernel (SK) techniques, especially those using gapped k-mers as features (gk), have obtained great success in classifying sequences like DNA, protein, and text. However, the state-of-the-art gk-SK runs extremely slow when we increase the dictionary size or allow more number of mismatches. This is because current gk-SK uses a trie-based algorithm to calculate co-occurrence of mismatched substrings resulting in a time cost proportional to dictionary size and mismatches. We propose a fast algorithm for calculating Gapped k-mer Kernel using counting (GaKCo). GaKCo uses associative arrays to calculate the co-occurrence of substrings using cumulative counting. This algorithm is fast, scalable to larger dictionary and mismatches, and naturally parallelizable. We provide a rigorous asymptotic analysis that compares GaKCo with the state-of-the-art gk-SK. Theoretically, the time cost of GaKCo is independent of the dictionary term that slows down the trie-based approach. Experimentally, we observe that GaKCo achieves the same accuracy as the state-of-the-art and outperforms its speed by factors of 2, 100, and 4, on classifying sequences of DNA (5 datasets), protein (12 datasets), and character-based English text (2 datasets). 14:40 - 15:00

Graph Enhanced Memory Networks for Sentiment Analysis Zhao Xu, Romain Vial, Kristian Kersting

Memory networks model information and knowledge as memories that can be manipulated for prediction and reasoning about questions of interest. In many cases, there exists complicated relational structure in the data, by which the memories can be linked together into graphs to propagate information. Typical examples include tree structure of a sentence and knowledge graph in a dialogue system. In this paper, we present a novel graph enhanced memory network GEMN to integrate relational information between memories for prediction and reasoning. Our approach introduces graph attentions to model the relations, and couples them with content-based attentions via an additional neural network layer. It thus can better identify and manipulate the memories related to a given question, and provides more accurate prediction about the final response. We demonstrate the effectiveness of the proposed approach with aspect based sentiment classification. The empirical analysis on real data shows the advantages of incorporating relational dependencies into the memory networks. 15:00 - 15:20

Kernel Sequential Monte Carlo Ingmar Schuster, Heiko Strathmann, Brooks Paige, Dino Sejdinovic

We propose kernel sequential Monte Carlo (KSMC), a framework for sampling from static target densities. KSMC is a family of sequential Monte Carlo (SMC) algorithms that are based on building emulator models of the current particle system in a reproducing kernel Hilbert space. We here focus on modelling nonlinear covariance structure and gradients of the target. The emulator’s geometry is adaptively updated and subsequently used to inform local proposals. Unlike in adaptive Markov chain Monte Carlo (MCMC), continuous adaptation does not compromise convergence of the sampler. KSMC combines the strengths of SMC and kernel methods: superior performance for multimodal targets and the ability to estimate model evidence as compared to MCMC, and the emulator’s ability to represent targets that exhibit high degrees of nonlinearity. As KSMC does not require access to target gradients, it is particularly applicable on targets whose gradients are unknown or prohibitively expensive. We describe necessary tuning details and demonstrate the benefits of the the proposed methodology on a series of challenging synthetic and real-world examples. 15:20 - 15:40

Learning Lukasiewicz Logic Fragments By Quadratic Programming Francesco Giannini, Michelangelo Diligenti, Marco Gori, Marco Maggini

In this paper we provide a framework to embed logical constraints into the classical learning scheme of kernel machines, that gives rise to a learning algorithm based on a quadratic programming problem. In particular, we show that, once the constraints are expressed using a specific fragment from the Lukasiewicz logic, the learning objective turns out to be convex. We formulate the primal and dual forms of a general multitask learning problem, where the functions to be determined are predicates (of any arity) defined on the feature space. The learning set contains both supervised examples for each predicate and unsupervised examples exploited to enforce the constraints. We give some properties of the solutions constructed by the framework along with promising experimental results.

86


Nectar 2 14:00 - 14:20

CONGRESS HALL 3 Recent Advances in Kernel-Based Graph Classification

We review our recent progress in the development of graph kernels. We discuss the hash graph kernel framework, which makes the computation of kernels for graphs with vertices and edges annotated with real-valued information feasible for large data sets. Moreover, we summarize our general investigation of the benefits of explicit graph feature maps in comparison to using the kernel trick. Our experimental studies on real-world data sets suggest that explicit feature maps often provide sufficient classification accuracy while being computed more efficiently. Finally, we describe how to construct valid kernels from optimal assignments to obtain new expressive graph kernels. These make use of the kernel trick to establish one-to-one correspondences. We conclude by a discussion of our results and their implication for the future development of graph kernels. 14:20 - 14:40

Activity-Driven Influence Maximization in Social Networks

MONDAY

Nils Kriege, Christopher Morris

QuickScorer: Efficient Traversal of Large Ensembles of Decision Trees Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Nicola Tonellotto, Rossano Venturini

Machine-learnt models based on additive ensembles of binary regression trees are currently deemed the best solution to address complex classification, regression, and ranking tasks. Evaluating these models is a computationally demanding task as it needs to traverse thousands of trees with hundreds of nodes each. The cost of traversing such large forests of trees significantly impacts their application to big and stream input data, when the time budget available for each prediction is limited to guarantee a given processing throughput. Document ranking in Web search is a typical example of this challenging scenario, where the exploitation of tree-based models to score query-document pairs, and finally rank lists of documents for each incoming query, is the state-of-art method for ranking (a.k.a. Learning-to-Rank). This paper presents QuickScorer, a novel algorithm for the traversal of huge decision trees ensembles that, thanks to a cache- and CPU-aware design, provides a ~9x speedup over best competitors. 15:00 - 15:20

An AI Planning System for Data Cleaning

WEDNESDAY

14:40 - 15:00

THURSDAY

Interaction networks consist of a static graph with a timestamped list of edges over which interaction took place. Examples of interaction networks are social networks whose users interact with each other through messages or location-based social networks where people interact by checking in to locations. Previous work on finding influential nodes in such networks mainly concentrate on the static structure imposed by the interactions or are based on fixed models for which parameters are learned using the interactions. In two recent works, however, we proposed an alternative activity data-driven approach based on the identification of influence propagation patterns. In the first work, we identify so-called information-channels to model potential pathways for information spread, while the second work exploits how users in a location-based social network check-in to locations in order to identify influential locations. To make our algorithms scalable, approximate versions based on sketching techniques from the data streams domain have been developed. Experiments show that in this way it is possible to efficiently find good seed sets for influence propagation in social networks.

TUESDAY

Rohit Kumar, Muhammad Saleem, Toon Calders, Torben Pedersen, Xike Xie

Data Cleaning represents a crucial and error prone activity in KDD that might have unpredictable effects on data analytics, affecting the believability of the whole KDD process. In this paper we describe how a bridge between AI Planning and Data Quality communities has been made, by expressing both the data quality and cleaning tasks in terms of AI planning. We also report a real-life application of our approach

15:20 - 15:40

Image Representation, Annotation and Retrieval with Predictive Clustering Trees Ivica Dimitrovski, Dragi Kocev, Suzana Loskovska, Sašo Džeroski

In this work, we summarize our work on using the predictive clustering framework for image analysis. More specifically, we used predictive clustering trees to generate image representations, that can then be used to perform image retrieval and/or image annotation. We evaluated the proposed method for performing image retrieval on general purpose images, and annotation of general purpose images, medical images and diatom images.


87

FRIDAY

Roberto Boselli, Mirko Cesarini, Fabio Mercorio, Mario Mezzanzanica

WEDNESDAY

SEPTEMBER 20


CONGRESS HALL 4

Analyzing Granger Causality in Climate Data with Time Series Classification Methods Christina Papagiannopoulou, Stijn Decubber, Diego Miralles, Matthias Demuzere, Niko Verhoest, Willem Waegeman

Attribution studies in climate science aim for scientifically ascertaining the influence of climatic variations on natural or anthropogenic factors. Many of those studies adopt the concept of Granger causality to infer statistical cause-effect relationships, while utilizing traditional autoregressive models. In this article, we investigate the potential of state-of-the-art time series classification techniques to enhance causal inference in climate science. We conduct a comparative experimental study of different types of algorithms on a large test suite that comprises a unique collection of datasets from the area of climate-vegetation dynamics. The results indicate that specialized time series classification methods are able to improve existing inference procedures. Substantial differences are observed among the methods that were tested.

14:20 - 14:40

CREST - Risk Prediction for Clostridium Difficile Infection Using Multimodal Data Mining Cansu Sen, Thomas Hartvigsen, Elke Rundensteiner, Kajal Claypool

Clostridium difficile infection (CDI) is a common hospital acquired infection with a $1B annual price tag that resulted in ~30,000 deaths in 2011. Studies have shown that early detection of CDI significantly improves the prognosis for the individual patient and reduces the overall mortality rates and associated medical costs. In this paper, we present CREST: CDI Risk Estimation, a data-driven framework for early and continuous detection of CDI in hospitalized patients. CREST uses a three-pronged approach for high accuracy risk prediction. First, CREST builds a rich set of highly predictive features from Electronic Health Records. These features include clinical and non-clinical phenotypes, key biomarkers from the patient’s laboratory tests, synopsis features processed from time series vital signs, and medical history mined from clinical notes. Given the inherent multimodality of clinical data, CREST bins these features into three sets: time-invariant, time-variant, and temporal synopsis features. CREST then learns classifiers for each set of features, evaluating their relative effectiveness. Lastly, CREST employs a second-order meta learning process to ensemble these classifiers for optimized estimation of the risk scores. We evaluate the CREST framework using publicly available critical care data collected for over 12 years from Beth Israel Deaconess Medical Center, Boston. Our results demonstrate that CREST predicts the probability of a patient acquiring CDI with an AUC of 0.76 five days prior to diagnosis. This value increases to 0.80 and even 0.82 for prediction two days and one day prior to diagnosis, respectively.

88



CONGRESS HALL 1

Regularizing Neural Knowledge Graph Embeddings via Equivalence and Inversion Axioms

Learning embeddings of entities and relations using neural architectures is an effective method of performing statistical learning on largescale relational data, such as knowledge graphs. In this paper, we consider the problem of regularizing the training of neural knowledge graph embeddings by leveraging external background knowledge. We propose a principled and scalable method for leveraging equivalence and inversion axioms during the learning process, by imposing a set of model-dependent soft constraints on the predicate embeddings. The method has several advantages: i) the number of introduced constraints does not depend on the number of entities in the knowledge base; ii) regularities in the embedding space effectively reflect available background knowledge; iii) it yields more accurate results in link prediction tasks over non-regularized methods; and iv) it can be adapted to a variety of models, without affecting their scalability properties. We demonstrate the effectiveness of the proposed method on several large knowledge graphs. Our evaluation shows that it consistently improves the predictive accuracy of several neural knowledge graph embedding models (for instance,the MRR of TransE on WordNet increases by 11%) without compromising their scalability properties.

Polina Rozenstein, Nikolaj Tatti, Aristides Gionis In this paper we study a problem of determining when entities are active based on their interactions with each other. More formally, we consider a set of entities V and a sequence of time-stamped edges E among the entities. Each edge (u,v,t) in E denotes an interaction between entities u and v that takes place at time t. We view this input as a temporal network. We then assume a simple activity model in which each entity is active during a short time interval. An interaction (u,v,t) can be explained if at least one of u or v are active at time t. Our goal is to reconstruct the activity intervals, for all entities in the network, so as to explain the observed interactions. This problem, which we refer to as the network-untangling problem, can be applied to discover timelines of events from complex interactions among entities. We provide two formulations for the networkuntangling problem: (i) minimizing the total interval length over all entities, and (ii) minimizing the maximum interval length. We show that the sum problem is NP-hard, while, surprisingly, the max problem can be solved optimally in linear time, using a mapping to 2-SAT. For the sum problem we provide efficient and effective algorithms based on realistic assumptions. Furthermore, we complement our study with an extensive evaluation on synthetic and real-world datasets, which demonstrates the validity of our concepts and the good performance of our algorithms.

16:40 - 17:00

TransT: Type-Based Multiple Embedding Representations for Knowledge Graph Completion

TUESDAY

The Network-untangling Problem: From Interactions to Activity Timelines

WEDNESDAY

16:20 - 16:40

MONDAY

Pasquale Minervini, Luca Costabello, Emir Munoz, Vit Novacek, Pierre-Yves Vandenbussche

Kernel Methods 2 16:00 - 16:20

FRIDAY

Knowledge graph completion with representation learning predicts new entity-relation triples from the existing knowledge graphs by embedding entities and relations into a vector space. Most existing methods focus on the structured information of triples and maximize the likelihood of them. However, they neglect semantic information contained in most knowledge graphs and the prior knowledge indicated by the semantic information. To overcome this drawback, we propose an approach that integrates the structured information and entity types which describe the categories of entities. Our approach constructs relation types from entity types and utilizes type-based semantic similarity of the related entities and relations to capture prior distributions of entities and relations. With the type-based prior distributions, our approach generates multiple embedding representations of each entity in different contexts and estimates the posterior probability of entity and relation prediction. Extensive experiments show that our approach outperforms previous semantics-based methods.

CONGRESS HALL 2

Bayesian Nonlinear Support Vector Machines for Big Data Florian Wenzel, Theo Galy-Fajou, Matthäus Deutsch, Marius Kloft

We propose a fast inference method for Bayesian nonlinear support vector machines that leverages stochastic variational inference and inducing points. Our experiments show that the proposed method is faster than competing Bayesian approaches and scales easily to millions of data points. It provides additional features over frequentist competitors such as accurate predictive uncertainty estimates and automatic hyperparameter search.


THURSDAY

Shiheng Ma, Jianhui Ding, Weijia Jia, Kun Wang, Minyi Guo

89

WEDNESDAY 16:20 - 16:40

SEPTEMBER 20 Entropic Trace Estimation for Log Determinants Jack Fitzsimons, Diego Granziol, Kurt Cutajar, Michael Osborne, Maurizio Filippone, Stephen Roberts

The scalable calculation of matrix determinants has been a bottleneck to the widespread application of many machine learning methods such as determinantal point processes, Gaussian processes, generalised Markov random fields, graph models and many others. In this work, we estimate log determinants under the framework of maximum entropy, given information in the form of moment constraints from stochastic trace estimation. The estimates demonstrate a significant improvement on state-of-the-art alternative methods, as shown on a wide variety of matrices from the SparseSuite Matrix Collection. By taking the example of a general Markov random field, we also demonstrate how this approach can significantly accelerate inference in large-scale learning methods involving the log determinant. 16:40 - 17:00

Nystrom Sketching Daniel Perry, Braxton Osting, Ross Whitaker

Despite prolific success, kernel methods become difficult to use in many large-scale unsupervised problems because of the evaluation and storage of the full Gram matrix. Here we overcome this difficulty by proposing a novel approach: compute the optimal small, out-of-sample Nystrom sketch which allows for fast approximation of the Gram matrix via the Nystrom method. We demonstrate and compare several methods for computing the optimal Nystrom sketch and show how this approach outperforms previous state-of-the-art Nystrom subset-based methods of similar size.

Learning & Optimization 2 16:00 - 16:20

CONGRESS HALL 3

A Constrained l1 minimization approach for estimating multiple Sparse Gaussian or Nonparanormal Graphical Models Beilun Wang, Ritambhara Singh, Yanjun Qi

Identifying context-specific entity networks from aggregated data is an important task, arising often in bioinformatics and neuroimaging applications. Computationally, this task can be formulated as jointly estimating multiple different, but related, sparse undirected graphical models (UGM) from aggregated samples across several contexts. Previous joint-UGM studies have mostly focused on sparse Gaussian graphical models (sGGMs) and can’t identify context-specific edge patterns directly. We, therefore, propose a novel approach, SIMULE (detecting Shared and Individual parts of MULtiple graphs Explicitly) to learn multi-UGM via a constrained l1 minimization. SIMULE automatically infers both specific edge patterns that are unique to each context and shared interactions preserved among all the contexts. Through the l1 constrained formulation, this problem is cast as multiple independent subtasks of linear programming that can be solved efficiently in parallel. In addition to Gaussian data, SIMULE can also handle multivariate Nonparanormal data that greatly relaxes the normality assumption that many real-world applications do not follow. We provide a novel theoretical proof showing that SIMULE achieves a consistent result at the rate O(log(Kp)/ntot). On multiple synthetic datasets and two biomedical datasets, SIMULE shows significant improvement over state-of-the-art multi-sGGM and single-UGM baselines (SIMULE implementation and the used datasets @https://github.com/QData/SIMULE).

16:20 - 16:40

Distributed Stochastic Optimization of the Regularized Risk via SaddlePoint Problem Shin Matsushima, Hyokun Yun, Xinhua Zhang, S.V.N. Vishwanathan

Many machine learning algorithms minimize a regularized risk, and stochastic optimization is widely used for this task. When working with massive data, it is desirable to perform stochastic optimization in parallel. Unfortunately, many existing stochastic optimization algorithms cannot be parallelized efficiently. In this paper we show that one can rewrite the regularized risk minimization problem as an equivalent saddle-point problem, and propose an efficient distributed stochastic optimization (DSO) algorithm. We prove the algorithm’s rate of convergence; remarkably, our analysis shows that the algorithm scales almost linearly with the number of processors. We also verify with empirical evaluations that the proposed algorithm is competitive with other parallel, general purpose stochastic and batch optimization algorithms for regularized risk minimization.

90


16:40 - 17:00

Sparse Probit Linear Mixed Model Stephan Mandt, Florian Wenzel, Shinichi Nakajima, John Cunningham, Christoph Lippert, Marius Kloft

MONDAY

Linear mixed models (LMMs) are important tools in statistical genetics. When used for feature selection, they allow to find a sparse set of genetic traits that best predict a continuous phenotype of interest, while simultaneously correcting for various confounding factors such as age, ethnicity and population structure. Formulated as models for linear regression, LMMs have been restricted to continuous phenotypes. We introduce the sparse probit linear mixed model (Probit-LMM), where we generalize the LMM modeling paradigm to binary phenotypes. As a technical challenge, the model no longer possesses a closed-form likelihood function. In this paper, we present a scalable approximate inference algorithm that lets us fit the model to high-dimensional data sets. We show on three real-world examples from different domains that in the setup of binary labels, our algorithm leads to better prediction accuracies and also selects features which show less correlation with the confounding factors.

Subgroup Discovery

16:20 - 16:40

Flash Points: Discovering exceptional pairwise behaviors in Vote or Rating Data Adnene Belfodil, Sylvie Cazalens, Philippe Lamarre, Marc Plantevit

We address the problem of discovering contexts that lead well-distinguished collections of individuals to change their pairwise agreement w.r.t. their usual one. For instance, in the European parliament, while in overall, a strong disagreement is witnessed between deputies of the far-right French party Front National and deputies of the left party Front de Gauche, a strong agreement is observed between these deputies in votes related to the thematic: External relations with the union. We devise the method DSC (Discovering Similarities Changes) which relies on exceptional model mining to uncover three-set patterns that identify contexts and two collections of individuals where an unex- pected strengthening or weakening of pairwise agreement is observed. To efficiently explore the search space, we define some closure operators and pruning techniques using upper bounds on the quality measure. In addition of handling usual attributes (e.g. numerical, nominal), we propose a novel pattern domain which involves hierarchical multi-tag attributes that are present in many datasets. A thorough empirical study on two real-world datasets (i.e., European parliament votes and collaborative movie reviews) demonstrates the efficiency and the effectiveness of our approach as well as the interest and the actionability of the patterns. 16:40 - 17:00

Identifying Consistent Statements About Numerical Data with Dispersion-Corrected Subgroup Discovery Mario Boley, Bryan R. Goldsmith, Luca M. Ghiringhelli, Jilles Vreeken

Existing algorithms for subgroup discovery with numerical targets do not optimize the error or target variable dispersion of the groups they find. This often leads to unreliable or inconsistent statements about the data, rendering practical applications, especially in scientific domains, futile. Therefore, we here extend the optimistic estimator framework for optimal subgroup discovery to a new class of objective functions: we show how tight estimators can be computed efficiently for all functions that are determined by subgroup size (non-decreasing dependence), the subgroup median value, and a dispersion measure around the median (non-increasing dependence). In the important special case when dispersion is measured using the mean absolute deviation from the median, this novel approach yields a linear time algorithm. Empirical evaluation on a wide range of datasets shows that, when used within branch-and-bound search, this approach is highly efficient and indeed discovers subgroups with much smaller errors.


91

TUESDAY

We propose a novel approach to finding explanations of deviating subsets, often called subgroups. Existing approaches for subgroup discovery rely on various quality measures that nonetheless often fail to find subgroup sets that are diverse, of high quality, and most importantly, provide good explanations of the deviations that occur in the data. To tackle this issue we introduce explanation networks, which provide a holistic view on all candidate subgroups and how they relate to each other, offering elegant ways to select high-quality yet diverse subgroup sets. Explanation networks are constructed by representing subgroups by nodes and having weighted edges represent the extent to which one subgroup explains another. Explanatory strength is defined by extending ideas from database causality, in which interventions are used to quantify the effect of one query on another. Given an explanatory network, existing network analysis techniques can be used for subgroup discovery. In particular, we study the use of PageRank for pattern ranking and seed selection (from influence maximization) for pattern set selection. Experiments on synthetic and real data show that the proposed approach finds subgroup sets that are more likely to capture the generative processes of the data than other methods.

WEDNESDAY

Antti Ukkonen, Vladimir Dzyuba, Matthijs Van Leeuwen

THURSDAY

Explaining Deviating Subsets Through Explanation Networks

FRIDAY

16:00 - 16:20

CONGRESS HALL 4

THURSDAY 92


PROGRAM AT A GLANCE CONGRESS HALL 1 09:00 - 10:00

Invited Talk PIERRE-PHILIPPE MATHIEU ENABLING A SMARTER PLANET WITH EARTH OBSERVATION

10:00 - 10:30

BEST STUDENT KDD PAPER PRESENTATION

10:30 - 11:00

COFFEE & TEA BREAK

11:00 - 12:40

CONGRESS HALL

CONGRESS HALL

CONGRESS HALL

CONGRESS HALL

MEZZANINE HALL

1

2

3

4

2+3

Networks & Graphs 3

Unsupervised & Semi-supervised Learning 2

Anomaly Detection





Computer Vision


LUNCH BREAK

12:40 - 14:00

14:00 - 15:40

Recommendation

Transfer & Multi-Task Learning 1

15:40 - 16:00

16:00 - 17:40

Time Series & Streams 2 COFFEE & TEA BREAK


Matrix & Tensor Factorization



Invited Talk CORDELIA SCHMID AUTOMATIC UNDERSTANDING OF THE VISUAL WORLD Demo Spotlights

20:00 - 23:00

____________ DEM0 2 & POSTER SESSION 2


93

THURSDAY

SEPTEMBER 21

CONGRESS HALL 1

Invited Talk

09:00 - 10:00

Enabling a Smarter Planet with Earth Observation Pierre-Philippe Mathieu ESA/ESRIN, EO Science, Applications and New Technologies, Italy Nowadays, teams of researchers around the world can easily access a wide range of open data across disciplines and remotely process them on the Cloud, combining them with their own data to generate knowledge, develop information products for societal applications, and tackle complex integrative complex problems that could not be addressed a few years ago. Such rapid exchange of digital data is fostering a new world of data-intensive research, characterized by openness, transparency, and scrutiny and traceability of results, access to large volume of complex data, availability of community open tools, unprecedented level of computing power, and new collaboration among researchers and new actors such as citizen scientists. The EO scientific community is now facing the challenge of responding to this new paradigm in science 2.0 in order to make the most of the large volume of complex and diverse data delivered by the new generation of EO missions, and in particular the Sentinels. In this context, ESA is supporting a variety of activities in partnership with research communities to ease the transition and make the most of the data. These include the generation of new open tools and exploitation platforms, exploring new ways to disseminate data, building new partnership with citizen scientists, and training the new generation of data scientists. The talk will give a brief overview of some of ESA activities aiming to facilitate the exploitation of large amounts of data from EO missions in a collaborative, cross-disciplinary, and open way, for uses ranging from science to applications and education.

Chair: Ljupčo Todorovski

Sponsored by

Invited Talk


Automatic Understanding of the Visual World Cordelia Schmid INRIA, France One of the central problems of artificial intelligence is machine perception, i.e., the ability to understand the visual world based on input from sensors, such as cameras. Computer vision is the area which analyzes visual input. In this talk, I will present recent progress in visual understanding. It is for the most part due to the design of robust visual representations and learned models capturing the variability of the visual world based on state-of-the-art machine learning techniques, including convolutional neural networks. Progress has resulted in technology for a variety of applications. I will present in particular results for human action recognition.


94


CONGRESS HALL 1

Best Student KDD Paper Award

10:00 - 10:30

Reliable evaluation of network mining tools implies significance and scalability testing. This is usually achieved by picking several graphs of various size from different domains. However, graph properties and thus evaluation results could be dramatically different from one domain to another. Hence the necessity of aggregating results over a multitude of graphs within each domain. The paper introduces an approach to automatically learn features of a directed graph from any domain and generate similar graphs while scaling input graph size with a real-valued factor. Generating multiple graphs with similar size allows significance testing, while scaling graph size makes scalability evaluation possible. The proposed method relies on embedding an input graph into low-dimensional space, thus encoding graph features in a set of node vectors. Edge weights and node communities could be imitated as well in optional steps. We demonstrate that embedding-based approach ensures variability of synthetic graphs while keeping degree and subgraphs distributions close to the original graphs. Therefore, the method could make significance and scalability testing of network algorithms more reliable without the need to collect additional data. We also show that embeddingbased approach preserves various features in generated graphs which can’t be achieved by other generators imitating a given graph.

WEDNESDAY

Chair: Ljupčo Todorovski

TUESDAY

Mikhail Drobyshevskiy, Anton Korshunov & Denis Turdakov

MONDAY

Learning and Scaling Directed Networks via Graph Embedding

FRIDAY

THURSDAY

Sponsored by


95

THURSDAY

SEPTEMBER 21

SESSIONS AT A GLANCE

Networks & Graphs 3


Chair: Toon Calders

11:00 - 11:20 11:20 - 11:40 11:40 - 12:00 12:00 - 12:20 12:20 - 12:40

Graph-Based Predictable Feature Analysis Björn Weghenkel, Asja Fischer, Laurenz Wiskott

Lagrangian Relaxations for Multiple Network Alignment Eric Malmi, Sanjay Chawla, Aristides Gionis

Measuring and Moderating Opinion Polarization in Social Networks Antonis Matakos, Evimaria Terzi, Panayiotis Tsaparas

Micro-Review Synthesis for Multi-Entity Summarization Thanh-Son Nguyen, Hady W. Lauw, Panayiotis Tsaparas

Survival Factorization for Topical Cascades on Diffusion Networks Nicola Barbieri, Giuseppe Manco, Ettore Ritacco

Unsupervised & Semi-supervised Learning 2

CONGRESS HALL 2

Chair: Ulf Brefeld

11:00 - 11:20 11:20 - 11:40 11:40 - 12:00 12:00 - 12:20 12:20 - 12:40

96

An Exponential Family Framework For Learning to Predict Unseen Classes Vinay Verma, Wenlin Wang, Piyush Rai

An Expressive Similarity Measure for Relational Clustering Using Neighbourhood Trees Sebastijan Dumančić, Hendrik Blockeel

DeepCluster: A General Clustering Framework Based on Deep Learning Kai Tian, Shuigeng Zhou, Jihong Guan

Local PurTree Subspace Spectral Clustering for Customer Transaction Data - NO SHOW Xiaojun Chen, JianZhe Zhang, Wenya Sun, Joshua Huang, Qingyao Wu

Multi-View Spectral Clustering on Conflicting Views Xiao He, Limin Li, Damian Roqueiro, Karsten Borgwardt


11:00-12:40

CONGRESS HALL 3

Anomaly Detection

11:00-12:40

Chair: Matthijs van Leeuwen

11:40 - 12:00 12:00 - 12:20 12:20 - 12:40

Fabrizio Angiulli

Efficient Top Rank Optimization with Gradient Boosting for Supervised Anomaly Detection Jordan Frery, Marc Sebban, Amaury Habrard, Olivier Caelen, Liyun Guelton

Robust, Deep and Inductive Anomaly Detection Raghavendra Chalapathy, Aditya Krishna Menon, Sanjay Chawla

MONDAY

11:20 - 11:40

Concentration Free Outlier Detection

Sentiment Informed Cyberbullying Detection in Social Media Harsh Dani, Jundong Li, Huan Liu

ZooRank: Ranking Suspicious Activities in Time-Evolving Tensors Hemank Lamba, Bryan Hooi, Kijung Shin, Christos Faloutsos, Juergen Pfeffer

CONGRESS HALL 4


11:00-12:40

Chair: Kamalika Das

11:40 - 12:00 12:00 - 12:20

12:20 - 12:40

Rui Chen, Jiajun Liu

WEDNESDAY

11:20 - 11:40

A Novel Framework for Online Sales Burst Prediction MRNet-Product2Vec: A Multi-Task Recurrent Neural Network for Product Embeddings Arijit Biswas, Mukul Bhutani, Subhajit Sanyal

Disjoint-Support Factors and Seasonality Estimation in E-Commerce Abhay Jha

Generalising Random Forest Parameter Optimisation to Include Stability and Cost Chak Hin Bryan Liu, Benjamin Chamberlain, Duncan Little, Angelo Cardoso

THURSDAY

11:00 - 11:20

Session-Based Fraud Detection in Online E-Commerce Transactions Using Recurrent Neural Networks Shuhao Wang, Cancheng Liu, Xiang Gao, Hongtao Qu, Wei Xu

CONGRESS HALL 1

Recommendation

14:20 - 14:40 14:40 - 15:00 15:00 - 15:20 15:20 - 15:40

FRIDAY

14:00-15:40

Chair: Myra Spiliopoulu

14:00 - 14:20

TUESDAY

11:00 - 11:20

A Regularization Method with Inference of Trust and Distrust in Recommender Systems Dimitrios Rafailidis, Fabio Crestani

A Unified Contextual Bandit Framework for Long- and Short-Term Recommendations Maryam Tavakol, Ulf Brefeld

Perceiving the Next Choice with Comprehensive Transaction Embeddings for Online Recommendation Shoujin Wang, Liang Hu, Longbing Cao, Xiaoshui Huang

Social Regularized Von Mises-Fisher Mixture Model for Item Recommendation Aghiles Salah, Mohamed Nadif

Tour Recommendation for Groups Aris Anagnostopoulos, Reem Atassi, Luca Becchetti, Adriano Fazzone, Fabrizio Silvestri


97

THURSDAY

SEPTEMBER 21



Chair: Gianvito Pio

14:00 - 14:20 14:20 - 14:40 14:40 - 15:00 15:00 - 15:20 15:20 - 15:40

Lifelong Machine Learning with Gaussian Processes Christopher Clingerman, Eric Eaton

Personalized Tag Recommendation for Images Using Deep Transfer Learning Hanh T.H. Nguyen, Martin Wistuba, Lars Schmidt-Thieme

Theoretical Analysis of Domain Adaptation with Optimal Transport Ievgen Redko, Amaury Habrard, Marc Sebban

TSP: Learning Task-Specific Pivots for Unsupervised Domain Adaptation Xia Cui, Frans Coenen, Danushka Bollegala

Varying-Coefficient Models for Geospatial Transfer Learning Matthias Bussas, Christoph Sawade, Nicolas Kühn, Tobias Scheffer, Niels Landwehr

Time Series & Streams 2


Chair: Panče Panov

14:00 - 14:20 14:20 - 14:40 14:40 - 15:00 15:00 - 15:20 15:20 - 15:40

A Multiscale Bezier-Representation for Time Series That Supports Elastic Matching Frank Hoeppner, Tobias Sobek

Cost Sensitive Time-Series Classification Shoumik Roychoudhury, Mohamed Ghalwash, Zoran Obradovic

Efficient Temporal Kernels Between Feature Sets for Time Series Classification Romain Tavenard, Simon Malinowski, Laetitia Chapel, Adeline Bailly, Heider Sanchez, Benjamin Bustos

Forecasting and Granger Modelling with Non-Linear Dynamical Dependencies Magda Gregorova, Alexandros Kalousis, Stephan Marchand-Maillet

UAPD: Predicting Urban Anomalies from Spatial-Temporal Data Xian Wu, Yuxiao Dong, Chao Huang, Jian Xu, Dong Wang, Nitesh Chawla


CONGRESS HALL 4

Chair: Kamalika Das

14:00 - 14:20

14:00-15:40

Predicting Self-Reported Customer Satisfaction of Interactions with a Corporate Call Center Joseph Bockhorst, Shi Yu, Luisa Polania Cabrera, Glenn Fung Moo

Have It Both Ways - from A/B Testing to A&B Testing with Exceptional Model Mining 14:20 - 14:40

14:40 - 15:00 15:00 - 15:20 15:20 - 15:40

98

Wouter Duivesteijn, Tara Farzami, Thijs Putman, Evertjan Peer, Hilde J.P. Weerts, Jasper Adegeest, Gerson Foks, Mykola Pechenizkiy

Structural Semantic Models for Automatic Analysis of Urban Areas Gianni Barlacchi, Alberto Rossi, Bruno Lepri, Alessandro Moschitti

Using Machine Learning for Labour Market Intelligence Roberto Boselli, Mirko Cesarini, Fabio Mercorio, Mario Mezzanzanica

SINAS: Suspect Investigation Using Offenders' Activity Space Mohammad Tayebi, Uwe Glässer, Patricia Brantingham, Hamed Yaghoubi Shahir


CONGRESS HALL 1


16:00-17:40

Chair: Indrė Žliobaitė

Labeled DBN Learning with Community Structure Knowledge Etienne Auclair, Nathalie Peyrard, Régis Sabbadin

Online Sparse Collapsed Hybrid Variational-Gibbs Algorithm for Hierarchical Dirichlet Process Topic Models

MONDAY

16:40 - 17:00

Pedram Daee, Tomi Peltola, Marta Soare, Samuel Kaski

Sophie Burkhardt, Stefan Kramer

17:20 - 17:40

Partial Device Fingerprints Michael Ciere, Carlos Ganan, Michel Van Eeten

Vine Copulas for Mixed Data : Multi-View Clustering for Mixed Data Beyond Meta-Gaussian Dependencies Lavanya Sita Tekumalla, Vaibhav Rajan, Chiranjib Bhattacharyya

CONGRESS HALL 2

Matrix & Tensor Factorization

16:00-17:40

Chair: Katharina Morik

16:20 - 16:40 16:40 - 17:00 17:00 - 17:20 17:20 - 17:40

C-SALT: Mining Class-Specific ALTerations in Boolean Matrix Factorization Sibylle Hess, Katharina Morik

WEDNESDAY

16:00 - 16:20

Comparative Study of Inference Methods for Bayesian Nonnegative Matrix Factorisation Thomas Brouwer, Jes Frellsen, Pietro Lio

Content-Based Social Recommendation with Poisson Matrix Factorization Eliezer De Souza Da Silva, Helge Langseth, Heri Ramampiaro

Feature Extraction for Incomplete Data Via Low-Rank Tucker Decomposition Qiquan Shi, Yiu-Ming Cheung, Qibin Zhao

Structurally Regularized Non-Negative Tensor Factorization for Spatio-Temporal Pattern Discoveries Koh Takeuchi, Yoshinobu Kawahara, Tomoharu Iwata



Chair: Pavel Brazdil

16:00 - 16:20

A Novel Rating Pattern Transfer Model for Improving Non-Overlapping Cross-Domain Collaborative Filtering

FRIDAY

17:00 - 17:20

Yizhou Zang, Xiaohua Hu 16:20 - 16:40 16:40 - 17:00 17:00 - 17:20

TUESDAY

16:20 - 16:40

Knowledge Elicitation Via Sequential Probabilistic Inference for High-Dimensional Prediction

THURSDAY

16:00 - 16:20

Distributed Multi-Task Learning for Sensor Network Jiyi Li, Tomohiro Arai, Yukino Baba, Hisashi Kashima, Shotaro Miwa

Learning Task Structure Via Sparsity Grouped Multitask Learning Meghana Kshirsagar, Eunho Yang, Aurelie Lozano

Ranking Based Multitask Learning of Scoring Functions Ivan Stojkovic, Mohamed Ghalwash, Zoran Obradovic


99

THURSDAY

SEPTEMBER 21

Computer Vision



16:00 - 16:20 16:20 - 16:40 16:40 - 17:00 17:00 - 17:20 17:20 - 17:40

Alternative Semantic Representations for Zero-Shot Human Action Recognition Qian Wang, Ke Chen

Early Active Learning with Pairwise Constraint for Person Re-Identification Wenhe Liu, Xiaojun Chang, Ling Chen, Yi Yang, Alexander Hauptmann

Guiding InfoGAN with Semi-Supervision Adrian Spurr, Emre Aksan, Otmar Hilliges

Scatteract: Automated Extraction of Data from Scatter Plots Mathieu Cliche, David Rosenberg, Connie Yee, Dhruv Madeka

Unsupervised Diverse Colorization Via Generative Adversarial Networks Yun Cao, Zhiming Zhou, Weinan Zhang, Yong Yu

MACEDONIAN OPERA & BALLET

Demo 2 Chair: Jesse Read & Marinka Žitnik

Monitoring Physical Activity and Mental Stress Using Wrist-Worn Device and a Smartphone Božidara Cvetković, Martin Gjoreski, Jure Šorn, Pavel Maslov, Mitja Luštrek

Lit@EVE: Explainable Recommendation Based on Wikipedia Concept Vectors Muhammad Atif Qureshi, Derek Greene

TF Boosted Trees: A Scalable TensorFlow Based Framework for Gradient Boosting Natalia Ponomareva, Soroush Radpour, Gilbert Hendry, Salem Haykal, Thomas Colthurst, Petr Mitrichev, Alexander Grushetsky

ASK-the-Expert: Active Learning Based Knowledge Discovery Using the Expert Kamalika Das, Ilya Avrekh, Bryan Matthews, Manali Sharma, Nikunj Oza

TrajViz: A Tool for Visualizing Patterns and Anomalies in Trajectory Yifeng Gao, Qingzhe Li, Xiaosheng Li, Jessica Lin, Huzefa Rangwala, Ranjeev Mittu

100


20:00


MONDAY

SESSIONS WITH ABSTRACTS

CONGRESS HALL 1

Graph-Based Predictable Feature Analysis

Lagrangian Relaxations for Multiple Network Alignment Eric Malmi, Sanjay Chawla, Aristides Gionis

We propose a principled approach for the problem of aligning multiple partially overlapping networks. The objective is to map multiple graphs into a single graph while preserving vertex and edge similarities. The problem is inspired by the task of integrating partial views of a family tree (genealogical network) into one unified network, but it also has applications, for example, in social and biological networks. Our approach, called Flan, introduces the idea of generalizing the facility location problem by adding a non-linear term to capture edge similarities and to infer the underlying entity network. The problem is solved using an alternating optimization procedure with a Lagrangian relaxation. Flan has the advantage of being able to leverage prior information on the number of entities, so that when this information is available, Flan is shown to work robustly without the need to use any ground truth data for fine-tuning method parameters. Additionally, we present three multiple-network extensions to an existing state-of-the-art pairwise alignment method called Natalie. Extensive experiments on synthetic, as well as real-world datasets on social networks and genealogical networks, attest to the effectiveness of the proposed approaches which clearly outperform a popular multiple network alignment method called IsoRankN. 11:40 - 12:00

WEDNESDAY

11:20 - 11:40

THURSDAY

We propose graph-based predictable feature analysis (GPFA), a new method for unsupervised learning of predictable features from highdimensional time series, where high predictability is understood very generically as low variance in the distribution of the next data point given the previous ones. We show how this measure of predictability can be understood in terms of graph embedding as well as how it relates to the information-theoretic measure of predictive information in special cases. We confirm the effectiveness of GPFA on different datasets, comparing it to three existing algorithms with similar objectives—namely slow feature analysis, forecastable component analysis, and predictable feature analysis—to which GPFA shows very competitive results.

TUESDAY

Björn Weghenkel, Asja Fischer, Laurenz Wiskott

Measuring and Moderating Opinion Polarization in Social Networks

The polarization of society over controversial social issues has been the subject of study in social sciences for decades (Isenberg in J Personal Soc Psychol 50(6):1141–1151, 1986, Sunstein in J Polit Philos 10(2):175–195, 2002). The widespread usage of online social networks and social media, and the tendency of people to connect and interact with like-minded individuals has only intensified the phenomenon of polarization (Bakshy et al. in Science 348(6239):1130–1132, 2015). In this paper, we consider the problem of measuring and reducing polarization of opinions in a social network. Using a standard opinion formation model (Friedkin and Johnsen in J Math Soc 15(3–4):193–206, 1990), we define the polarization index, which, given a network and the opinions of the individuals in the network, it quantifies the polarization observed in the network. Our measure captures the tendency of opinions to concentrate in network communities, creating echo-chambers. Given this numeric measure of polarization, we then consider the problem of reducing polarization in the network by convincing individuals (e.g., through education, exposure to diverse viewpoints, or incentives) to adopt a more neutral stand towards controversial issues. We formally define the ModerateInternal and ModerateExpressed problems, and we prove that both our problems are NP-hard. By exploiting the linear- algebraic characteristics of the opinion formation model we design polynomial-time algorithms for both problems. Our experiments with real-world datasets demonstrate the validity of our metric, and the efficiency and the effectiveness of our algorithms in practice.


101

FRIDAY

Antonis Matakos, Evimaria Terzi, Panayiotis Tsaparas

THURSDAY 12:00 - 12:20

SEPTEMBER 21 Micro-Review Synthesis for Multi-Entity Summarization Thanh-Son Nguyen, Hady W. Lauw, Panayiotis Tsaparas

Location-based social networks (LBSNs), exemplified by Foursquare, are fast gaining popularity. One important feature of LBSNs is microreview. Upon check-in at a particular venue, a user may leave a short review (up to 200 characters long), also known as a tip. These tips are an important source of information for others to know more about various aspects of an entity (e.g., restaurant), such as food, waiting time, or service. However, a user is often interested not in one particular entity, but rather in several entities collectively, for instance within a neighborhood or a category. In this paper, we address the problem of summarizing the tips of multiple entities in a collection, by way of synthesizing new micro-reviews that pertain to the collection, rather than to the individual entities per se. We formulate this problem in terms of first finding a representation of the collection, by identifying a number of “aspects” that link common threads across two or more entities within the collection. We express these aspects as dense subgraphs in a graph of sentences derived from the multi-entity corpora. This leads to a formulation of maximal multi-entity quasi-cliques, as well as a heuristic algorithm to find K such quasi-cliques maximizing the coverage over the multi-entity corpora. To synthesize a summary tip for each aspect, we select a small number of sentences from the corresponding quasi-clique, balancing conciseness and representativeness in terms of a facility location problem. Our approach performs well on collections of Foursquare entities based on localities and categories, producing more representative and diverse summaries than the baselines.

12:20 - 12:40

Survival Factorization for Topical Cascades on Diffusion Networks Nicola Barbieri, Giuseppe Manco, Ettore Ritacco

In this paper we propose a survival factorization framework that models information cascades by tying together social influence patterns, topical structure and temporal dynamics. This is achieved through the introduction of a latent space which encodes: (a) the relevance of a information cascade on a topic; (b) the topical authoritativeness and the susceptibility of each individual involved in the information cascade, and (c) temporal topical patterns. By exploiting the cumulative properties of the survival function and of the likelihood of the model on a given adoption log, which records the observed activation times of users and side-information for each cascade, we show that the inference phase is linear in the number of users and in the number of adoptions. The evaluation on both synthetic and real-world data shows the effectiveness of the model in detecting the interplay between topics and social influence patterns, which ultimately provides high accuracy in predicting users activation times.

Unsupervised & Semi-supervised Learning 2 11:00 - 11:20

CONGRESS HALL 2

An Exponential Family Framework For Learning to Predict Unseen Classes Vinay Verma, Wenlin Wang, Piyush Rai

We present a simple generative framework for learning to predict previously unseen classes, based on estimating class-attribute-gated classconditional distributions. We model each class-conditional distribution as an exponential family distribution and the parameters of the distribution of each seen/unseen class are defined as functions of the respective observed class attributes. These functions can be learned using only the seen class data and can be used to predict the parameters of the class-conditional distribution of each unseen class. Unlike most existing methods for zero-shot learning that represent classes as fixed embeddings in some vector space, our generative model naturally represents each class as a probability distribution. It is simple to implement and also allows leveraging additional unlabeled data from unseen classes to improve the estimates of their class-conditional distributions using transductive/semi-supervised learning. Moreover, it extends seamlessly to few-shot learning by easily updating these distributions when provided with a small number of additional labelled examples from unseen classes. Through a comprehensive set of experiments on several benchmark data sets, we demonstrate the efficacy of our framework.

102


11:20 - 11:40

An Expressive Similarity Measure for Relational Clustering Using Neighbourhood Trees

Clustering is an underspecified task: there are no universal criteria for what makes a good clustering. This is especially true for relational data, where similarity can be based on the features of individuals, the relationships between them, or a mix of both. Existing methods for relational clustering have strong and often implicit biases in this respect. In this paper, we introduce a novel dissimilarity measure for relational data. It is the first approach to incorporate a wide variety of types of similarity, including similarity of attributes, similarity of relational context, and proximity in a hypergraph. We experimentally evaluate the proposed dissimilarity measure on both clustering and classification tasks using data sets of very different types. Considering the quality of the obtained clustering, the experiments demonstrate that (a) using this dissimilarity in standard clustering methods consistently gives good results, whereas other measures work well only on data sets that match their bias; and (b) on most data sets, the novel dissimilarity outperforms even the best among the existing ones. On the classification tasks, the proposed method outperforms the competitors on the majority of data sets, often by a large margin. Moreover, we show that learning the appropriate bias in an unsupervised way is a very challenging task, and that the existing methods offer a marginal gain compared to the proposed similarity method, and can even hurt performance. Finally, we show that the asymptotic complexity of the proposed dissimilarity measure is similar to the existing stateof-the-art approaches. The results confirm that the proposed dissimilarity measure is indeed versatile enough to capture relevant information, regardless of whether that comes from the attributes of vertices, their proximity, or connectedness of vertices, even without parameter tuning.

11:40 - 12:00

MONDAY

Sebastijan Dumančić, Hendrik Blockeel

DeepCluster: A General Clustering Framework Based on Deep Learning

12:00 - 12:20

Local PurTree Subspace Spectral Clustering for Customer Transaction Data

NO SHOW

Xiaojun Chen, JianZhe Zhang, Wenya Sun, Joshua Huang, Qingyao Wu

12:20 - 12:40

THURSDAY

This paper was accepted for presentation. However, it was not presented at the conference and is thus not published in the conference proceedings.

WEDNESDAY

In this paper, we propose a general framework DeepCluster to integrate traditional clustering methods into deep learning (DL) models and adopt Alternating Direction of Multiplier Method (ADMM) to optimize it. While most existing DL based clustering techniques have separate feature learning (via DL) and clustering (with traditional clustering methods), DeepCluster simultaneously learns feature representation and does cluster assignment under the same framework. Furthermore, it is a general and flexible framework that can employ different networks and clustering methods. We demonstrate the effectiveness of DeepCluster by integrating two popular clustering methods: K-means and Gaussian Mixture Model (GMM) into deep networks. The experimental results shown that our method can achieve state-of-the-art performance on learning representation for clustering analysis.

TUESDAY

Kai Tian, Shuigeng Zhou, Jihong Guan

Multi-View Spectral Clustering on Conflicting Views

In a growing number of application domains, multiple feature representations or views are available to describe objects. Multi-view clustering tries to find similar groups of objects across these views. This task is complicated when the corresponding clusterings in each view show poor agreement (conflicting views). In such cases, traditional multi-view clustering methods will not benefit from using multi-view data. Here, we propose to overcome this problem by combining the ideas of multi-view spectral clustering with alternative clustering through kernel-based dimensionality reduction. Our method automatically determines feature transformations in each view that lead to an optimal clustering w.r.t to a new proposed objective function for conflicting views. In our experiments, our approach outperforms state-of-the-art multi-view clustering methods by more accurately detecting the ground truth clustering supported by all views.


103

FRIDAY

Xiao He, Limin Li, Damian Roqueiro, Karsten Borgwardt

THURSDAY

SEPTEMBER 21

Anomaly Detection 11:00 - 11:20

CONGRESS HALL 3

Concentration Free Outlier Detection Fabrizio Angiulli

We present a novel notion of outlier, called Concentration Free Outlier Factor (CFOF), having the peculiarity to resist concentration phenomena that affect other scores when the dimensionality of the feature space increases. Indeed we formally prove that CFOF does not concentrate in intrinsically high-dimensional spaces. Moreover, CFOF is adaptive to different local density levels and it does not require the computation of exact neighbors in order to be reliably computed. We present a very efficient technique, named fast-CFOF, for detecting outliers in very large highdimensional datasets. The technique is efficiently parallelizable, and we provide a MIMD-SIMD implementation. Experimental results witness for scalability and effectiveness of the technique and highlight that CFOF exhibits state of the art detection performances. 11:20 - 11:40

Efficient Top Rank Optimization with Gradient Boosting for Supervised Anomaly Detection Jordan Frery, Marc Sebban, Amaury Habrard, Olivier Caelen, Liyun Guelton

In this paper we address the anomaly detection problem in a supervised setting where positive examples might be very sparse. We tackle this task with a learning to rank strategy by optimizing a differentiable smoothed surrogate of the so-called Average Precision (AP). Despite its nonconvexity, we show how to use it efficiently in a stochastic gradient boosting framework. We show that using AP is much better to optimize the top rank alerts than the state of the art measures. We demonstrate on anomaly detection tasks that the interest of our method is even reinforced in highly unbalanced scenarios. 11:40 - 12:00

Robust, Deep and Inductive Anomaly Detection Raghavendra Chalapathy, Aditya Krishna Menon, Sanjay Chawla

PCA is a classical statistical technique whose simplicity and maturity has seen it find widespread use as an anomaly detection technique. However, it is limited in this regard by being sensitive to gross perturbations of the input, and by seeking a linear subspace that captures normal behaviour. The first issue has been dealt with by robust PCA, a variant of PCA that explicitly allows for some data points to be arbitrarily corrupted; however, this does not resolve the second issue, and indeed introduces the new issue that one can no longer inductively find anomalies on a test set. This paper addresses both issues in a single model, the robust autoencoder. This method learns a nonlinear subspace that captures the majority of data points, while allowing for some data to have arbitrary corruption. The model is simple to train and leverages recent advances in the optimisation of deep neural networks. Experiments on a range of real-world datasets highlight the model’s effectiveness. 12:00 - 12:20

Sentiment Informed Cyberbullying Detection in Social Media Harsh Dani, Jundong Li, Huan Liu

Cyberbullying is a phenomenon which negatively affects the individuals, the victims suffer from various mental issues, ranging from depression, loneliness, anxiety to low self-esteem. In parallel with the pervasive use of social media, cyberbullying is becoming more and more prevalent. Traditional mechanisms to fight against cyberbullying include the use of standards and guidelines, human moderators, and blacklists based on the profane words. However, these mechanisms fall short in social media and cannot scale well. Therefore, it is necessary to develop a principled learning framework to automatically detect cyberbullying behaviors. However, it is a challenging task due to short, noisy and unstructured content information and intentional obfuscation of the abusive words or phrases by social media users. Motivated by sociological and psychological findings on bullying behaviors and the correlation with emotions, we propose to leverage sentiment information to detect cyberbullying behaviors in social media by proposing a sentiment informed cyberbullying detection framework. Experimental results on two real-world, publicly available social media datasets show the superiority of the proposed framework. Further studies validate the effectiveness of leveraging sentiment information for cyberbullying detection. 12:20 - 12:40

ZooRank: Ranking Suspicious Activities in Time-Evolving Tensors Hemank Lamba, Bryan Hooi, Kijung Shin, Christos Faloutsos, Juergen Pfeffer

Most user-based websites such as social networks(Twitter,Facebook) and e-commerce websites (Amazon) have been targets of group fraud (multiple users working together for malicious purposes). How can we better rank malicious entities in such cases of group-fraud? Most of the existing work in group anomaly detection detects lock-step behavior by detecting dense blocks in matrices, and recently, in tensors. However, there is no principled way of scoring the users based on their participation in these dense blocks. In addition, existing methods do not take into account temporal features while detecting dense blocks, which are crucial to uncover bot-like behaviors. In this paper (a) we propose a systematic way of handling temporal information; (b) we give a list of axioms that any individual suspiciousness metric satisfies; (c) we propose ZOORANK, an algorithm that finds and ranks suspicious entities (users, targeted products, days, etc.) effectively in real-world datasets. Experimental results on multiple real-world datasets show that ZOORANK detected and ranked the suspicious entities with a nearly accurate precision, while outperforming the baseline approach.

104



CONGRESS HALL 4

A Novel Framework for Online Sales Burst Prediction

With the rapid growth of e-commerce, a large number of online transactions are processed every day. In this paper, we take the initiative to conduct a systematic study of the challenging prediction problems of sales bursts. Here, we propose a novel model to detect bursts, find the bursty features, namely the start time of the burst, the peak value of the burst and the o -burst value, and predict the entire burst shape. Our model analyzes the features of similar sales bursts in the same category, and applies them to generate the prediction. We argue that the framework is capable of capturing the seasonal and categorical features of sales burst. Based on the real data from JD.com, we conduct extensive experiments and discover that the proposed model makes a relative MSE improvement of 71% and 30% over LSTM and ARMA. 11:20 - 11:40

MRNet-Product2Vec: A Multi-Task Recurrent Neural Network for Product Embeddings

MONDAY

Rui Chen, Jiajun Liu

Disjoint-Support Factors and Seasonality Estimation in E-Commerce Abhay Jha

Successful inventory management in retail entails accurate demand forecasts for many weeks/months ahead. Forecasting models use seasonality: recurring pattern of sales every year, to make this forecast. In e-commerce setting, where the catalog of items is much larger than brick and mortar stores and hence includes a lot of items with short history, it is infeasible to compute seasonality for items individually. It is customary in these cases to use ideas from factor analysis and express seasonality by a few factors/basis vectors computed together for an entire assortment of related items. In this paper, we demonstrate the effectiveness of choosing vectors with disjoint support as basis for seasonality when dealing with a large number of short time-series. We give theoretical results on computation of disjoint support factors that extend the state of the art, and also discuss temporal regularization necessary to make it work on walmart e-commerce dataset. Our experiments demonstrate a marked improvement in forecast accuracy for items with short history.

12:00 - 12:20

Generalising Random Forest Parameter Optimisation to Include Stability and Cost

WEDNESDAY

11:40 - 12:00

THURSDAY

E-commerce websites such as Amazon, Alibaba, Flipkart, and Walmart sell billions of products. Machine learning (ML) algorithms involving products are often used to improve the customer experience and increase revenue, e.g., product similarity, recommendation, and price estimation. The products are required to be represented as features before training an ML algorithm. In this paper, we propose an approach called MRNetProduct2Vec for creating generic embeddings of products within an e-commerce ecosystem. We learn a dense and low-dimensional embedding where a diverse set of signals related to a product are explicitly injected into its representation. We train a Discriminative Multi-task Bidirectional Recurrent Neural Network (RNN), where the input is a product title fed through a Bidirectional RNN and at the output, product labels corresponding to fifteen different tasks are predicted. The task set includes several intrinsic characteristics about a product such as price, weight, size, color, popularity, and material. We evaluate the proposed embedding quantitatively and qualitatively. We demonstrate that they are almost as good as sparse and extremely high-dimensional TF-IDF representation in spite of having less than 3% of the TF-IDF dimension. We also use a multimodal autoencoder for comparing products from different language-regions and show preliminary yet promising qualitative results.

TUESDAY

Arijit Biswas, Mukul Bhutani, Subhajit Sanyal

Random forests are among the most popular classification and regression methods used in industrial applications. To be effective, the parameters of random forests must be carefully tuned. This is usually done by choosing values that minimize the prediction error on a held out dataset. We argue that error reduction is only one of several metrics that must be considered when optimizing random forest parameters for commercial applications. We propose a novel metric that captures the stability of random forest predictions, which we argue is key for scenarios that require successive predictions. We motivate the need for multi-criteria optimization by showing that in practical applications, simply choosing the parameters that lead to the lowest error can introduce unnecessary costs and produce predictions that are not stable across independent runs. To optimize this multi-criteria trade-off, we present a new framework that efficiently finds a principled balance between these three considerations using Bayesian optimisation. The pitfalls of optimising forest parameters purely for error reduction are demonstrated using two publicly available real world datasets. We show that our framework leads to parameter settings that are markedly different from the values discovered by error reduction metrics alone.


105

FRIDAY

Chak Hin Bryan Liu, Benjamin Chamberlain, Duncan Little, Angelo Cardoso

THURSDAY 12:20 - 12:40

SEPTEMBER 21 Session-Based Fraud Detection in Online E-Commerce Transactions Using Recurrent Neural Networks Shuhao Wang, Cancheng Liu, Xiang Gao, Hongtao Qu, Wei Xu

Transaction frauds impose serious threats onto e-commerce. We present CLUE, a novel deep-learning-based transaction fraud detection system we design and deploy at JD.com, one of the largest e-commerce platforms in China with over 220 million active users. CLUE captures detailed information on users’ click actions using neural-network based embedding, and models sequences of such clicks using the recurrent neural network. Furthermore, CLUE provides application-specific design optimizations including imbalanced learning, real-time detection, and incremental model update. Using real production data for over eight months, we show that CLUE achieves over 3x improvement over the existing fraud detection approaches.

Recommendation 14:00 - 14:20

CONGRESS HALL 1

A Regularization Method with Inference of Trust and Distrust in Recommender Systems Dimitrios Rafailidis, Fabio Crestani

In this study we investigate the recommendation problem with trust and distrust relationships to overcome the sparsity of users’ preferences, accounting for the fact that users trust the recommendations of their friends, and they do not accept the recommendations of their foes. In addition, not only users’ preferences are sparse, but also users’ social relationships. So, we first propose an inference step with multiple random walks to predict the implicit-missing trust relationships that users might have in recommender systems, while considering users’ explicit trust and distrust relationships during the inference. We introduce a regularization method and design an objective function with a social regularization term to weigh the influence of friends’ trust and foes’ distrust degrees on users’ preferences. We formulate the objective function of our regularization method as a minimization problem with respect to the users’ and items’ latent features and then we solve our recommendation problem via gradient descent. Our experiments confirm that our approach preserves relatively high recommendation accuracy in the presence of sparsity in both the users’ preferences and social relationships, significantly outperforming several state-of-the-art methods.

14:20 - 14:40

A Unified Contextual Bandit Framework for Long- and Short-Term Recommendations Maryam Tavakol, Ulf Brefeld

We present a unified contextual bandit framework for recommendation problems that is able to capture long- and short-term interests of users. The model is devised in dual space and the derivation is consequentially carried out using Fenchel-Legrende conjugates and thus leverages to a wide range of tasks and settings. We detail two instantiations for regression and classification scenarios and obtain well-known algorithms for these special cases. The resulting general and unified framework allows for quickly adapting contextual bandits to different applications at-hand. The empirical study demonstrates that the proposed long- and short-term framework outperforms both, short-term and long-term models on data. Moreover, a tweak of the combined model proves beneficial in cold start problems.

14:40 - 15:00

Perceiving the Next Choice with Comprehensive Transaction Embeddings for Online Recommendation Shoujin Wang, Liang Hu, Longbing Cao, Xiaoshui Huang

To predict customers next choice in the context of what he/she has bought in a session is interesting and critical in the transaction domain especially for online shopping. Precise prediction leads to high quality recommendations and thus high benefit. Such kind of recommendation is usually formalized as transaction-based recommender systems (TBRS). Existing TBRS either tend to recommend popular items while ignore infrequent and newly-released ones (e.g., pattern-based RS) or assume a rigid order between items within a transaction(e.g., Markov Chainbased RS) which does not satisfy real-world cases in most time. In this paper, we propose a neural network based comprehensive transaction embedding model (NTEM) which can effectively perceive the next choice in a transaction context. Specifically, we learn these comprehensive embeddings of both items and their features from relaxed ordered transactions. The relevance between items revealed by the transactions is encoded into such embeddings. With rich information embedded, such embeddings are powerful to predict the next choices given those already bought items. NTEM is a shallow wide-in-wide-out network, which is more efficient than deep networks considering large numbers of items and transactions. Experimental results on real-world datasets show that NTEM outperforms three typical TBRS models FPMC, PRME and GRU4Rec in terms of recommendation accuracy and novelty. Our implementation is available at https://github.com/shoujin88/NTEM-model

106


15:00 - 15:20

Social Regularized Von Mises-Fisher Mixture Model for Item Recommendation

Tour Recommendation for Groups Aris Anagnostopoulos, Reem Atassi, Luca Becchetti, Adriano Fazzone, Fabrizio Silvestri

Transfer & Multi-Task Learning 1 14:00 - 14:20

WEDNESDAY

Consider a group of people who are visiting a major touristic city, such as NY, Paris, or Rome. It is reasonable to assume that each member of the group has his or her own interests or preferences about places to visit, which in general may differ from those of other members. Still, people almost always want to hang out together and so the following question naturally arises: What is the best tour that the group could perform together in the city? This problem underpins several challenges, ranging from understanding people’s expected attitudes towards potential points of interest, to modeling and providing good and viable solutions. Formulating this problem is challenging because of multiple competing objectives. For example, making the entire group as happy as possible in general conflicts with the objective that no member becomes disappointed. In this paper, we address the algorithmic implications of the above problem, by providing various formulations that take into account the overall group as well as the individual satisfaction and the length of the tour. We then study the computational complexity of these formulations, we provide effective and efficient practical algorithms, and, finally, we evaluate them on datasets constructed from real city data.

CONGRESS HALL 2

Lifelong Machine Learning with Gaussian Processes Christopher Clingerman, Eric Eaton

Recent developments in lifelong machine learning have demonstrated that it is possible to learn multiple tasks consecutively, transferring knowledge between those tasks to accelerate learning and improve performance. However, these methods are limited to using linear parametric base learners, substantially restricting the predictive power of the resulting models. We present a lifelong learning algorithm that can support non-parametric models, focusing on Gaussian processes. To enable efficient online transfer between Gaussian process models, our approach assumes a factorized formulation of the covariance functions, and incrementally learns a shared sparse basis for the models’ parameterizations. We show that this lifelong learning approach is highly computationally efficient, and outperforms existing methods on a variety of data sets.

Personalized Tag Recommendation for Images Using Deep Transfer Learning

FRIDAY

14:20 - 14:40

Hanh T.H. Nguyen, Martin Wistuba, Lars Schmidt-Thieme Image tag recommendation in social media systems provides the users with personalized tag suggestions which facilitate the users’ tagging task and enable automatic organization and many image retrieval tasks. Factorization models are a widely used approach for personalized tag recommendation and achieve good results. These methods rely on the user’s tagging preferences only and ignore the contents of the image. However, it is obvious that especially the contents of the image, such as the objects appearing in the image, colors, shapes or other visual aspects, strongly influence the user’s tagging decisions. We present a personalized content-aware image tag recommendation approach that combines both historical tagging information and image-based features in a factorization model. Employing transfer learning, we apply state of the art deep learning image classification and object detection techniques to extract powerful features from the images. Both, image information and tagging history, are fed to an adaptive factorization model to recommend tags. Empirically, we can demonstrate that the visual and object-based features can improve the performance up to 1.5 percent over the state of the art.


THURSDAY

15:20 - 15:40

TUESDAY

Collaborative filtering (CF) is a widely used technique to guide the users of web applications towards items that might interest them. CF approaches are severely challenged by the characteristics of user-item preference matrices, which are often high dimensional and extremely sparse. Recently, several works have shown that incorporating information from social networks—such as friendship and trust relationships—into traditional CF alleviates the sparsity related issues and yields a better recommendation quality, in most cases. More interestingly, even with comparable performances, social-based CF is more beneficial than traditional CF; the former makes it possible to provide recommendations for cold start users. In this paper, we propose a novel model that leverages information from social networks to improve recommendations. While existing social CF models are based on popular modelling assumptions such as Gaussian or Multinomial, our model builds on the von Mises– Fisher assumption which turns out to be more adequate, than the aforementioned assumptions, for high dimensional sparse data. Setting the estimate of the model parameters under the maximum likelihood approach, we derive a scalable learning algorithm for analyzing data with our model. Empirical results on several real-world datasets provide strong support for the advantages of the proposed model.

MONDAY

Aghiles Salah, Mohamed Nadif

107

THURSDAY 14:40 - 15:00

SEPTEMBER 21 Theoretical Analysis of Domain Adaptation with Optimal Transport Ievgen Redko, Amaury Habrard, Marc Sebban

Domain adaptation (DA) is an important and emerging field of machine learning that tackles the problem occurring when the distributions of training (source domain) and test (target domain) data are similar but different. Current theoretical results show that the efficiency of DA algorithms depends on their capacity of minimizing the divergence between source and target probability distributions. In this paper, we provide a theoretical study on the advantages that concepts borrowed from optimal transportation theory can bring to DA. In particular, we show that the Wasserstein metric can be used as a divergence measure between distributions to obtain generalization guarantees for three different learning settings: (i) classic DA with unsupervised target data (ii) DA combining source and target labeled data, (iii) multiple source DA. Based on the obtained results, we provide some insights showing when this analysis can be tighter than other existing frameworks. We think that these results open the door to novel ideas and directions for DA. 15:00 - 15:20

TSP: Learning Task-Specific Pivots for Unsupervised Domain Adaptation Xia Cui, Frans Coenen, Danushka Bollegala

Unsupervised Domain Adaptation (UDA) considers the problem of adapting a classifier trained using labelled training instances from a source domain to a different target domain, without having access to any labelled training instances from the target domain. Projection-based methods, where the source and target domain instances are first projected onto a common feature space on which a classifier can be trained and applied have produced state-of-the-art results for UDA. However, a critical pre-processing step required by these methods is the selection of a set of common features (aka. pivots), this is typically done using heuristic approaches,applied prior to performing domain adaptation. In contrast to the one of heuristics, we propose a method for learning Task-Specific Pivots (TSPs) in a systematic manner by considering both the labelled and unlabelled data available from both domains. We evaluate TSPs against pivots selected using alternatives in two cross-domain sentiment classification applications. Our experimental results show that the proposed TSPs significantly outperform previously proposed selection strategies in both tasks. Moreover, when applied in a cross-domain sentiment classification task, TSP captures many sentiment-bearing pivots. 15:20 - 15:40

Varying-Coefficient Models for Geospatial Transfer Learning Matthias Bussas, Christoph Sawade, Nicolas Kühn, Tobias Scheffer, Niels Landwehr

We study prediction problems in which the conditional distribution of the output given the input varies as a function of task variables which, in our applications, represent space and time. In varying-coefficient models, the coefficients of this conditional are allowed to change smoothly in space and time; the strength of the correlations between neighboring points is determined by the data. This is achieved by placing a Gaussian process (GP) prior on the coefficients. Bayesian inference in varying-coefficient models is generally intractable. We show that with an isotropic GP prior, inference in varying-coefficient models resolves to standard inference for a GP that can be solved efficiently. MAP inference in this model resolves to multitask learning using task and instance kernels. We clarify the relationship between varying-coefficient models and the hierarchical Bayesian multitask model and show that inference for hierarchical Bayesian multitask models can be carried out efficiently using graph-Laplacian kernels. We explore the model empirically for the problems of predicting rent and real-estate prices, and predicting the ground motion during seismic events. We find that varying-coefficient models with GP priors excel at predicting rents and real-estate prices. The ground-motion model predicts seismic hazards in the State of California more accurately than the previous state of the art.

Time Series & Streams 2 14:00 - 14:20

CONGRESS HALL 3

A Multiscale Bezier-Representation for Time Series That Supports Elastic Matching Frank Hoeppner, Tobias Sobek

Common time series similarity measures that operate on the full series (like Euclidean distance or Dynamic Time Warping DTW) do not correspond well to the visual similarity as perceived by a human. Based on the interval tree of scale, we propose a multiscale Bezier representation of time series, that supports the definition of elastic similarity measures that overcome this problem. With this representation the matching can be performed efficiently as similarity is measured segment-wise rather than element-wise (as with DTW). We effectively restrict the set of warping paths considered by DTW and the results do not only correspond better to the analysts intuition but improve the accuracy in the standard 1NN time series classification.

108


14:20 - 14:40

Cost Sensitive Time-Series Classification

This paper investigates the problem of highly imbalanced time-series classification using shapelets. The current state-of-the-art approach learns generalized shapelets along with weights of the classification hyperplane via a classical cost-insensitive loss function. Cost-insensitive loss functions tend to treat different misclassification errors equally and thus, models are usually biased towards examples of majority class. The rare class (which will be referred to as positive class) is usually the important class and a false negative is always costlier than a false positive. Traditional cost-insensitive loss functions fail to differentiate between these two types of misclassification errors. In this paper, the generalized shapelets learning framework is extended and a cost-sensitive learning model is proposed. Instead of incorporating the misclassification cost as a prior knowledge, as was done by other published methods, we formulate a constrained optimization problem to learn the unknown misclassification costs along with the shapelets and their weights. First, we demonstrated the effectiveness of the proposed method on two case studies, with the objective to detect true alarms from life threatening cardiac arrhythmia dataset from Physionets MIMIC II repository. The results show improved true alarm detection rates over the current state-of-the-art method. Next, we compared to the state-of-the-art learning shapelet method on 16 balanced dataset from UCR time-series repository. The results show evidence that the proposed method outperforms the state-ofthe-art method. Finally, we performed extensive experiments across additional 18 imbalanced time-series datasets. The results provide evidence that the proposed method achieves comparable results with state-of-the-art sampling/non-sampling based approaches for highly imbalanced time-series datasets. However, our method is highly interpretable which is advantage over many other methods.

15:00 - 15:20

Forecasting and Granger Modelling with Non-linear Dynamical Dependencies Magda Gregorova, Alexandros Kalousis, Stephan Marchand-Maillet

Traditional linear methods for forecasting multivariate time series are not able to satisfactorily model the non-linear dependencies that may exist in non-Gaussian series. We build on the theory of learning vector-valued functions in the reproducing kernel Hilbert space and develop a method for learning prediction functions that accommodate such non-linearities. The method not only learns the predictive function but also the matrix-valued kernel underlying the function search space directly from the data. Our approach is based on learning multiple matrix-valued kernels, each of those composed of a set of input kernels and a set of output kernels learned in the cone of positive semi-definite matrices. In addition to superior predictive performance in the presence of strong non-linearities, our method also recovers the hidden dynamic relationships between the series and thus is a new alternative to existing graphical Granger techniques.

15:20 - 15:40

WEDNESDAY

In the time-series classification context, the majority of the most accurate core methods are based on the Bag-of-Words framework, in which sets of local features are first extracted from time series. A dictionary of words is then learned and each time series is finally represented by a histogram of word occurrences. This representation induces a loss of information due to the quantization of features into words as all the time series are represented using the same fixed dictionary. In order to overcome this issue, we introduce in this paper a kernel operating directly on sets of features. Then, we extend it to a time-compliant kernel that allows one to take into account the temporal information. We apply this kernel in the time series classification context. Proposed kernel has a quadratic complexity with the size of input feature sets, which is problematic when dealing with long time series. However, we show that kernel approximation techniques can be used to define a good trade-off between accuracy and complexity. We experimentally demonstrate that the proposed kernel can significantly improve the performance of time series classification algorithms based on Bag-of-Words.

THURSDAY

Romain Tavenard, Simon Malinowski, Laetitia Chapel, Adeline Bailly, Heider Sanchez, Benjamin Bustos

TUESDAY

Efficient Temporal Kernels Between Feature Sets for Time Series Classification

UAPD: Predicting Urban Anomalies from Spatial-Temporal Data Xian Wu, Yuxiao Dong, Chao Huang, Jian Xu, Dong Wang, Nitesh Chawla

Urban anomalies such as noise complaints and potholes in streets negatively affect our everyday life and need to be addressed in a timely manner. While significant effort has been made in detecting whether there exist anomalies in current urban data, the prediction of future urban anomalies is much less well explored and understood. In this work, we formalize the anomaly prediction problem in urban environments, such that those can be addressed in a more timely and efficient manner. We develop the Urban Anomaly PreDiction (UAPD) framework, which addresses a number of challenges, including the dynamics of different categories of anomalies, as well as varied spatial-temporal distributions. Given the urban anomaly data to date, UAPD first detects its change point in the temporal dimension, then a tensor decomposition based model decouples the interrelations among the spatial, temporal, and categorical dimensions of the urban data to detect the potential anomalies. Subsequently, based on the latent tensor of each dimension, we apply the autoregression method to predict which category of anomalies will happen at each region in the future. We perform extensive experiments to evaluate the proposed UAPD framework in two urban environments, namely New York City and Pittsburgh. Experimental results demonstrate that UAPD outperforms alternative baselines across various settings, including different region and time-frame scales, as well as diverse categories of anomalies.


109

FRIDAY

14:40 - 15:00

MONDAY

Shoumik Roychoudhury, Mohamed Ghalwash, Zoran Obradovic

THURSDAY

SEPTEMBER 21


CONGRESS HALL 4

Predicting Self-Reported Customer Satisfaction of Interactions with a Corporate Call Center Joseph Bockhorst, Shi Yu, Luisa Polania Cabrera, Glenn Fung Moo

Timely identification of dissatisfied customers would provide corporations and other customer serving enterprises the opportunity to take meaningful interventions. This work describes a fully operational system we have developed at a large US insurance company for predicting customer satisfaction following all incoming phone calls at our call center. To capture call relevant information, we integrate signals from multiple heterogeneous data sources including: speech-to-text transcriptions of calls, call metadata (duration, waiting time, etc.), customer profiles and insurance policy information. Because of its ordinal, subjective, and often highly-skewed nature, self-reported survey scores presents several modeling challenges. To deal with these issues we introduce a novel modeling workflow: First, a ranking model is trained on the customer call data fusion. Then, a convolutional fitting function is optimized to map the ranking scores to actual survey satisfaction scores. This approach produces more accurate predictions than standard regression and classification approaches that directly fit the survey scores with call data, and can be easily generalized to other customer satisfaction prediction problems. Source code and data are available at https://github.com/cyberyu/ ecml2017. 14:20 - 14:40

Have It Both Ways - from A/B Testing to A&B Testing with Exceptional Model Mining Wouter Duivesteijn, Tara Farzami, Thijs Putman, Evertjan Peer, Hilde J.P. Weerts, Jasper Adegeest, Gerson Foks, Mykola Pechenizkiy

In traditional A/B testing, we have two variants of the same product, a pool of test subjects, and a measure of success. In a randomized experiment, each test subject is presented with one of the two variants, and the measure of success is aggregated per variant. The variant of the product associated with the most success is retained, while the other variant is discarded. This, however, presumes that the company producing the products only has enough capacity to maintain one of the two product variants. If more capacity is available, then advanced data science techniques can extract more profit for the company from the A/B testing results. Exceptional Model Mining is one such advanced data science technique, which specializes in identifying subgroups that behave differently from the overall population. Using the association model class for EMM, we can find subpopulations that prefer variant A where the general population prefers variant B, and vice versa. This data science technique is applied on data from StudyPortals, a global study choice platform that ran an A/B test on the design of aspects of their website.

14:40 - 15:00

Structural Semantic Models for Automatic Analysis of Urban Areas Gianni Barlacchi, Alberto Rossi, Bruno Lepri, Alessandro Moschitti

The growing availability of data from cities (e.g., traffic flow, human mobility and geographical data) open new opportunities for predicting and thus optimizing human activities. For example, the automatic analysis of land use enables the possibility of better administrating a city in terms of resources and provided services. However, such analysis requires specific information, which is often not available for privacy concerns. In this paper, we propose a novel machine learning representation based on the available public information to classify the most predominant land use of an urban area, which is a very common task in urban computing. In particular, in addition to standard feature vectors, we encode geo-social data from Location-Based Social Networks (LBSNs) into a conceptual tree structure that we call Geo-Tree. Then, we use such representation in kernel machines, which can thus perform accurate classification exploiting hierarchical substructure of concepts as features. Our extensive comparative study on the areas of New York and its boroughs shows that Tree Kernels applied to Geo-Trees are very effective improving the state of the art up to 18% in Macro-F1.

15:00 - 15:20

Using Machine Learning for Labour Market Intelligence Roberto Boselli, Mirko Cesarini, Fabio Mercorio, Mario Mezzanzanica

The rapid growth of Web usage for advertising job positions provides a great opportunity for real-time labour market monitoring. This is the aim of Labour Market Intelligence (LMI), a field that is becoming increasingly relevant to EU Labour Market policies design and evaluation. The analysis of Web job vacancies, indeed, represents a competitive advantage to labour market stakeholders with respect to classical survey-based analyses, as it allows for reducing the time-to-market of the analysis by moving towards a fact-based decision making model. In this paper, we present our approach for automatically classifying million Web job vacancies on a standard taxonomy of occupations. We show how this problem has been expressed in terms of text classification via machine learning. Then, we provide details about the classification pipelines we evaluated and implemented, along with the outcomes of the validation activities. Finally, we discuss how machine learning contributed to the LMI needs of the European Organisation that supported the project.

110


15:20 - 15:40

SINAS: Suspect Investigation Using Offenders’ Activity Space Mohammad Tayebi, Uwe Glässer, Patricia Brantingham, Hamed Yaghoubi Shahir

Knowledge Elicitation via Sequential Probabilistic Inference for HighDimensional Prediction Pedram Daee, Tomi Peltola, Marta Soare, Samuel Kaski

Prediction in a small-sized sample with a large number of covariates, the “small n, large p” problem, is challenging. This setting is encountered in multiple applications, such as in precision medicine, where obtaining additional data can be extremely costly or even impossible, and extensive research effort has recently been dedicated to finding principled solutions for accurate prediction. However, a valuable source of additional information, domain experts, has not yet been efficiently exploited. We formulate knowledge elicitation generally as a probabilistic inference process, where expert knowledge is sequentially queried to improve predictions. In the specific case of sparse linear regression, where we assume the expert has knowledge about the relevance of the covariates, or of values of the regression coefficients, we propose an algorithm and computational approximation for fast and efficient interaction, which sequentially identifies the most informative features on which to query expert knowledge. Evaluations of the proposed method in experiments with simulated and real users show improved prediction accuracy already with a small effort from the expert. 16:20 - 16:40

Labeled DBN Learning with Community Structure Knowledge

TUESDAY

16:00 - 16:20

CONGRESS HALL 1

WEDNESDAY


MONDAY

Suspect investigation as a critical function of policing determines the truth about how a crime occurred, as far as it can be found. Understanding of the environmental elements in the causes of a crime incidence inevitably improves the suspect investigation process. Crime pattern theory concludes that offenders, rather than venture into unknown territories, frequently commit opportunistic and serial violent crimes by taking advantage of opportunities they encounter in places they are most familiar with as part of their activity space. In this paper, we present a suspect investigation method, called SINAS, which learns the activity space of offenders using an extended version of the random walk method based on crime pattern theory, and then recommends the top-K potential suspects for a committed crime. Our experiments on a large real-world crime dataset show that SINAS outperforms the baseline suspect investigation methods we used for the experimental evaluation.

Learning interactions between dynamical processes is a widespread but difficult problem in ecological or human sciences. Unlike in other domains (bioinformatics, for example), data is often scarce, but expert knowledge is available. We consider the case where knowledge is about a limited number of interactions that drive the processes dynamics, and on a community structure in the interaction network. We propose an original framework, based on Dynamic Bayesian Networks with labeled-edge structure and parsimonious parameterization, and a Stochastic Block Model prior, to integrate this knowledge. Then we propose a restoration-estimation algorithm, based on 0-1 Linear Programing, that improves network learning when these two types of expert knowledge are available. The approach is illustrated on a problem of ecological interaction network learning.

Online Sparse Collapsed Hybrid Variational-Gibbs Algorithm for Hierarchical Dirichlet Process Topic Models Sophie Burkhardt, Stefan Kramer

Topic models for text analysis are most commonly trained using either Gibbs sampling or variational Bayes. Recently, hybrid variational-Gibbs algorithms have been found to combine the best of both worlds. Variational algorithms are fast to converge and more efficient for inference on new documents. Gibbs sampling enables sparse updates since each token is only associated with one topic instead of a distribution over all topics. Additionally, Gibbs sampling is unbiased. Although Gibbs sampling takes longer to converge, it is guaranteed to arrive at the true posterior after infinitely many iterations. By combining the two methods it is possible to reduce the bias of variational methods while simultaneously speeding up variational updates. This idea has previously been applied to standard latent Dirichlet allocation (LDA). We propose a new sampling method that enables the application of the idea to the nonparametric version of LDA, hierarchical Dirichlet process topic models. Our fast sampling method leads to a significant speedup of variational updates as compared to other sampling methods. Experiments show that training of our topic model converges to a better loglikelihood than previously existing variational methods and converges faster than Gibbs sampling in the batch setting.


111

FRIDAY

16:40 - 17:00

THURSDAY

Etienne Auclair, Nathalie Peyrard, Régis Sabbadin

THURSDAY 17:00 - 17:20

SEPTEMBER 21 Partial Device Fingerprints Michael Ciere, Carlos Ganan, Michel van Eeten

In computing, remote devices may be identified by means of device fingerprinting, which works by collecting a myriad of client-side attributes such as the device’s browser and operating system version, installed plugins, screen resolution, hardware artifacts, Wi-Fi settings, and anything else available to the server, and then merging these attributes into uniquely identifying fingerprints. This technique is used in practice to present personalized content to repeat website visitors, detect fraudulent users, and stop masquerading attacks on local networks. However, device fingerprints are seldom uniquely identifying. They are better viewed as partial device fingerprints, which do have some discriminatory power but not enough to uniquely identify users. How can we infer from partial fingerprints whether different observations belong to the same device? We present a mathematical formulation of this problem that enables probabilistic inference of the correspondence of observations. We set out to estimate a correspondence probability for every pair of observations that reflects the plausibility that they are made by the same user. By extending probabilistic data association techniques previously used in object tracking, traffic surveillance and citation matching, we develop a general-purpose probabilistic method for estimating correspondence probabilities with partial fingerprints. Our approach exploits the natural variation in fingerprints and allows for use of situation-specific knowledge through the specification of a generative probability model. Experiments with a real-world dataset show that our approach gives calibrated correspondence probabilities. Moreover, we demonstrate that improved results can be obtained by combining device fingerprints with behavioral models.

17:20 - 17:40

Vine Copulas for Mixed Data : Multi-View Clustering for Mixed Data Beyond Meta-Gaussian Dependencies Lavanya Sita Tekumalla, Vaibhav Rajan, Chiranjib Bhattacharyya

Copulas enable flexible parameterization of multivariate distributions in terms of constituent marginals and dependence families. Vine copulas, hierarchical collections of bivariate copulas, can model a wide variety of dependencies in multivariate data including asymmetric and tail dependencies which the more widely used Gaussian copulas, used in Meta-Gaussian distributions, cannot. However, current inference algorithms for vines cannot fit data with mixed—a combination of continuous, binary and ordinal—features that are common in many domains. We design a new inference algorithm to fit vines on mixed data thereby extending their use to several applications. We illustrate our algorithm by developing a dependency-seeking multi-view clustering model based on Dirichlet Process mixture of vines that generalizes previous models to arbitrary dependencies as well as to mixed marginals. Empirical results on synthetic and real datasets demonstrate the performance on clustering single-view and multi-view data with asymmetric and tail dependencies and with mixed marginals.

Matrix & Tensor Factorization 16:00 - 16:20

CONGRESS HALL 2

C-SALT: Mining Class-Specific ALTerations in Boolean Matrix Factorization Sibylle Hess, Katharina Morik

Given labeled data represented by a binary matrix, we consider the task to derive a Boolean matrix factorization which identifies commonalities and specifications among the classes. While existing works focus on rank-one factorizations which are either specific or common to the classes, we derive class-specific alterations from common factorizations as well. Therewith, we broaden the applicability of our new method to datasets whose class-dependencies have a more complex structure. On the basis of synthetic and real-world datasets, we show on the one hand that our method is able to filter structure which corresponds to our model assumption, and on the other hand that our model assumption is justified in real-world application. Our method is parameter-free.

16:20 - 16:40

Comparative Study of Inference Methods for Bayesian Nonnegative Matrix Factorisation Thomas Brouwer, Jes Frellsen, Pietro Lio

In this paper, we study the trade-offs of different inference approaches for Bayesian matrix factorisation methods, which are commonly used for predicting missing values, and for finding patterns in the data. In particular, we consider Bayesian nonnegative variants of matrix factorisation and tri-factorisation, and compare non-probabilistic inference, Gibbs sampling, variational Bayesian inference, and a maximum-a-posteriori approach. The variational approach is new for the Bayesian nonnegative models. We compare their convergence, and robustness to noise and sparsity of the data, on both synthetic and real-world datasets. Furthermore, we extend the models with the Bayesian automatic relevance determination prior, allowing the models to perform automatic model selection, and demonstrate its efficiency.

112


16:40 - 17:00

Content-Based Social Recommendation with Poisson Matrix Factorization

Qiquan Shi, Yiu-ming Cheung, Qibin Zhao Extracting features from incomplete tensors is a challenging task which is not well explored. Due to the data with missing entries, existing feature extraction methods are not applicable. Although tensor completion techniques can estimate the missing entries well, they focus on data recovery and do not consider the relationships among tensor samples for effective feature extraction. To solve this problem of feature extraction for incomplete data, we propose an unsupervised method, TDVM, which incorporates low-rank Tucker Decomposition with feature Variance Maximization in a unified framework. Based on Tucker decomposition, we impose nuclear norm regularization on the core tensors while minimizing reconstruction errors, and meanwhile maximize the variance of core tensors (i.e., extracted features). Here, the relationships among tensor samples are explored via variance maximization while estimating the missing entries. We thus can simultaneously obtain lower-dimensional core tensors and informative features directly from observed entries. The alternating direction method of multipliers approach is utilized to solve the optimization objective. We evaluate the features extracted from two real data with different missing entries for face recognition tasks. Experimental results illustrate the superior performance of our method with a significant improvement over the state-of-the-art methods.

17:20 - 17:40

Structurally Regularized Non-Negative Tensor Factorization for Spatio-temporal Pattern Discoveries Koh Takeuchi, Yoshinobu Kawahara, Tomoharu Iwata

Understanding spatio-temporal activities in a city is a typical problem of spatio-temporal data analysis. For this analysis, tensor factorization methods have been widely applied for extracting a few essential patterns into latent factors. Non-negative Tensor Factorization (NTF) is popular because of its capability of learning interpretable factors from non-negative data, simple computation procedures, and dealing with missing observation. However, since existing NTF methods are not fully aware of spatial and temporal dependencies, they often fall short of learning latent factors where a large portion of missing observation exist in data. In this paper, we present a novel NTF method for extracting smooth and flat latent factors by leveraging various kinds of spatial and temporal structures. Our method incorporates a unified structured regularizer into NTF that can represent various kinds of auxiliary information, such as an order of timestamps, a daily and weekly periodicity, distances between sensor locations, and areas of locations. For the estimation of the factors for our model, we present a simple and efficient optimization procedure based on the alternating direction method of multipliers. In missing value interpolation experiments of traffic flow data and bike-sharing system data, we demonstrate that our proposed method improved interpolation performances from existing NTF, especially when a large portion of missing values exists.

16:00 - 16:20

CONGRESS HALL 3

FRIDAY


TUESDAY

Feature Extraction for Incomplete Data via Low-Rank Tucker Decomposition

WEDNESDAY

17:00 - 17:20

THURSDAY

We introduce Poisson Matrix Factorization with Content and Social trust information (PoissonMF-CS), a latent variable probabilistic model for recommender systems with the objective of jointly modeling social trust, item content and user’s preference using Poisson matrix factorization framework. This probabilistic model is equivalent to collectively factorizing a non-negative user--item interaction matrix and a non-negative item-content matrix. The user--item matrix consists of sparse implicit (or explicit) interactions counts between user and item, and the item--content matrix consists of words or tags counts per item. The model imposes additional constraints given by the social ties between users, and the homophily effect on social networks -- the tendency of people with similar preferences to be socially connected. Using this model we can account for and fine-tune the weight of content-based and social-based factors in the user preference. We develop approximate variational inference algorithm and perform experiments comparing PoissonMF-CS with competing models. The experimental evaluation indicates that PoissonMF-CS achieves superior predictive performance on held-out data for the top-M recommendations task. Also, we observe that PoissonMF-CS generates compact latent representations when compared with alternative models while maintaining superior predictive performance.

MONDAY

Eliezer De Souza da Silva, Helge Langseth, Heri Ramampiaro

A Novel Rating Pattern Transfer Model for Improving Non-Overlapping Cross-Domain Collaborative Filtering Yizhou Zang, Xiaohua Hu

Cross-Domain Collaborative Filtering (CDCF) has attracted various research works in recent years. However, an important problem setting, i.e., “users and items in source and target domains are totally different”, has not received much attention yet. We coin this problem as NonOverlapping Cross-Domain Collaborative Filtering (NOCDCF). In order to solve this challenging CDCF task, we propose a novel 3-step rating pattern transfer model, i.e. low-rank knowledge transfer via factorization machines (LKT-FM). Our solution is able to mine high quality knowledge from large and sparse source matrices, and to integrate the knowledge without losing much information contained in the target matrix via exploiting Factorization Machine (FM). Extensive experiments on real world datasets show that the proposed LKT-FM model outperforms the state-of-the-art CDCF solutions.


113

THURSDAY 16:20 - 16:40

SEPTEMBER 21 Distributed Multi-Task Learning for Sensor Network Jiyi Li, Tomohiro Arai, Yukino Baba, Hisashi Kashima, Shotaro Miwa

A sensor in a sensor network is expected to be able to make prediction or decision utilizing the models learned from the data observed on this sensor. However, in the early stage of using a sensor, there may be not a lot of data available to train the model for this sensor. A solution is to leverage the observation data from other sensors which have similar conditions and models with the given sensor. We thus propose a novel distributed multi-task learning approach which incorporates neighborhood relations among sensors to learn multiple models simultaneously in which each sensor corresponds to one task. It may be not cheap for each sensor to transfer the observation data from other sensors; broadcasting the observation data of a sensor in the entire network is not satisfied for the reason of privacy protection; each sensor is expected to make real-time prediction independently from neighbor sensors. Therefore, this approach shares the model parameters as regularization terms in the objective function by assuming that neighbor sensors have similar model parameters. We conduct the experiments on two real datasets by predicting the temperature with the regression. They verify that our approach is effective, especially when the bias of an independent model which does not utilize the data from other sensors is high such as when there is not plenty of training data available. 16:40 - 17:00

Learning Task structure via sparsity grouped multitask learning Meghana Kshirsagar, Eunho Yang, Aurelie Lozano

Sparse mapping has been a key methodology in many high-dimensional scientific problems. When multiple tasks share the set of relevant features, learning them jointly in a group drastically improves the quality of relevant feature selection. However, in practice this technique is used limitedly since such grouping information is usually hidden. In this paper, our goal is to recover the group structure on the sparsity patterns and leverage that information in the sparse learning. Toward this, we formulate a joint optimization problem in the task parameter and the group membership, by constructing an appropriate regularizer to encourage sparse learning as well as correct recovery of task groups. We further demonstrate that our proposed method recovers groups and the sparsity patterns in the task parameters accurately by extensive experiments.

17:00 - 17:20

Ranking Based Multitask Learning of Scoring Functions Ivan Stojkovic, Mohamed Ghalwash, Zoran Obradovic

Scoring functions are an important tool for quantifying properties of interest in many domains; for example, in healthcare, a disease severity scores are used to diagnose the patient’s condition and to decide its further treatment. Scoring functions might be obtained based on the domain knowledge or learned from data by using classification, regression or ranking techniques - depending on the type of supervised information. Although learning scoring functions from collected data is beneficial, it can be challenging when limited data are available. Therefore, learning multiple distinct, but related, scoring functions together can increase their quality as shared regularities may be easier to identify. We propose a multitask formulation for ranking-based learning of scoring functions, where the model is trained from pairwise comparisons. The approach uses mixed-norm regularization to impose structural regularities among the tasks. The proposed regularized objective function is convex; therefore, we developed an optimization approach based on alternating minimization and proximal gradient algorithms to solve the problem. The increased predictive accuracy of the presented approach, in comparison to several baselines, is demonstrated on synthetic data and two different realworld applications; predicting exam scores and predicting tolerance to infections score.

Computer Vision 16:00 - 16:20

CONGRESS HALL 4

Alternative Semantic Representations for Zero-Shot Human Action Recognition Qian Wang, Ke Chen

A proper semantic representation for encoding side information is key to the success of zero-shot learning. In this paper, we explore two alternative semantic representations especially for zero-shot human action recognition: textual descriptions of human actions and deep features extracted from still images relevant to human actions. Such side information are accessible on Web with little cost, which paves a new way in gaining side information for large-scale zero-shot human action recognition. We investigate different encoding methods to generate semantic representations for human actions from such side information. Based on our zero-shot visual recognition method, we conducted experiments on UCF101 and HMDB51 to evaluate two proposed semantic representations . The results suggest that our proposed text- and image-based semantic representations outperform traditional attributes and word vectors considerably for zero-shot human action recognition. In particular, the image-based semantic representations yield the favourable performance even though the representation is extracted from a small number of images per class.

114


16:20 - 16:40

Early Active Learning with Pairwise Constraint for Person ReIdentification

Guiding InfoGAN with Semi-Supervision Adrian Spurr, Emre Aksan, Otmar Hilliges

In this paper we propose a new semi-supervised GAN architecture (ss-InfoGAN) for image synthesis that leverages information from few labels (as little as 0.22%, max. 10% of the dataset) to learn semantically meaningful and controllable data representations where latent variables correspond to label categories. The architecture builds on Information Maximizing Generative Adversarial Networks (InfoGAN) and is shown to learn both continuous and categorical codes and achieves higher quality of synthetic samples compared to fully unsupervised settings. Furthermore, we show that using small amounts of labeled data speeds-up training convergence. The architecture maintains the ability to disentangle latent variables for which no labels are available. Finally, we contribute an information-theoretic reasoning on how introducing semisupervision increases mutual information between synthetic and real data.

Mathieu Cliche, David Rosenberg, Connie Yee, Dhruv Madeka Charts are an excellent way to convey patterns and trends in data, but they do not facilitate further modeling of the data or close inspection of individual data points. We present a fully automated system for extracting the numerical values of data points from images of scatter plots. We use deep learning techniques to identify the key components of the chart, and optical character recognition together with robust regression to map from pixels to the coordinate system of the chart. We focus on scatter plots with linear scales, which already have several interesting challenges. Previous work has done fully automatic extraction for other types of charts, but to our knowledge this is the first approach that is fully automatic for scatter plots. Our method performs well, achieving successful data extraction on 89% of the plots in our test set.

17:20 - 17:40

Unsupervised Diverse Colorization via Generative Adversarial Networks Yun Cao, Zhiming Zhou, Weinan Zhang, Yong Yu

Colorization of grayscale images is a hot topic in computer vision. Previous research mainly focuses on producing a color image to recover the original one in a supervised learning fashion. However, since many colors share the same gray value, an input grayscale image could be diversely colorized while maintaining its reality. In this paper, we design a novel solution for unsupervised diverse colorization. Specifically, we leverage conditional generative adversarial networks to model the distribution of real-world item colors, in which we develop a fully convolutional generator with multi-layer noise to enhance diversity, with multi-layer condition concatenation to maintain reality, and with stride 1 to keep spatial information. With such a novel network architecture, the model yields highly competitive performance on the open LSUN bedroom dataset. The Turing test on 80 humans further indicates our generated color schemes are highly convincible.


115

WEDNESDAY

Scatteract: Automated Extraction of data from scatter plots

THURSDAY

17:00 - 17:20

FRIDAY

16:40 - 17:00

TUESDAY

Research on person re-identification (re-id) has attached much attention in the machine learning field in recent years. With sufficient labeled training data, supervised re-id algorithm can obtain promising performance. However, producing labeled data for training supervised re-id models is an extremely challenging and time-consuming task because it requires every pair of images across no-overlapping camera views to be labeled. Moreover, in the early stage of experiments, when labor resources are limited, only a small number of data can be labeled. Thus, it is essential to design an effective algorithm to select the most representative samples. This is referred as early active learning or early stage experimental design problem. The pairwise relationship plays a vital role in the re-id problem, but most of the existing early active learning algorithms fail to consider this relationship. To overcome this limitation, we propose a novel and efficient early active learning algorithm with a pairwise constraint for person re-identification in this paper. By introducing the pairwise constraint, the closeness of similar representations of instances is enforced in active learning. This benefits the performance of active learning for re-id. Extensive experimental results on four benchmark datasets confirm the superiority of the proposed algorithm.

MONDAY

Wenhe Liu, xiaojun Chang, Ling Chen, Yi Yang, Alexander Hauptmann

THURSDAY

SEPTEMBER 21

Demo 2

Spotlight at MOB Monitoring Physical Activity and Mental Stress Using Wrist-worn Device and a Smartphone Božidara Cvetković, Martin Gjoreski, Jure Šorn, Pavel Maslov, Mitja Luštrek

The paper presents a smartphone application for monitoring physical activity and mental stress. The application utilizes sensor data from a wristband and/or a smartphone, which can be worn in various pockets or in a bag in any orientation. The presence and location of the devices are used as contexts for the selection of appropriate machine-learning models for activity recognition and the estimation of human energy expenditure. The stress-monitoring method uses two machine-learning models, the first one relying solely on physiological sensor data and the second one incorporating the output of the activity monitoring and other context information. The evaluation showed that we recognize a wide range of atomic activities with an accuracy of 87%, and that we outperform the state-of-the art consumer devices in the estimation of energy expenditure. In stress monitoring we achieved an accuracy of 92% in a real-life setting.

Lit@EVE: Explainable Recommendation Based on Wikipedia Concept Vectors Muhammad Atif Qureshi, Derek Greene We present an explainable recommendation system for novels and authors, called Lit@EVE, which is based on Wikipedia concept vectors. In this system, each novel or author is treated as a concept whose definition is extracted as a concept vector through the application of an explainable word embedding technique called EVE. Each dimension of the concept vector is labelled as either a Wikipedia article or a Wikipedia category name, making the vector representation readily interpretable. In order to recommend items, the Lit@EVE system uses these vectors to compute similarity scores between a target novel or author and all other candidate items. Finally, the system generates an ordered list of suggested items by showing the most informative features as human-readable labels, thereby making the recommendation explainable.

TF Boosted Trees: A Scalable TensorFlow based framework for Gradient boosting Natalia Ponomareva, Soroush Radpour, Gilbert Hendry, Salem Haykal, Thomas Colthurst, Petr Mitrichev, Alexander Grushetsky TF Boosted Trees (TFBT) is a new open-sourced framework for the distributed training of gradient boosted trees. It is based on TensorFlow, and its distinguishing features include a novel architecture, automatic loss differentiation, layer-by-layer boosting that results in smaller ensembles and faster prediction, principled multi-class handling, and a number of regularization techniques to prevent overfitting.

ASK-the-Expert: Active Learning based knowledge discovery using the expert Kamalika Das, Ilya Avrekh, Bryan Matthews, Manali Sharma, Nikunj Oza Often the manual review of large data sets, either for purposes of labeling unlabeled instances or for classifying meaningful results from uninteresting (but statistically significant) ones is extremely resource intensive, especially in terms of subject matter expert (SME) time. Use of active learning has been shown to reduce this review time significantly. However, since active learning is an iterative process of learning a classifier based on a small number of SME-provided labels at each iteration, the lack of an enabling tool can hinder the process of adoption of these technologies in real-life, in spite of their labor-saving potential. In this demo we present ASK-the-Expert, an interactive tool that allows SMEs to review instances from a data set and provide labels within a single framework. ASK-the-Expert is powered by an active learning algorithm for training a classifier in the backend. We demonstrate this system in the context of an aviation safety application, but the tool can be adopted to work as a simple review and labeling tool as well, without the use of active learning.

TrajViz: A Tool for Visualizing Patterns and Anomalies in Trajectory Yifeng Gao, Qingzhe Li, Xiaosheng Li, Jessica Lin, Huzefa Rangwala, Ranjeev Mittu Visualizing frequently occurring patterns and potentially unusual behaviors in trajectory can provide valuable insights into activities behind the data. In this paper, we introduce TrajViz, a motif (frequently repeated subsequences) based visualization software that detects patterns and anomalies by inducing ``grammars” from discretized spatial trajectories. We consider patterns as a set of sub-trajectories with unknown lengths that are spatially similar to each other. We demonstrate that TrajViz has the capacity to help users visualize anomalies and patterns effectively.

116



117

FRIDAY

THURSDAY

WEDNESDAY

TUESDAY

MONDAY

FRIDAY

SEPTEMBER 22

FRIDAY 118


CONGRESS HALL

CONGRESS HALL

CONGRESS HALL

CONGRESS HALL

1

2

3

Tutorial

Workshop DLPM 2017

AutoML

Automatic selection, configuration and composition of machine learning algorithms


MEZZANINE HALL

MEZZANINE HALL

MEZZANINE HALL

4

2

3

4

Workshop

Tutorial

Workshop

Workshop

Workshop

NFMCP 2017 New Frontiers in Mining Complex Patterns

Deep Learning for Computer Vision Applications: Robotics & Driving

LIDTA 2017

DMSC 2017

DMNLP 2017

Workshop

Workshop

LIDTA 2017

DMSC 2017

10:40 - 11:00


Workshop

Workshop

Tutorial

DLPM 2017


Deep Learning for Computer Vision Applications: Robotics and Driving


Learning with Imbalanced Domains: Theory & Applications

Data Mining with Secure Computation

Interactions between Data Mining & Natural Language Processing

DLPM 2017


Workshop NFMCP 2017 New Frontiers in Mining Complex Patterns

Workshop

KnowMe 2017

Workshop

Tutorial

Workshop

KNOWledge Discovery from Mobility & Transportation Systems

Machine learning with LIDTA 2017 fossil data: Learning with Analyzing Imbalanced Domains: environmental & Theory & Applications climate change

WEDNESDAY

AutoML


Workshop

DARE 2017

Data Analytics for Renewable Energy Integration

COFFEE & TEA BREAK

Workshop AutoML

Workshop DLPM 2017

Workshop

Workshop


KnowMe 2017




Workshop

Workshop

Workshop

DLPM 2017


AutoML



Workshop

Tutorial

LIDTA 2017

Machine learning with fossil data: Analyzing environmental &

Learning with Imbalanced Domains: Theory & Applications

climate change

Workshop

THURSDAY

15:40 - 16:00

17:40 - 19:00

PhD Forum

Workshop

DMNLP 2017

LUNCH BREAK

Workshop

16:00 - 17:40

Interactions between Data Mining & Natural Language Processing

TUESDAY

AutoML

12:40 - 14:00

14:00 - 15:40

Data Mining Learning with with Imbalanced Domains: Theory & Applications Secure Computation

COFFEE & TEA BREAK

Tutorial 11:00 - 12:40

PhD Forum

DARE 2017

Data Analytics for Renewable Energy Integration

KnowMe 2017

FRIDAY

09:00 - 10:40

BANQUET HALL

MONDAY

PROGRAM AT A GLANCE


119

FRIDAY

SEPTEMBER 22

BANQUET HALL

PhD Forum

09:00 - 15:40

PhD Forum Chairs :

Tomislav Šmuc, Rudjer Boškovič Institute, Croatia Bernard Ženko, Jožef Stefan Institute, Slovenia

The PhD forum aims at attracting junior PhD students at the beginning of their scientific career to present and discuss their work in progress, and to receive useful comments, advice and ideas for their further work. The forum will also be attended by senior researchers with experience in supervising PhD students who will provide constructive feedback to the participants. This will be an excellent opportunity for developing person-to-person networks of PhD students, which can be extremely valuable for their future careers.

Sponsored by

IBM Research

Invited talk

50 WAYS TO TWEAK YOUR PAPER Johannes Fürnkranz TU Darmstadt, Germany Very often, we see papers that have good research ideas be rejected because the quality of the write-up does not quite live up to the quality of the idea. Whether you like it or not, the presentation of your work can often make the difference that tilts a borderline paper one way or the other. In particular for conference-style reviewing, where reviewers have to make recommendations for multiple papers in a very narrow time frame, they are often influenced by the writing and the appearance of the paper (whether they like it or not). In this talk, targeted towards junior Ph. D. students, we will make a few suggestions how to improve the presentation of your work. Most of them are obvious, but we nevertheless often see them violated in practice.

Invited talk

TIPS FOR A SUCCESSFUL PHD, AND HOW TO WIN AN AWARD WITH IT Tias Guns Vrije Universiteit Brussel (VUB), Belgium The talk will dive into some of the every day challenges a PhD student faces: working with your promoter, writing a paper, conference visits, how to get your work more widely known and up to winning an award with your thesis. The talk will have anecdotes based on my or colleagues’ experiences and indispensable references to PhD comics, XKCD and other highly valuable sources.

120


09:00

PhD Forum Program at a glance

Opening Tomislav Šmuc, Bernard Ženko

SESSION 1 09:00 - 09:35

50 Ways to Tweak Your Paper Johannes Fürnkranz

09:35 - 09:45

Automatic Object-Oriented Curriculum Generation for Reinforcement Learning Felipe Leno Da Silva, Anna Helena Real Costa

09:45 - 09:55

Scalable Generalized Dynamic Topic Models

09:55 - 10:05

MONDAY

Florian Wenzel, Patrick Jähnichen, Marius Kloft, Stephan Mandt

Heterogeneous Network Analysis for Semantic Data Mining Jan Kralj, Marko Robnik-Šikonja and Nada Lavrač

10:05 - 10:15

Burg Matrix Divergence Based Multi-Metric Learning Yan Wang, Han-Xiong

Automatically Correcting Semantic Errors in Programming Assignments

TUESDAY

10:15 - 10:21

Ben Trevett, Donald Reay, Nick Taylor

10:21 - 10:27

Gene Network Reconstruction Via Transfer Learning Across Multiple Organisms Paolo Mignone, Gianvito Pio, Michelangelo Ceci

10:27 - 10:33

Temporal Pattern Mining from Evolving Networks

10:33 - 10:40

Session Wrap-Up

10:40 - 11:00

Coffee & Tea Break

WEDNESDAY

Angelo Impedovo, Corrado Loglisci, Michelangelo Ceci

SESSION 2 11:00 - 11:35

Tips for a Successful PhD, and How to Win an Award with It Tias Guns

Integrating Knowledge Graph Information Into Neural Conversational Models

THURSDAY

11:35 - 11:45

Boulos El Asmar

11:45 - 11:55

Adaptive Shallow Architectures for Streamed Data with Parallel Deep Incremental Feature Learning Craig Bower, Ashiq Anjum

11:55 - 12:05

Differentiating Between Exogenous and Endogenous Information Propagation in Social Networks Matija Piškorec, Tomislav Šmuc, Mile Šikić

Construction and Exploration of Redescription Sets

FRIDAY

12:05 - 12:15

Matej Mihelčić, Sašo Džeroski, Nada Lavrač, Tomislav Šmuc

12:15 - 12:21

Synergy of Natural Language Processing and Statistics to Explore Food- and Nutrition-Related Data and Knowledge Tome Eftimov, Peter Korošec, Barbara Koroušić Seljak

12:21 - 12:27

Feature Ranking in Structured Data Matej Petković, Sašo Džeroski

12:27 - 12:40

Session and PhD Forum Wrap-Up

12:40 - 14:00

Lunch Break

14:00 - 15:40

Poster Session


121

FRIDAY

SEPTEMBER 22

CONGRESS HALL 1


09:00 - 18:40

AutoML Automatic Selection, Configuration and Composition of Machine Learning Algorithms Organizers :

Pavel Brazdil, LIAAD Inesc Tec., Portugal Joaquin Vanschoren, Eindhoven University of Technology, The Netherlands Holger H. Hoos, Universiteit Leiden, The Netherlands Frank Hutter, University of Freiburg, Germany

Tutorial Webpage

Workshop Webpage

https://sites.google.com/site/automl2017ecmlpkdd/tutorial/

https://sites.google.com/site/automl2017ecmlpkdd/workshop/

Tutorial

This tutorial will introduce and discuss state of the art methods in meta-learning, algorithm selection, and algorithm configuration, which are increasingly relevant today. Researchers and practitioners from all areas of science and technology face a large choice of parameterized machine learning algorithms, with little guidance as to when and how to use which technique. Data mining challenges frequently remind us that algorithm selection and configuration play a crucial role in achieving cutting-edge performance, and are indispensable in industrial applications. Meta-learning leverages knowledge of past applications of algorithms applications to learn how to select the best techniques for future applications, and offers effective techniques that are superior to humans both in terms of the quality of the end result and even more so in the time required to achieve it. Recent approaches include also (preferably very fast) partial probing runs on a given problem with the aim of determining the best strategy to be used from there onwards. This may include further probing or recommending an algorithm to be used to solve the given problem. A recent trend is to incorporate such techniques into software platforms. This synergy leads to new advances that recommend combinations of algorithms and hyperparameter settings simultaneously, and that speed up algorithm configuration by learning which parameter settings are likely most useful for dealing with the data at hand. After motivating and introducing the concepts of algorithm selection and configuration, we elucidate how they arise in machine learning and data mining, but also in other domains, such as optimization. We demonstrate how meta-learning techniques can be effectively used in this context, exploiting information gleaned from past experiments as well as by probing the data at hand. Moreover, many current applications require the use of machine learning or data mining workflows that consist of several different processes or operations. Constructing such complex systems or workflows requires extensive expertise, as well as existing metadata and software, and can be greatly facilitated by leveraging the methodologies presented at this tutorial.

Workshop

This workshop will provide a platform for discussing the recent developments in the area of algorithm selection and configuration, which arises in many diverse domains, such as machine learning, data mining, optimization and automated reasoning. Algorithm selection and configuration are increasingly relevant today. Researchers and practitioners from all areas of science and technology face a large choice of parameterized machine learning algorithms, with little guidance as to which techniques to use in a given application context. Moreover, data mining challenges frequently remind us that algorithm selection and configuration are crucial in order to achieve cutting-edge performance, and drive industrial applications. Meta-learning leverages knowledge of past algorithm applications to select the best techniques for future applications, and offers effective techniques that are superior to humans both in terms of the end result and especially in the time required to achieve it. In this workshop, we will discuss different ways of exploiting meta-learning techniques to identify the potentially best algorithm(s) for a new task, based on meta-level information, prior experiments on both past datasets and the current one. Many contemporary problems require the use of workflows that consist of several processes or operations. Constructing such complex workflows requires extensive expertise, and could be greatly facilitated by leveraging planning, meta-learning and intelligent system design. This task is inherently interdisciplinary, as it builds on expertise in various areas of AI.

Invited talk

STOCHASTIC GRADIENT DESCENT: GOING AS FAST AS POSSIBLE BUT NOT FASTER

Michele Sebag CNRS, France

122


AutoML Program at a glance 09:00

Opening Pavel Brazdil, Joaquin Vanschoren, Frank Hutter, Holger Hoos

TUTORIAL SESSION 1 Introduction Blackbox Hyperparameter Optimization and Automated Machine Learning Algorithm Configuration Meta-Learning for Algorithm Selection

MONDAY

09:00 - 10:30

Pavel Brazdil, Joaquin Vanschoren, Frank Hutter, Holger Hoos

10:40 - 11:00

Coffee & Tea Break

11:00 - 12:40

TUESDAY

TUTORIAL SESSION 2 Beyond Blackbox Optimization Automating Workflow Design AutoML Systems and Applications The Big Picture: Machine Learning for Automated Algorithm Design Pavel Brazdil, Joaquin Vanschoren, Frank Hutter, Holger Hoos

Lunch Break

WEDNESDAY

12:40 - 14:00

WORKSHOP SESSION 1 14:00 - 14:05 14:05 - 14:45

Welcome Stochastic Gradient Descent: Going As Fast As Possible But Not Faster

14:45 - 15:00

Poster Spotlights 1

15:00 - 15:40

Poster Session 1

15:40 - 16:00

Coffee & Tea Break

THURSDAY

Michele Sebag

WORKSHOP SESSION 2 16:00 - 16:25

Towards Automated Relational Data Wrangling G. Verbruggen, L. De Raedt

16:25 - 16:50

A Meta-Learning Approach to One-Step Active-Learning

16:50 - 17:05

Poster Spotlights 2

17:05 - 17:40

Poster Session 2

FRIDAY

G. Contardo, L. Denoyer, T. Artieres

WORKSHOP SESSION 3 17:40 - 18:05

Poster Session 3

18:05 - 18:35

Panel & Commentaries on Some Papers & Research Lines

18:35 - 18:40

Closing


123

FRIDAY

SEPTEMBER 22

CONGRESS HALL 2

Workshop

09:00 - 19:00

DLPM 2017 Deep Learning for Precision Medicine Organizers :

Cesare Furlanello, Fondazione Bruno Kessler, Italy Bertram Müller- Myhsok, University of Liverpool, UK Max Planck Institute of Psychiatry, Germany

Webpage http://dlpm2017.fbk.eu/ Deep Learning is beginning to exert a disruptive impact for functional genomics, with applications of high industrial and ethical relevance in pharmacogenomics and toxicogenomics. Moreover, in less than one year, Deep Learning has emerged with solutions in diagnostic imaging and pathology that have reached best human expertise, as for the success in games. Examples in miRNA prediction already demonstrated the potential for deriving implicit features with high predictive accuracy, and novel methods for genomewide association studies and prediction of molecular traits following suite are appearing both as scientific initiatives as well as key technologies of AI startups. Following the success of DLPM2016, also colocated with ECML/PKDD in 2016, we thus wish to discuss about the best options for the adoption of deep learning models, both for improved accuracy as well as for better biomedical understanding. Questions such as end-to-end modeling from structure to functionality and biological impact as well as architectures for integration of genotype, expression and epigenetics would be of immediate interest for the workshop. Further topics of interest are described in the call. This event is intended to be a one-day workshop within ECML-PKDD 2017 including lectures and discussions on this very timely, pressing and active subject. We aim to create a connection between machine learning experts and leaders in the Precision Medicine initiatives in Europe and the USA. In particular, the workshop aims to link experts from the FDA SEQC2 initiative on Precision Medicine, which will pave the way for defining optimal procedures for the development of actionable drugs that can target phenotype-selected patient groups. We wish to discuss also technical challenges such as working with very large cohorts (e.g. from 60K to 300K to 1M subjects in molecular psychiatry studies) that are now amenable for modeling with deep learning. Further, family cohorts will challenge machine learning and bioinformatics experts for new efficient solutions. In summary, both methodological aspects from deep learning, machine learning, information technology, statistics as well as actual applications, pitfalls and (medical) needs are to be featured.

Invited talk

MULTIVARIATE PREDICTIVE MODELS IN PSYCHIATRY PROMISES, PROBLEMS, PERSPECTIVES

Tim Hahn University of Münster, Germany

Invited talk

TRAINING NEURAL NETWORKS TO CALL GENOTYPES, APPLICATIONS TO KIDNEY TRANSPLANTATION AND NEPHROLOGY

Fabien Campagne Cornell University, USA

Invited talk

SYSTEMS HEALTH – CHALLENGES AND OPPORTUNITIES

Kristel Van Steen Giga Liege Universite, Belgium

124


DLPM 2017 Program at a glance 09:00 - 09:10

DLPM2017 Opening Cesare Furlanello, Bertram Müller-Myhsok

SESSION 1 Multivariate Predictive Models in Psychiatry - Promises, Problems, Perspectives

MONDAY

09:10 - 10:00

Tim Hahn

10:00 - 10:25

Tree-LSTM and Cross-Corpus Training for Extracting Biomedical Relationships from Text Joël Legrand, Yannick Toussaint, Chedy Raïssi, Adrien Coulet

10:25 - 10:40

Prediction of Cancer of Unknown Primary Site (CUP) Using Deep Learning Methods

10:40 - 11:00

Coffee & Tea Break

11:00 - 11:40

Training Neural Networks to Call Genotypes, Applications to Kidney Transplantation and Nephrology

TUESDAY

Luyao Ren, Jingcheng Yang, Bin Li, Chen Suo, Ying Yu, Yuanting Zheng, Cesare Furlanello, Leming Shi

Fabien Campagne

Augmenting Deep Learning with Random Forest for Stratification of Breast Cancer Recurrence Risk

WEDNESDAY

10:25 - 10:40

Noah Eyal-Altman, Mark Last, Eitan Rubin

10:25 - 10:40

Ph-CNN: a Keras Layer for Feature-Wise Structured Data Ylenia Giarratano, Claudio Agostinelli, Valerio Maggio, Diego Fioravanti, Marco Chierici, Calogero Zarbo, Giuseppe Jurman, Cesare Furlanello

12:40 - 14:00

Lunch Break

14:00 - 14:50

THURSDAY

SESSION 2 Systems Health – Challenges and Opportunities Kristel Van Steen 14:50 - 15:15

Deep Learning Based Integration of Clinical and Genomics Data Mooez Subhani

15:15 - 15:40

MinCall — MinION End2end Convolutional Deep Learning Basecaller Neven Miculinić, Marko Ratković, Mile Šikić

16:00 - 16:25

Coffee & Tea Break

FRIDAY

15:40 - 16:00

Read Classification Using Semi-Supervised Deep Learning Tomislav Šebrek, Jan Tomljanović, Josip Krapac, Mile Šikić

16:25 - 18:30

Tutorial: Deep Learning for Precision Medicine with Keras and Tensorflow Valerio Maggio

18:30 - 19:00

Discussion and Summary – Perspectives – All Contributors and Audience Moderators: Kristel Van Steen, Fabien Campaigne, Tim Hahn, Cesare Furlanello, Bertram Müller-Myhsok

19:00

Closing


125

FRIDAY

SEPTEMBER 22

CONGRESS HALL 3

Workshop

09:00 - 17:40

NFMCP 2017 New Frontiers in Mining Complex Patterns Organizers :

Annalisa Appice, University of Bari Aldo Moro, Italy Corrado Loglisci, University of Bari Aldo Moro, Italy Giuseppe Manco, ICAR-CNR, Italy Elio Masciari, ICAR-CNR, Italy Zbigniew W. Ras, University of North Carolina, USA

Webpage http://www.di.uniba.it/~loglisci/NFmcp17/index.html Modern automatic systems are able to collect huge volumes of data, often with a complex structure (e.g. multi-table data, XML data, web data, time series and sequences, graphs and trees). The massive and complex data pose new challenges for current research in Knowledge Discovery and Data Mining. They require new methods for storing, managing and analysing them by taking into account various complexity aspects: Complex structures (e.g. multi-relational, time series and sequences, networks, and trees) as input/output of the data mining process; Massive amounts of high dimensional data collections flooding as highspeed streams and requiring (near) real time processing and model adaptation to concept drifts; New application scenarios involving security issues, interaction with other entities and real-time response to events triggered by sensors. The purpose of the workshop is to bring together researchers and practitioners of data mining and machine learning interested in analysis of complex data. We welcome submissions focusing on recent advances and latest developments in the analysis of complex and massive data sources such as blogs, event or log data, medical data, spatio-temporal data, social networks, mobility data, sensor data and streams. Submissions discussing and introducing new algorithmic foundations and representation formalisms in complex pattern discovery are also welcome. We encourage submissions from the areas of statistics, machine learning and big data analytics, which present techniques that take advantage of the informative richness of complex massive data for efficiently and effectively identifying new patterns. Finally, submissions describing preliminary and promising studies are also welcome.

Invited talk

WHICH IS MORE INFLUENTIAL, “WHO” OR “WHEN’’ FOR A USER TO RATE IN ONLINE REVIEW SITE?

Hiroshi Motoda AFOSR/AOARD, USA and Osaka University, Japan

126


09:00 - 09:30

Opening Annalisa Appice, Corrado Loglisci, Giuseppe Manco, Elio Masciari, Elio Masciari

MORNING SESSION 1 09:30 - 10:40

Which Is More Influential, “Who’’ Or “When’’ for a User to Rate in Online Review Site? Hiroshi Motoda

10:40 - 11:00

NFMCP 2017 Program at a glance

Coffee & Tea Break MORNING SESSION 2 - PROBABILISTIC MODELING

11:00 - 11:20

Bayesian User Behavior Models Jan Reubold, Ahcène Boubekki, Thorsten Strufe, Ulf Brefeld

Generative Probabilistic Models for Positive-Unlabeled Learning

MONDAY

11:20 - 11:40

Teresa M.A. Basile, Nicola Di Mauro, Floriana Esposito, Stefano Ferilli, Antonio Vergari

MORNING SESSION 3 - CLASSIFICATION AND REGRESSION 11:40 - 12:00

Complex Localization in the Multiple Instance Learning Context Dan-Ovidiu Graur, Razvan-Alexandru Maris, Rodica Potolea, Mihaela Dinsoreanu, Camelia Lemnaru

12:00 - 12:20

Decomposition of the Output Space in Multi-Label Classification Using Feature Ranking

12:20 - 12:40

TUESDAY

Stevanche Nikoloski, Dragi Kocev, Sašo Džeroski

Understanding Climate-Vegetation Interactions in Global Rainforests Through a GP Model-Tree Analysis Anuradha Kodali, Marcin Szubert, Sangram Ganguly, Joshua Bongard, Kamalika Das

12:40 - 14:00

Lunch Break

14:00 - 14:20

WEDNESDAY

AFTERNOON SESSION 1 - BIOINFORMATICS Identifying LncRNA-Disease Relationships Via Heterogeneous Clustering Emanuele Pio Barracchia, Gianvito Pio, Donato Malerba, Michelangelo Ceci

14:20 - 14:40

Phenotype Prediction with Semi-Supervised Learning Jurica Levatić, Maria Brbić, Tomaž Stepišnik Perdih, Dragi Kocev, Vedrana Vidulin, Tomislav Šmuc, Fran Supek, Sašo Džeroski

14:40 - 15:00

Learning Association Rules for Pharmacogenomic Studies

15:00 - 15:20

THURSDAY

Giuseppe Agapito, Pietro Hiram Guzzi, Mario Cannataro

Mining a Sub-Matrix of Maximal Sum Vincent Branders, Pierre Schaus, Pierre Dupont

15:20 - 15:40

Community-Based Semantic Subgroup Discovery Blaž Škrlj, Anže Vavpetič, Jan Kralj, Nada Lavrač

15:40 - 16:00

Coffee & Tea Break

16:00 - 16:20

FRIDAY

AFTERNOON SESSION 2 - WEB AND MULTIMEDIA Segment-Removal Based Stuttered Speech Remediation Pierre Arbajian, Ayman Hajja, Zbigniew Ras, Alicja Wieczorkowska

16:20 - 16:40

Discovering of Alternative Marketplaces on the Web for Mobile App Security Monitoring Massimo Guarascio, Ettore Ritacco, Francesco Sergio Pisani, Daniele Biondo, Rocco Mammoliti, Alessandra Toma

AFTERNOON SESSION 3 - TIME SERIES 16:40 - 17:00

A Scaled-Correlation Based Approach for Generating and Analyzing Functional Networks from EEG Signals Samuel Dolean, Attila Geiszt, Raul C. Mureșan, Mihaela Dînșoreanu, Rodica Potolea, Ioana Țincaș

17:00 - 17:20

Is Unsupervised Ensemble Learning Useful for Aggregated Or Clustered Load Forecasting? Peter Laurinec and Maria Lucka

17:20 - 17:40

Closing


127

FRIDAY

SEPTEMBER 22

MEZZANINE HALL 2

Workshop

09:00 - 17:40

LIDTA 2017 Learning with Imbalanced Domains:Theory & Applications Organizers :

Luís Torgo, University of Porto; LIAAD - INESC Tec, Portugal Bartosz Krawczyk, Virginia Commonwealth University, USA Paula Branco, University of Porto; LIAAD - INESC Tec, Portugal Nuno Moniz, University of Porto; LIAAD - INESC Tec, Portugal

Webpage http://lidta.dcc.fc.up.pt/

Many real-world data-mining applications involve obtaining and evaluating predictive models using data sets with strongly imbalanced distributions of the target variable. Frequently, the least-common values are associated with events that are highly relevant for end users. This problem has been thoroughly studied in the last decade with a specific focus on classification tasks. However, the research community has started to address this problem within other contexts. It is now recognized that imbalanced domains are a broader and important problem posing relevant challenges for both supervised and unsupervised learning tasks, in an increasing number of real world applications. This workshop invites inter-disciplinary contributions to tackle the problems that many real-world domains face today. With the growing attention that this problem has collected, it is crucial to promote its development and to tackle its theoretical and application challenges.

Invited talk

MARKING THE 15-YEAR ANNIVERSARY OF SMOTE: ORIGIN, PROGRESS AND OPPORTUNITIES

Nitesh Chawla

University of Notre Dame, USA

128


09:00 - 09:10 09:10 - 10:15

LIDTA 2017 Program at a glance

Opening Remarks

Marking the 15-Year Anniversary of SMOTE: Origin, Progress and Opportunities Nitesh Chawla

10:15 - 10:40

IDEA IN A NUTSHELL (POSTERS) Tunable Plug-In Rules with Reduced Posterior Certainty Loss in Imbalanced Datasets E. Krasanakis, E. Spyromitros-Xioufis, S. Papadopoulos, Y. Kompatsiaris

MONDAY

Evaluation of Ensemble Methods in Imbalanced Regression Tasks N. Moniz, P. Branco, L. Torgo

Controlling Imbalanced Error in Deep Learning with the Log Bilinear Loss Y. Resheff, A. Mandelbom, D. Weinshall

Unsupervised Classification of Speaker Profiles As a Point Anomaly Detection Task

TUESDAY

C. Fayet, A. Delhay, D. Lolive, P.-F. Marteau

Dealing with the Task of Imbalanced, Multidimensional Data Classification Using Ensembles of Exposers

10:40 - 11:00

Coffee & Tea Break

11:00 - 12:40

PAPER PRESENTATIONS 1

WEDNESDAY

P. Ksieniewicz, M. Wozniak

Predicting Defective Engines Using Convolutional Neural Networks on Temporal Vibration Signals N. Guennemann, J. Pfeffer

Influence of Minority Class Instance Types on SMOTE Imbalanced Data Oversampling P. Skryjomski, B. Krawczyk

THURSDAY

A Network Perspective on Stratification of Multi-Label Data P. Szymański, T. Kajdanowicz

Stacked-MLkNN: A Stacking Based Improvement to Multi-Label K-Nearest Neighbours A. Pakrashi, B. Mac Namee

Sampling a Longer Life: Binary Versus One-Class Classification Revisited 12:40 - 14:00

Lunch Break

14:00 - 15:00

PAPER PRESENTATIONS 2

FRIDAY

S. Sharma, C. Bellinger, O. Zaiane, N. Japkowicz

Improving Resampling-Based Ensemble in Churn Prediction Z. Bing, S. Vanden Broucke, B. Baesens, S. Maldonado

SMOGN: a Pre-Processing Approach for Imbalanced Regression P. Branco, L. Torgo, R. P. Ribeiro

Effect of Data Imbalance on Unsupervised Domain Adaptation of Part-of-Speech Tagging and Pivot Selection Strategies X. Cui, F. Coenen, D. Bollegala

15:00 - 15:40

Poster Session

15:40 - 16:00

Coffee & Tea Break

16:00 - 17:30

Discussion Table: What’S Next?

17:30 - 17:40

Closing Remarks ECML PKDD 2017 SKOPJE MACEDONIA

129

FRIDAY

SEPTEMBER 22

MEZZANINE HALL 3

Workshop

09:00 - 12:40

DMSC 2017 Data Mining with Secure Computation Organizers :

Mykola Pechenizkiy, Eindhoven University of Technology, The Netherlands Stefan Kramer, University of Mainz, Germany Niek J. Bouman, Eindhoven University of Technology, The Netherlands

Webpage http://www.soda-project.eu/dmsc-workshop

Security and privacy aspects of data analytics become of central importance in many application areas. New legislation also pushes companies and research communities to address challenges of privacy-preserving data analytics. In our data mining community, questions about data privacy and security have been predominantly approached from the perspective of k-anonymity and differential privacy. The aim of this workshop is to draw attention to secure multiparty computation (MPC), a subfield of cryptology, as the key foundation for building privacy-preserving data mining (DM) and machine learning (ML). In this approach, sensitive data is typically secret-shared over multiple players, such that those players can jointly perform DM / ML computations, but individual players (or collusions) cannot learn anything about the data, beyond the result of the computations.

Invited talk

INTRODUCTION TO SECURE MULTIPARTY COMPUTATION

Berry Schoenmakers

Eindhoven University of Technology, The Netherlands

Invited talk

SECURE MULTIPARTY COMPUTATION FOR PRIVACY-PRESERVING STATISTICS

Kurt Nielsen

Partisia, Denmark

130


DMSC 2017 Program at a glance 09:00 - 09:05

Opening

09:05 - 09:45

Introduction to Secure Multiparty Computation Berry Schoenmakers

Secure Data Mining with the Sharemind Platform, and a Use Case in Privacy-Preserving Tax Fraud Detection

MONDAY

09:45 - 10:25

Sander Siim

10:25 - 10:40

Privacy-Preserving Patent Search: Additively Homomorphic Encryption Techniques for Private Text Mining Over Public Datasets Jennifer Katherine Fernick, Sebastian Verschoor

11:00 - 11:40

Coffee & Tea Break

TUESDAY

10:40 - 11:00

Secure Multiparty Computation for Privacy-Preserving Statistics Kurt Nielsen

11:40 - 12:10

Processing Encrypted Data Using Homomorphic Encryption

12:10 - 12:25

WEDNESDAY

Anthony Barnett, Charlotte Bonte, Carl Bootland, Joppe W. Bos, Wouter Castryck, Anamaria Costache, Louis Goubin, Ilia Iliashenko, Tancrède Lepoint, Michele Minelli, Pascal Paillier, Nigel P. Smart, Frederik Vercauteren, Srinivas Vivek, Adrian Waller

Decentralised and Privacy-Aware Learning of Traversal Time Models Thanh Le Van, Aurélien Bellet, Jan Ramon

THURSDAY

Closing Panel Discussion: Challenges and Trade-Offs in Secure Computation in a Data Mining Context

FRIDAY

12:25 - 12:40


131

FRIDAY

SEPTEMBER 22

MEZZANINE HALL 4

Workshop

09:00 - 12:40

DMNLP 2017 Interactions between Data Mining and Natural Language Processing Organizers :

Peggy Cellierm, INSA Rennes, France Thierry Charnois, Université de Paris, France Andreas Hotho, University of Kassel, Germany Marie-Francine Moens, KU Leuven, Belgium Stan Matwin, Dalhousie University, Canada Yannick Toussaint, INRIA, France

Webpage http://dmnlp.loria.fr/ The workshop will favor the use of symbolic methods. Indeed, statistical and machine learning methods (CRF, SVM, Naive Bayes) holds a predominant position in NLP researches and ”may have been too successful (…) as there is no longer much room for anything else”. They have proved their effectiveness for some tasks but one major drawback is that they do not provide human readable models. By contrast, symbolic machine learning methods are known to provide more human-readable model that could be an end in itself (e.g., for stylistics) or improve, by combination, further methods including numerical ones. Research in Data Mining has progressed significantly in the last decades, through the development of advanced algorithms and techniques to extract knowledge from data in different forms. In particular, for two decades Pattern Mining has been one of the most active field in Knowledge Discovery. Recently, a new field has emerged taking benefit of both domains: Data Mining and NLP. The objective of DMNLP is thus to provide a forum to discuss how Data Mining can be interesting for NLP tasks, providing symbolic knowledge, but also how NLP can enhance data mining approaches by providing richer and/or more complex information to mine and by integrating linguistic knowledge directly in the mining process. The workshop aims at bringing together researchers from both communities in order to stimulate discussions about the cross-fertilization of those two research fields. The idea of this workshop is to discuss future directions and new challenges emerging from this cross-fertilization of Data Mining and NLP and in the same time to initiate collaborations between researchers of both communities.

132


DMNLP 2017 Program at a glance 09:00 - 09:05

Opening Session

09:05 - 10:35

MORNING SESSION 1 Cluster-Based Graphs for Conceiving Dialog Systems

MONDAY

Jean Leon Bouraoui, Vincent Lemaire

Characterising the Difference and the Norm Between Sequence Databases Frauke Hinrichs, Jilles Vreeken

Summarising Event Sequences Using Serial Episodes and an Ontology 10:40 - 11:00

Coffee & Tea Break

11:00 - 12:30

MORNING SESSION 2

14:00 - 15:40

TUESDAY

Kathrin Grosse, Jilles Vreeken

Making Efficient Use of a Domain Expert’S Time in Relation Extraction Linara Adilova, Sven Giesselbach, Stefan Rueping

Combining Syntactic and Sequential Patterns for Unsupervised Semantic Relation Extraction

WEDNESDAY

Nadège Lechevrel, Kata Gábor, Isabelle Tellier, Thierry Charnois, Haifa Zargayouna. Davide Buscaldi

Correcting Linguistic Training Bias in an FAQ-Bot Using LSTM-VAE Mayur Patidar, Puneet Agarwal, Lovekesh Vig, Gautam Shroff

THURSDAY

Closing Session

FRIDAY

12:30 - 12:40


133

FRIDAY

SEPTEMBER 22

CONGRESS HALL 4

Workshop

14:00 - 19:00

KnowMe 2017 KNOWledge Discovery from Mobility & Transportation Systems Organizers :

Luis Moreira-Matias, NEC Laboratories Europe, Germany Roberto Trasarti, KDD Lab ISTI-CNR, Italy Rahul Nair, IBM Research Ireland

Webpage http://kdd.isti.cnr.it/knowme.eu/

The recent technological advances on telecommunications create a new reality on mobility sensing. Nowadays, we live in an era where ubiquitous digital devices are able to broadcast rich information about human mobility in real-time and at high rate. Such fact exponentially increased the availability of large-scale mobility data which has been popularized in the media as the new currency, fueling the future vision of our smart cities that will transform our lives. The reality is that we just began to recognize significant research challenges across a spectrum of topics. Consequently, there is an increasing interest among different research communities (ranging from civil engineering to computer science) and industrial stakeholders on build knowledge discovery pipelines over such data sources. However, such availability also raise privacy issues that must be considered by both industrial and academic stakeholders on using these resources. This workshop intends to be a top-quality venue to bring together transdisciplinary researchers and practitioners working in topics from multiple areas such as Data Mining, Machine Learning, Numerical Optimization, Public Transport, Traffic Engineering, Multi-Agent Systems, Human-Computer Interaction and Telecommunications, among others. The ultimate goal of this venue is to evaluate not only the theoretical contribution of the data driven methodology proposed in each research work, but also its potential deployment/impact as well as its advances with respect to the State-of-the-Art/State-of-the-Practice in the domains of the related applications.

Invited talk

BIG DATA FOR CITY SCIENCE: THE POWER HUMAN MOBILITY NETWORKS

Fosca Giannotti

ISTI-CNR, Italy

134


KnowMe 2017 Program at a glance 14:00 - 14:05

Opening

14:00 - 15:40

SESSION 1 Big Data for City Science: the Power Human Mobility Networks

MONDAY

Fosca Giannotti

On Evaluating Floating Car Data Quality for Knowledge Discovery

15:40 - 16:00

Coffee & Tea Break

16:00 - 18:00

SESSION 2

TUESDAY

Spatio-Temporal Profiling of Public Transport Delays Based on Large Scale Vehicle Positioning Data from GPS in Wroclaw

Traffic Patterns Recognition Through Big Data: Case Study in Thessalonik (Demo)

WEDNESDAY

Car Sharing Through the Data Analysis Lens Low-Dimensional Multi-Task Representation of Bivariate Histogram Data from Truck OnBoard Sensors

THURSDAY

Mining Smart Card Data for Travelers Mini Activities Automatic Recognition of Public Transport Trips from Mobile Device Sensor Data and Transport Infrastructure Information Closing

FRIDAY

18:00 -


135

FRIDAY

SEPTEMBER 22

MEZZANINE HALL 4

Workshop

14:00 - 17:40

DARE 2017 Data Analytics for Renewable Energy Integration Organizers :

Wei Lee Woon, Masdar Institute, UAE Zeyar Aung, Masdar Institute, UAE Oliver Kramer, University of Oldenburg, Germany Stuart Madnick, Massachusetts Institute of Technology, USA

Webpage http://dare2017.dnagroup.org/

Climate change, the depletion of natural resources and rising energy costs have led to an increasing focus on renewable sources of energy. A lot of research has been devoted to the technologies used to extract energy from these sources; however, equally important is the storage and distribution of this energy in a way that is efficient and cost effective. Achieving this would generally require integration with existing energy infrastructure. The challenge of renewable energy integration is inherently multidisciplinary and is particularly dependent on the use of techniques from the domains of data analytics, pattern recognition and machine learning. Examples of relevant research topics include the forecasting of electricity supply and demand, the detection of faults, demand response applications and many others. This workshop will provides a forum where interested researchers from the various related domains will be able to present and discuss their findings.

136


DARE 2017 Program at a glance 14:00 - 14:05

Welcome Address

14:05 - 14:25

Solar Energy Forecasting and Optimization System for Efficient Renewable Energy Integration Diana Manjarres, Ricardo Alonso, Sergio Gil, Itziar Landa

Gradient Boosting Models for Photovoltaic Power Estimation Under Partial Shading Conditions

MONDAY

14:30 - 14:50

Nikolaos Nikolaou, Efstratios Batzelis, Gavin Brown

14:55 - 15:15

Multi-Objective Optimization for Power Load Recommendation Considering User’S Comfort Jaroslav Loebl, Helmut Posch, Viera Rozinajova

Minimizing Grid Interaction of Solar Generation and DHW Loads InnnZEBs Using Model-Free Reinforcement Learning

TUESDAY

15:20 - 15:40

Adhra Ali, Hussain Kazmi

15:40 - 16:00 16:00 - 16:20

Coffee & Tea Break Improving Time-Series Rule Matching Performance for Detecting Energy Consumption Patterns Maël Guillemé, Laurence Rozé, Véronique Masson, Cérès Carton, René Quiniou, Alexandre Termier

Identifying Representative Load Time Series for Load Flow Calculations

WEDNESDAY

16:25 - 16:45

Janosch Henze, Tanja Kneiske, Martin Braun, Bernhard Sick

16:50 - 17:10

Scalable Gaussian Process Models for Solar Power Forecasting Astrid Dahl and Edwin Bonilla

17:15 - 17:35

NWP Ensembles for Wind Energy Uncertainty Estimates Alejandro Catalina Feliu, Jose Dorronsoro

Closing

THURSDAY

17:35 - 17:40

Papers to Be Presented As Posters An Approach for Erosion and Power Loss Prediction of Wind Turbines Using Big Data Analytics Dina Fawzy, Sherin Moussa, Nagwa Badr

Probabilistic Wind Power Forecasting By Using Quantile Regression Analysis

FRIDAY

Mehmet Baris Ozkan, Umut Guvengir, Dilek Kucuk, Ali Unver Secen, Serkan Buhan, Turan Demirci, Abdullah Bestil, Ceyda Er, Pinar Karagoz

Wind Speed Forecasting Using Statistical and Machine Learning Methods: A Case Study in the UAE Khawla Al Dhaheri, Wei Lee Woon, Zeyar Aung


137

FRIDAY

SEPTEMBER 22

CONGRESS HALL 4

Tutorial

09:00 - 12:40

Deep Learning for Computer Vision Applications: Robotics & Driving Organizers :

Anelia Angelova, Google Research / Google Brain, USA Sanja Fidler , University of Toronto, Canada

Webpage https://sites.google.com/site/deeplearningvision/home

Deep Learning methods have become ubiquitous for computer vision tasks. This tutorial will focus on recent advances in deep learning for vision applications in robotics and autonomous vehicles. The tutorial will start with basic Deep Learning techniques and will highlight state-of-the-art methods in the three major topics in computer vision: classification, detection and segmentation. Then the tutorial will continue with more concrete methods and their applications, e.g. in scene understanding, 3D analysis, perception for robotics and autonomous driving. The goal of the tutorial is to focus on relevant techniques, which are of significant impact to real-world applications, and which will benefit the broader Machine Learning community.

09:00 - 10:40

SESSION 1 PART 1 : Deep learning for Computer Vision PART 2 : Deep learning for Perception in Robotics

10:40 - 11:00

Coffee & Tea Break

11:00 - 12:40

SESSION 2 PART 3 : Deep Learning for 3D PART 4 : Deep Learning for Perception in Autonomous Driving

138


MEZZANINE HALL 3

Tutorial

14:00 - 17:40

Machine Learning with Fossil Data: Analyzing Environmental & Climate Change Indrė Žliobaitė, University of Helsinki, Finnland

MONDAY

Organizier :

Global fossil databases have been growing rapidly in the last decade. They aggregate and accumulate findings and knowledge that palaeobiologists acquired over many years. These datasets are big data in their essence - compiled from different sources, to an extent subjective, include specific biases and uncertainties, data sparseness and quality varies over time and space. In addition, to understand relations between organisms and climate high volume and large velocity satellite observations some into play that require scalability in computing. Databases of this kind offer an excellent ground for interdisciplinary machine learning research. This tutorial will outline research questions that could be addressed using computational methods, discuss characteristics of fossil data and computational tasks for machine learning and data mining, overview existing computational approaches, and discuss what more could be done from the machine learning and data mining perspective.

SESSION 1

THURSDAY

14:00 - 14:40

WEDNESDAY

http://www.zliobaite.com/fossils_tutorial

TUESDAY

Webpage

Analysis of the fossil record: research questions, challenges and trends Fossil Data and Fossil Databases 14:40 - 14:50

Short Break

14:50 - 15:40

SESSION 2

15:40 - 16:00

Coffee & Tea Break

16:00 - 16:40

SESSION 3

FRIDAY

Overview of Machine Learning Tasks and Existing Approaches

Modeling Relationships Between Organisms and their Environments 16:40 - 16:50

Short Break

16:50 - 17:40

SESSION 4 Tracking communities over time and space Analysis of Macroevolution Processes


139

140


ORGANIZATION ECML PKDD 2017 SKOPJE MACEDONIA

141

GENERAL CHAIRS Sašo Džeroski

Michelangelo Ceci

Jožef Stefan Institute Slovenia

Univeristy of Bari Italy

PROGRAM CHAIRS Michelangelo Ceci

Jaakko Hollmén

Ljupčo Todorovski

Celine Vens

Univeristy of Bari Italy

Aalto University Finland

Univeristy of Ljubljana Slovenia

KU Leuven Kulak Belgium

JOURNAL TRACK CHAIRS Kurt Driessens

Dragi Kocev

Marko Myra Robnik-Šikonja Spiliopoulu

Maastricht University The Netherlands

Jožef Stefan Instiute Slovenia

Univeristy of Ljubljana Slovenia

APPLIED DATA SCIENCE TRACK CHAIRS Yasemin Altun

Kamalika Das

Taneli Mielikäinen

Google Research Switzerland

NASA Ames Research Center USA

Yahoo! USA

142


Magdeburg University Germany

NECTAR TRACK CHAIRS Donato Malerba

Jerzy Stefanowski

University of Bari Italy

Poznan University of Technology Poland

DEMONSTRATION TRACK CHAIRS Jesse Read

Marinka Žitnik

École Polytechnique France

Stanford University USA

WORKSHOPS & TUTORIALS CHAIRS Nathalie Japkowicz

Panče Panov

American University USA


EU PROJECTS FORUM CHAIRS Petra Kralj Novak

Nada Lavrač




143

PHD FORUM CHAIRS Tomislav Šmuc

Bernard Ženko

Rudjer Bošković Institute Croatia


DISCOVERY CHALLENGE CHAIR Dino Ienco IRSTEA - UMR TETIS France

AWARDS COMMITTEE Peter Flach

Rosa Meo

Indrė Žliobaitė

University of Bristol UK

University of Torino Italy

University of Helsinki Finland

PRODUCTION & PUBLIC RELATIONS CHAIRS Dragi Kocev

Nikola Simidjievski



144


LOCAL ORGANIZATION Ivica Dimitrovski

Tina Anžič

CHAIR Ss. Cyril & Methodious Univerisy, Jožef Stefan Institute Macedonia Slovenia

Mili Bauer

Gjorgji Madjarov


Ss. Cyril & Methodious Univerisy, Macedonia

SPONSORSHIP CHAIRS Albert Bifet

Panče Panov

Télécom ParisTech France


PROCEEDINGS CHAIRS Jurica Levatić

Gianvito Pio


University of Bari Italy


145

ECML PKDD 2017 Conference brochure Editorial board: Nikola Simidjievski Dragi Kocev Panče Panov Design and pre-press: Nikola Simidjievski Photography: Nikola Simidjievski Print: Ri-grafika, Skopje Macedonia The editing of the conference brochure was closed on September 10, 2017. Changes to the program made after this date are not reflected in its contents.

September 2017, Ljubljana, Slovenia

146