Data science: from prediction to decision making

Gianluca Bontempi
Machine Learning Group, Computer Science Department, Université Libre de Bruxelles, Belgium
mlg.ulb.ac.be

The world is so complex…
• Uncertain
• Nonlinear
• Time varying
• Multiagent
• Multivariate
• Multicriteria
• Irreversible

and tweets so short…

What is knowledge?

• Philosophy begins with wonder (Aristotle).
• Scientific discovery begins with the need to explain complexity and variability.
• Knowledge as information. Information as reduction of uncertainty.
• Data science is the computational version of scientific discovery.

What is data science?
Set of methodologies (notably statistics, machine learning, econometrics, data mining) that, by means of computational tools, allow the extraction of knowledge from data in a complex world.

Who is a data scientist?

Someone who
1. knows statistics better than a computer scientist and
2. codes better than a statistician...

Why data science?
• Increasing availability of big data (samples, variables)
• Ubiquitous measurement systems (web, communications, GPS, bank, sensors)
• Desire to extract value from data
• Growing computing power
• Search for informative patterns hidden within complex masses of data
• New adaptive, automated, intelligent, smart apps
• Prediction
• Solving hard problems in a new manner
• Decision support

Credit card fraud detection
• Goal: predict nature of transaction (fraud or genuine)
• Financial impact
• Millions of transactions per day
• Nonstationarity
• Unbalanced data
• Detection of anomalous behaviour
• Real-time issues
• False positives
• Interaction with humans

Breast cancer diagnosis/prognosis
• Goal: predict survival on the basis of genomic measures
• Health impact
• Millions of variables
• Low number of patients
• High noise
• Interpretation issues
• Instability
• Tradeoff false positive vs. false negative

Time series forecasting
• Environmental impact
• Spatio-temporal aspects
• Large dimensionality
• Online learning
• High noise
• Nonstationarity
• Long term forecasting

Supervised learning
[Diagram: INPUT, UNKNOWN DEPENDENCY, OUTPUT; TRAINING DATASET, MODEL, PREDICTION; PREDICTION ERROR]
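
The supervised learning loop named on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the linear dependency, the noise level, and the helper names `unknown_dependency` and `fit_line` are hypothetical and do not come from the slides.

```python
import random

def unknown_dependency(x, rng):
    # Hypothetical ground truth, hidden from the learner: y = 2x + 1 plus noise.
    return 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)

def fit_line(xs, ys):
    # Ordinary least squares for y = a*x + b (closed form).
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = my - a * mx
    return a, b

rng = random.Random(0)

# TRAINING DATASET: input/output pairs observed from the unknown dependency.
train_x = [rng.uniform(0.0, 1.0) for _ in range(200)]
train_y = [unknown_dependency(x, rng) for x in train_x]

# MODEL: fitted on the training dataset only.
a, b = fit_line(train_x, train_y)

# PREDICTION and PREDICTION ERROR on fresh inputs.
test_x = [rng.uniform(0.0, 1.0) for _ in range(200)]
test_y = [unknown_dependency(x, rng) for x in test_x]
predictions = [a * x + b for x in test_x]
mse = sum((p - y) ** 2 for p, y in zip(predictions, test_y)) / len(test_y)

print(f"slope={a:.2f} intercept={b:.2f} test MSE={mse:.4f}")
```

The prediction error is measured on data the model has not seen, which is what makes the prediction falsifiable.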

The notion of science evolves with time
Let no one ignorant of geometry enter here (Plato)
In God we trust (?), all the others must bring data! (W.E. Deming)

Deduction vs Induction
HYPOTHESIS: All politicians are smart
EVIDENCE: Parliament members are politicians
CONCLUSION: Parliament members are smart!

OBSERVATION: Professors I met so far were smart
EVIDENCE: I am listening to a professor.
CONCLUSION: He is smart!

Deduction vs Induction
• Deductions move from the general to the particular with certainty (true or false).
• Inductions move from the particular to the general with probability (degree of truth).
• Deduction is not ampliative: its conclusion is already contained, though implicitly, in the premises.
• Induction is ampliative: its conclusion contains more information than the premises.
• Mathematical science is purely deductive, statistics is inductive.
• Conventional science is a good mix of deductive and inductive practices.
• Data science definitely shifts the balance to the inductive side.
Induction is the glory of science and the scandal of philosophy

Russell’s inductivist turkey

The regularity gamble
All reasonings concerning nature are founded on experience, and all reasonings from experience are founded on the supposition that the course of nature will continue uniformly the same or in other terms that the future will be like the past. (Hume, 1739)

What is a science according to Popper
• Demarcation problem: what is a science and what is not?
• A hypothesis is scientific if and only if it makes predictions and can be refuted by data.
• What makes a statement scientific is not the complexity of the formalism but the fact that it provides a way to falsify it (astrology vs astronomy).
• No positive experimental outcome can demonstrate the truth of a scientific theory, while a single genuine counter-instance can refute it.
• The scientist's attitude is both bold and humble at the same time: No amount of experimentation can ever prove me right; a single experiment can prove me wrong. (Einstein)
• What about politicians?

My astrological report
• Identity and Ego: Your name - which consists of 8 letters, your date of birth and your star sign reveal your strongest personality traits... ..you are an active person, kind in relationships and very cautious
• Social Life: … you have had recent problems with someone close to you and that you are finding these problems hard to overcome.
• Future: …something important will occur in the next three months
From Diana, astrologicalcoach.com

Is politician language falsifiable?
What we want to make of it is a free organization of women and men voters of a totally new kind: not yet another party or yet another faction born to divide, but a force born instead with the opposite aim, that of uniting, in order to finally give Italy a majority and a government equal to the needs most deeply felt by ordinary people. What we want to offer Italians is a political force made up of totally new men. What we want to offer the nation is a government programme made up only of concrete and comprehensible commitments. We want to renew Italian society, we want to give support and trust to those who create jobs and prosperity, we want to accept and win the great productive and technological challenges of Europe and of the modern world. We want to offer room to anyone who is willing to act and to build their own future; in the North as in the South we want a government and a parliamentary majority that can give due dignity to the original nucleus of every society, the family, that can respect every faith and that can raise reasonable hopes for those who are weakest, for those who seek work, for those who need care, for those who, after a hard-working life, have the right to live in serenity.

Truth is a model
The most common misunderstanding about science is that scientists seek and find truth. They don’t. They make and test models…. Making sense of anything means making models that can predict outcomes and accommodate observations. Truth is a model. (Neil Gershenfeld, American physicist, 2011)

Data science and decision process

Data collection

DS / Prediction / Modeling

Decision making

Decision making for dummies
1) Prediction of hidden state

Probabilities
State of patient    Sick    Healthy
Probability         P       1-P

2) Decision:
• Multicriteria
• Risk + utility
• Ethics
• Closed-loop

Costs
            Sick       Healthy
Drug        TP cost    FP cost
Placebo     FN cost    TN cost

• Predictions are falsifiable
• Decisions not necessarily… but should be
  – evidence-based
  – risk aware
  – rational
  – formalized
  – documented (open?)

•  Data science may definitely help
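
The two steps of this slide (predict the probability P that the patient is sick, then weigh the costs of each action) can be combined into an expected-cost rule. The sketch below is a minimal illustration only: the numeric costs and the function names are hypothetical, chosen to show the mechanics rather than any real clinical trade-off.

```python
def expected_costs(p_sick, costs):
    # costs[action][state]: cost of taking `action` when the true state is `state`.
    return {
        action: p_sick * by_state["sick"] + (1.0 - p_sick) * by_state["healthy"]
        for action, by_state in costs.items()
    }

def best_action(p_sick, costs):
    # Choose the action with the lowest expected cost.
    expected = expected_costs(p_sick, costs)
    return min(expected, key=expected.get)

# Hypothetical cost table (arbitrary units), same layout as the slide:
# rows = actions, columns = true state of the patient.
COSTS = {
    "drug":    {"sick": 1.0,   "healthy": 10.0},  # TP cost, FP cost
    "placebo": {"sick": 100.0, "healthy": 0.0},   # FN cost, TN cost
}

for p in (0.01, 0.05, 0.20, 0.50):
    print(p, expected_costs(p, COSTS), best_action(p, COSTS))
```

With these made-up costs a missed case is so expensive that giving the drug becomes the cheaper option once P exceeds roughly 9%, well below 50%. The decision depends on the cost table as much as on the prediction.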

Data science challenges
• Naïve vision of data science
• Innumeracy of decision makers
• Excess of optimism
  – Overfitting
  – Concealment of uncertainty
  – Misuse of conclusions
• Maturity of the technology
  – Data science algorithms may be fooled
  – Lack of transparency (black boxes, overparametrization)
  – Lack of interpretation
• GDPR: General Data Protection Regulation
  – Right to explanation
  – Right to be forgotten
  – Data protection by Design and by Default
  – Serious sanctions: a fine up to 20M EUR

Don’t be naive about data science
• No tabula rasa learning
• Data collection is not neutral
  – sampling bias
  – theory ladenness: hypothesis often precedes observation
• Passive (observational) vs active (experimental) data
• Association vs. causation

Association vs causation
[Figure: plot of sport performance (y) against number of burgers per day (x)]

Innumeracy of decision makers in medicine

• The probability that a woman has breast cancer is 1 percent (prevalence).
• If a woman has breast cancer, the probability is 90 percent that she will have a positive mammogram (sensitivity).
• If a woman does not have breast cancer, the probability is 9 percent that she will still have a positive mammogram (false positive rate).
What is the probability that a woman who tested positive has breast cancer?
1. “It is not certain that you have breast cancer, yet the probability is about 81 percent.” [14]
2. “Out of 10 women who test positive as you did, about 9 have breast cancer.” [47]
3. “Out of 10 women who test positive as you did, only about 1 has breast cancer.” [20]
4. “The chance that you have breast cancer is about 1 percent.” [19]
Excerpt from Gigerenzer's book “Rationality for Mortals”

Bayes’ theorem
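
Applied to the mammogram question, Bayes' theorem gives the probability of cancer given a positive test. The sketch below uses only the three figures quoted in the question (prevalence 1%, sensitivity 90%, false positive rate 9%); the function name `posterior` is just an illustrative choice.

```python
def posterior(prevalence, sensitivity, false_positive_rate):
    # Bayes' theorem:
    #   P(cancer | positive) = P(positive | cancer) * P(cancer) / P(positive)
    true_positives = sensitivity * prevalence
    false_positives = false_positive_rate * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)

p = posterior(prevalence=0.01, sensitivity=0.90, false_positive_rate=0.09)
print(f"P(cancer | positive mammogram) = {p:.3f}")
```

The result is a little over 9%, that is, about 1 in 10 of the women who test positive, which corresponds to answer 3.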

Innumeracy of decision makers in law
• A crime took place in a big city (1M inhabitants) and eyewitnesses described the murderer as a tall man, wearing glasses, with an Italian accent and a supporter of Fiorentina.
• Later in the day a man fitting the description was arrested by the police.
• A statistical expert was contacted and his study showed that the probability of having such a profile is very low, about 1 out of 100,000.
• The prosecution lawyer asked for a conviction since "the chance of him being innocent is so very small, i.e. 1 out of 100,000".
• The defence replied: "Do you know that the population of the city is 1M? So the average number of persons matching such a description is 10. His chance of being innocent is not so tiny, since it is 9/10 and not 1 in 100,000."
• The probability of fitting the description when innocent is not the probability of being innocent when fitting the description.
• Confusing direct and inverse conditional probability is known as the Prosecutor’s fallacy, a common flaw in reasoning.
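
The defence's arithmetic can be checked directly. The sketch below relies on two simplifying assumptions that the slide leaves implicit: exactly one of the people matching the description is the culprit, and nothing else distinguishes the suspect from the others who match. Variable names are illustrative.

```python
population = 1_000_000   # inhabitants of the city
one_in = 100_000         # the expert's figure: 1 person out of 100,000 fits

# Expected number of inhabitants who fit the description.
expected_matches = population / one_in

# Assumption: one culprit among the matching people, all equally suspect.
p_guilty_given_match = 1 / expected_matches
p_innocent_given_match = 1 - p_guilty_given_match

# What the prosecution quoted instead: P(match | innocent), roughly 1 in 100,000.
p_match_given_innocent = 1 / one_in

print(f"expected matches:     {expected_matches:.0f}")
print(f"P(innocent | match) = {p_innocent_given_match:.2f}")
print(f"P(match | innocent) = {p_match_given_innocent:.5f}")
```

The two conditional probabilities differ by several orders of magnitude, which is exactly the gap the Prosecutor's fallacy hides.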

The use of probabilistic reasoning is likely to increase dramatically in the future. Are decision makers and people ready for this?

Overfitting vs underfitting

Overfitting: finding illusory patterns in data, fitting noise not signal
Underfitting: using too simple models, no signal retrieval
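
A small experiment can make the two failure modes concrete. The sketch below uses k-nearest-neighbour regression, a technique chosen here only for illustration (the slides do not name one), on synthetic data whose signal `sin(x)` and noise level are assumptions. With k = 1 the model memorises the training set, with k equal to the training size it always predicts the global mean, and an intermediate k sits between the two.

```python
import math
import random

def knn_predict(x, train, k):
    # Average the outputs of the k training points closest to x.
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mse(data, train, k):
    # Mean squared prediction error of the k-NN model on `data`.
    return sum((knn_predict(x, train, k) - y) ** 2 for x, y in data) / len(data)

def sample(n, rng):
    # Hypothetical data source: signal sin(x) plus Gaussian noise.
    xs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    return [(x, math.sin(x) + rng.gauss(0.0, 0.3)) for x in xs]

rng = random.Random(1)
train = sample(200, rng)
test = sample(200, rng)

results = {}
for k in (1, 10, len(train)):
    # Store (training error, test error) for each model complexity.
    results[k] = (mse(train, train, k), mse(test, train, k))
    print(f"k={k:3d}  train MSE={results[k][0]:.3f}  test MSE={results[k][1]:.3f}")
```

The k = 1 model has zero training error because each training point is its own nearest neighbour, so a perfect fit on the training data says nothing about the quality of predictions on new data.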

Invented patterns

Lost in (tweeted) translation…

• A forecast better than forecasts…
• Where has uncertainty gone?
• Who deserves the credit?

Nguyen A. et al., paper on how to fool DNNs

Another hype cycle? Where are we?

Perspectives
• Data is the new oil
• Interdisciplinary domain
  – Archaeology
  – Astronomy
  – High energy physics
  – Social sciences
  – Environment
  – Mobility
• Accurate predictions about the real world are the only convincing evidence that we have gained an understanding of the world.
• Potential for enterprise creation and for young people (but also for the retraining of middle-aged computer scientists)
• From open data to open decisions

Open society (Popper)
A critical and fallibilist vision of the sciences, in which refutations are taken seriously and conjectures are regarded as risky, is the best foundation of an open society

In an open society
• the citizen may have access to government policies and evaluate critically the consequences of their implementation
• politicians' statements are falsifiable
• undesirable policies will be eliminated in a manner analogous to the elimination of falsified scientific theories
• differences between people on social policy will be resolved by critical discussion and argument rather than by force.

The data scientist’s dream

Floating down through the clouds Memories come rushing up to meet me now. And in the corner of some foreign field I had a dream. I had a dream. … Take heed of his dream. Take heed. (Pink Floyd)

Keep on discussing on the web
• Just Google “Bontempi MLG”
• Web page mlg.ulb.ac.be
• Blog page
• Facebook page
• LinkedIn page

7 sins in predicting AI future (Brooks)
1. Amara’s law: We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run. Example: GPS
2. Any sufficiently advanced technology is indistinguishable from magic.
3. Overly optimistic generalization of AI agents' competence.
  – A tool that recognizes cats in images does not know anything about cat physiology…
4. Suitcase words.
  – Machine learning is a pale imitation of human learning
5. Exponentialism:
  – People tend to forget saturation; there is no necessity for exponential growth
6. Hollywood scenarios usually do not have any connection to future reality.
  – 2001: A Space Odyssey was a very bad predictor
7. Speed of deployment: Almost all AI innovations take far, far longer to be really widely deployed than people in the field and outside the field imagine.
  – First successful trial of an autonomous van occurred in 1987

Logical fallacies
If it rains, the pavement is wet.
Correct deductions:
• It rained, then the pavement is wet.
• The pavement is not wet, then it did not rain.
Logical fallacies:
• It did not rain, then the pavement is not wet (denying the antecedent)
• The pavement is wet, then it rained (affirming the consequent)
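
The four argument forms above can be checked mechanically with a truth table: an argument is valid only if its conclusion holds in every case where all its premises hold. The sketch below is illustrative; the helper names are hypothetical, and the labels "modus ponens" and "modus tollens" are the standard names of the two correct deductions, not terms used on the slide.

```python
from itertools import product

def implies(a, b):
    # Material implication: "if a then b".
    return (not a) or b

def is_valid(premises, conclusion):
    # Valid means: the conclusion holds in every truth assignment
    # (rain, wet) in which all premises hold.
    for rain, wet in product([False, True], repeat=2):
        if all(p(rain, wet) for p in premises) and not conclusion(rain, wet):
            return False
    return True

def rule(rain, wet):
    # "If it rains, the pavement is wet."
    return implies(rain, wet)

arguments = {
    "modus ponens":             ([rule, lambda r, w: r],     lambda r, w: w),
    "modus tollens":            ([rule, lambda r, w: not w], lambda r, w: not r),
    "denying the antecedent":   ([rule, lambda r, w: not r], lambda r, w: not w),
    "affirming the consequent": ([rule, lambda r, w: w],     lambda r, w: r),
}

for name, (premises, conclusion) in arguments.items():
    verdict = "valid" if is_valid(premises, conclusion) else "fallacy"
    print(f"{name}: {verdict}")
```

Both fallacies fail on the same counterexample: a wet pavement without rain (for instance after street cleaning) is compatible with the rule.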

Machines don’t think like us..

State-of-the-art DNNs trained on ImageNet believe with 99.6% certainty that those are familiar objects.
