Fundamental Artificial Intelligence

Fundamental Artificial Intelligence: Machine Performance in Practical Turing tests
Huma Shah, Kevin Warwick, Ian M. Bland, Chris D. Chapman
School of Systems Engineering, The University of Reading, UK

ICAART 2014, Angers Saint Laud, 6-8 March

Overview
• Turing’s imitation game
• Position statement
• Turing’s two tests to practicalise his imitation game
• Experiment: tests’ strength comparison
• Results
• Purpose & Outcome

Turing 1947
• Lecture to London Mathematical Society:
– “… the machine must be allowed to have contact with human beings in order that it may adapt itself to their standards”
– “… game of chess may perhaps be rather suitable for this purpose”
– Sense of “fair play” – machine hidden so not judged on beauty or tone of voice


Turing 1948
• 1948 “first manifesto of AI” (Jack Copeland, 2004), Intelligent Machinery essay at NPL:
– Idea of intelligence is itself emotional rather than mathematical
– Believing in the possibility of making thinking machinery … It is possible to make machinery to imitate any small part of a man
– Learning of languages most impressive
– Analogy with human brain – guiding principle
– Suitable education

Turing 1950
• Section 2 Critique of the New Problem:
– “Question-answer method suitable for introducing almost any one of the fields of human endeavour”
– “May not machines carry out something which ought to be described as thinking but which is very different from what a man does?”
– “When playing the ‘imitation game’ the best strategy for the machine … is to try to provide the answers that would naturally be given…”

• Section 6 Contrary Views on the Main Question
– 6(3) The Mathematical Objection: “some questions to which it [machine] will either give a wrong answer, or fail to give an answer”
– “we too often give wrong answers to questions ourselves to be justified in being very pleased at such evidence of fallibility on the part of the machines”
– 6(4) Argument from Consciousness: “If [the machine’s] answers are satisfactory and sustained … [not] ‘an easy contrivance’ ”
– “Let us listen in to a part of such a viva voce:
Interrogator: Would you say Mr. Pickwick reminded you of Christmas?
Witness: In a way”

Turing 1951a
• Intelligent Machinery, A Heretical Theory
– “It is clearly possible to produce a machine which would give a very good account of itself for any range of tests, if the machine were made sufficiently elaborate”
– Orders of magnitude
– “Education process … Essential to production of a reasonably intelligent machine within a reasonably short space of time. The human analogy alone suggests this”


Turing 1951b
• Can Digital Computers Think?:
– “The whole thinking process is still rather mysterious to us, but I believe that the attempt to make a thinking machine will help us greatly in finding out how we think ourselves”


Turing 1952
• BBC radio discussion (anchor: Braithwaite; Turing, Jefferson, Newman):
– Turing: “I would like to suggest a particular kind of test … to see whether the machine thinks”
– Idea of the test: machine answers questions … “a considerable portion of a jury, who should not be expert about machines, must be taken in by the pretence”
– Thinking: “sort of buzzing … inside my head”
– Prediction: “At least 100 years” [before a machine does well in the imitation game]

Turing’s imitation game 1947-1952
• Two methods to practicalise:
• Simultaneous comparison test: 3 participants (judge + 2 hidden entities)
• Viva voce test: 2 participants (judge + 1 hidden entity)

First ever viva voce test
• Eliza was the first NLU programme, but PARRY, a programme simulating a schizophrenic patient, featured in the first ‘test’
• Heiser et al.’s 1979 study expanded on previous work of psychiatrist Colby et al.:
– Could 5 psychiatrists in a one-to-one situation distinguish between real paranoia (22-year-old in-patient) and simulated paranoia (PARRY)?
– Results were random

Simultaneous comparison
• Format of Kurzweil/Kapor 2001 wager on Turing test
• Conducted in the Loebner Prize from 2004; viva voce was implemented in Hugh Loebner’s earlier contests (see Shah & Warwick chapter on downward trend for machines in recent Loebner Prizes)


Position Statement
• AI is NOT hung up on TT: see Searle, Block, Hayes & Ford, and other “tedious granny objections” (Harnad, 2002)
• Turing’s imitation game fundamental for robotics (e.g. care robots need to talk to cared-for)
• In Turing’s imitation game the simultaneous comparison scenario is a more difficult test for the machine to achieve deception in than Turing’s viva voce scenario

2012 Bletchley Park Tests
• Implemented on 100th anniversary of Alan Turing’s birth: 23 June 2012
• Staged simultaneous comparison and viva voce side-by-side
• Turing’s 5 mins (‘thin slice’ / first impression)
• Tests conducted in English
• Unrestricted questions (whatever judge / interrogator wished to ask)
• Message-by-message screen display

Participants
• Machines: recruited based on performance in previous Turing tests
• Humans (M/F; adult/teenagers; native/non-native):
– Turing’s “average interrogators” to question hidden entities
– Confederates: hidden foils for the machines
• Recruitment: general public through press release, calls on social media (Facebook, Twitter, blogs), academia (philosophers, computer scientists) and invites:
– 30 human interrogator-judges
– 25 hidden humans
– 5 elite machines

Method
• Tests: 30 conducted across 5 Sessions
• Each Session had 6 ‘Chat-Screens’ – 6 rounds
• Each Interrogator-Judge sat at a Terminal and assessed 6 ‘chat screens’ (screen changed from split to single between rounds):
– 4 hidden pairs (2 pairs = machine-human; one pair each of 2 machines, and 2 humans)
– 2 single hidden entities (one hidden single = machine, other human)
– End of each Session judge and hidden humans replaced with fresh participants
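
The test counts implied by this design can be checked with a short sketch. It assumes that every one of the 30 interrogator-judges saw the full six-screen layout listed above and that judges were spread evenly across the 5 sessions; the variable names and screen labels are illustrative and do not come from the original study.

```python
# Sketch of the test counts implied by the Method slide.
# Assumption: all 30 judges saw the same six-screen layout,
# spread evenly over the 5 sessions.
SESSIONS = 5
JUDGES = 30  # "30 human interrogator-judges" on the Participants slide

# Screens assessed by each judge (labels are hypothetical, counts are from the slide).
SCREENS_PER_JUDGE = {
    "pair: machine-human": 2,
    "pair: machine-machine": 1,
    "pair: human-human": 1,
    "single: machine": 1,
    "single: human": 1,
}

judges_per_session = JUDGES // SESSIONS
rounds_per_judge = sum(SCREENS_PER_JUDGE.values())
totals = {kind: count * JUDGES for kind, count in SCREENS_PER_JUDGE.items()}

print(f"judges per session: {judges_per_session}")
print(f"rounds per judge: {rounds_per_judge}")
for kind, total in totals.items():
    print(f"{kind}: {total} tests")
```

Under these assumptions the layout yields 60 machine-human simultaneous comparison tests and 30 single-machine viva voce tests, which agrees with the numbers of tests reported in the results.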

Scoring Method • End of each round Interrogator Judge completed appropriate score sheet:


Results 1: Turing’s Imitation Game

Strength of Turing’s two tests

Viva voce (one-to-one direct tests)
– Number of tests: 30
– Number of deceptions: 5
– Type of error: Eliza effect
– Total inaccurate classification: 7 (twice machine classified as Unsure)
– % inaccurate classification: 23.33%

Simultaneous comparison (machine-human tests)
– Number of tests: 60
– Number of deceptions: 8
– Type of error: 4 tests: both human; 4 tests: machine considered human & human considered machine
– Total inaccurate classification: 8
– % inaccurate classification: 13.33%

Machines deceived judges more in viva voce test when no direct comparison with a human
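
The percentages reported for the two tests follow directly from the counts: 7 inaccurate classifications in 30 viva voce tests and 8 in 60 simultaneous comparison tests. The sketch below only reproduces that arithmetic; the dictionary layout and function name are illustrative assumptions, not part of the study.

```python
# Counts as reported on the Results 1 slide.
RESULTS = {
    "viva voce": {"tests": 30, "inaccurate": 7},
    "simultaneous comparison": {"tests": 60, "inaccurate": 8},
}


def inaccuracy_rate(inaccurate: int, tests: int) -> float:
    """Percentage of tests with an inaccurate classification, to 2 decimals."""
    return round(100 * inaccurate / tests, 2)


rates = {
    name: inaccuracy_rate(row["inaccurate"], row["tests"])
    for name, row in RESULTS.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f}%")
```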

Results 2

Simultaneous test more difficult for machine to achieve deception in

Issues with viva voce
• As Heiser et al. (1979) found in their PARRY study:
– order of the judge-interrogations, machine first or human first, is an important factor
• Our tests showed that a judge faced with a human first in viva voce tests could make an identification error, classifying the human as a machine, because the judge assumed a higher level of machine talk

Issues with simultaneous comparison
• Match for the machines:
– Machines imitating ‘adult conversation’
– Two machines do not pretend to be human, overt robot personalities
– One machine imitates specific child personality (non-native English)
– One is a wacky dialogist (canned-replier)


Simultaneous test from Turing100

LEFT: Judge J19 with Hidden-Entity: human or machine?

[12:23:13] Judge: hi there
[12:23:18] Entity: Howdy pal! Today is a kind of never-ending talk-show, I'd say!
[12:23:38] Judge: yes. nice!
[12:23:43] Entity: So we are on a same page. Oooh. Anything else?
[12:24:13] Judge: what page? FB?
[12:24:19] Entity: Huh? Could you tell me what are you? I mean your profession.
[12:24:56] Judge: i am not allowed to say that. what are you doing for living?
[12:25:09] Entity: I'm wondering if Doctor Loebner is a human. I suspect that he is a cyborg - and he's playing games with us by running this competition.
[12:25:46] Judge: slow down a bit. where do you put your money?
[12:25:52] Entity: Money-money-money, not so funny... Do you want to win a million?
[12:26:23] Judge: not really, are you a saver or spender?
[12:26:31] Entity: Hmmm... I would assume otherwise. And I forgot to ask you where you are from...
[12:26:56] Judge: I am from bristol.
[12:27:02] Entity: Tell me more about Bristol. I like to learn about different places!
[12:27:22] Judge: ok. but please let me know more about you?
[12:27:38] Entity: Oh, if I tell you about myself, I'll definitely be dishonest:-) Well, I am a son of Rockefeller, I've arrived here in the Limo... I know you are supposed to trick me.
[12:28:07] Judge: wow. go on
[12:28:14] Entity: Yes, everyone would be amazed hearing that. I know you are supposed to trick me

RIGHT: Judge J19 with Hidden-Entity : human or machine?

[12:23:23] Judge: good evening
[12:23:28] Entity: Wonderful evening isn't it?
[12:23:58] Judge: yes indeed. it is raining over there?
[12:24:02] Entity: What time is it anyway?
[12:24:24] Judge: min-night?
[12:24:28] Entity: Ugh. I need more sleep.
[12:25:18] Judge: only joking? do you like jokes?
[12:25:24] Entity: Alright then.
[12:26:05] Judge: do you like diamond jubilee?
[12:26:13] Entity: lol.
[12:26:37] Judge: why lol?
[12:26:44] Entity: oh, I get it. :)
[12:27:48] Judge: I really like you. do you think machine can fell in love?
[12:27:55] Entity: I like the Man Machine album.


Viva Voce from Turing100

[12:59:25] Judge: HI, how are you doing?
[12:59:35] Entity: I am very well
[12:59:47] Judge: Good. What are your plans this weekend?
[13:00:01] Entity: well not really a lot
[13:00:12] Judge: Will you watch the game tomorrow?
[13:00:54] Entity: I think I will not.... I have to spend some time with friends I have not seen so far...
[13:01:34] Judge: Where are your friends from?
[13:02:22] Entity: they are from all over the world. What about you? any plans for the weekend?
[13:03:03] Judge: I'm staying with my godmother tonight. Then back to London tomorrow to watch the football. Who do you think will win?
[13:04:10] Entity: I am not really sure who is playing against whoom. Is it UK with someone?

Purpose & outcome
• Conversations/transcripts from tests more useful than statistics: show how far machines are from providing “satisfactory and sustained” answers to any questions
• Practicalising Turing’s tests useful exercise – care & other robots will better serve humans if they communicate in humanlike discourse
• Repeating experiment in June 2014 – get involved: [email protected]

Turing’s future
• Machine Eugene Goostman
– Used as speech engine in building bionic man (Channel 4 Documentary): http://www.channel4.com/programmes/how-to-build-a-bionic-man
• “We can only see a short distance ahead but we can see plenty there that needs to be done” (Turing 1950)

Some References
• Warwick, K. & Shah, H. (Forthcoming). Applying Turing’s Imitation Game. In Wilson, R., Bowen, J., Copeland, J. and Sprevak, M. (Eds) The Turing Guide. Oxford University Press.
• Warwick, K. & Shah, H. 2014. Effects of Lying in Practical Turing tests. AI & Society. DOI: 10.1007/s00146-013-0534-3
• Warwick, K. & Shah, H. 2013. Good Machine Performance in Practical Turing tests. IEEE Transactions on Computational Intelligence and AI in Games. DOI: 10.1109/TCIAIG.2013.2283538
• Shah, H. 2013. Conversation, Deception and Intelligence: Turing’s question-answer Game. In Cooper, S.B. & van Leeuwen, J. (Eds) Alan Turing: His Work and Impact, pp. 614-620. Elsevier.
• Shah, H., Warwick, K., Bland, I.M., Chapman, C. & Allen, M. 2012. Turing’s Imitation Game: Role of Error-making in Intelligent Thought. Turing in Context II, Brussels, 10-12 Oct.
• Shah, H. & Warwick, K. 2009. Emotion in the Turing test: a downward trend for machines in recent Loebner Prizes. Chapter 17 in Vallverdú, J. & D. Casacuberta (Eds) Handbook of Research on Synthetic Emotions and Sociable Robotics: New Applications in Affective Computing and Artificial Intelligence, pp. 325-349. Information Science Reference.
• Copeland, B.J. (Ed.) 2004. The Essential Turing. OUP.


Thank you for listening
Turing’s papers and commentaries in Alan Turing: His Work and Impact

Winner in categories: a) Physical Sciences b) Mathematics and c) RR Hawkins Prize in 2013 Prose Awards: http://www.proseawards.com/current-winners.html


Thank you also to
• ICAART 2014
• Reviewers: for very useful questions, comments and suggestions to improve paper
• Marc Allen: for MATT communications protocol (computer programme facilitating interaction between Judge/Interrogator terminals and hidden terminals)

