A Clojure Fusion of Symbolic and Data-Driven AI

A CLOJURE FUSION OF SYMBOLIC AND DATA DRIVEN AI
Huahai Yang, Ph.D.
Juji, Inc.
Nov. 29, 2018

Huahai Yang (huahaiy)
Psychologist
Computer scientist
Coding in Clojure since 2012
Cofounder & CTO, Juji, Inc.

© 2018 Juji, Inc.

JUJI BUILDS CHATBOT PLATFORM
A challenging problem

JUJI BUILDS CHATBOT PLATFORM
It is not hard to pass the Turing Test; it was done in the 1970s
PARRY: 33 psychiatrists could not tell it from paranoid patients

PARRY SAMPLE

JUJI BUILDS CHATBOT PLATFORM
It is harder to build useful chatbots that
  interview people to collect feedback
  receive visitors to sites
  screen job candidates
  check up on trainees

JUJI BUILDS CHATBOT PLATFORM
When used for surveys
  2X completion rate
  76% better quality responses

"the whole time i was doing this survey it felt like i was talking to a friend and sharing the same common ground. i loved that i wish it didnt have to end"

"very dynamic and very fluid conversation you have great quality thanks"

JUJI APPROACH: SYMBOLIC + DATA DRIVEN
Symbolic system as the bones
Data-driven component as the flesh
Done in a Clojure DSL, REP

AI SUMMER IS BACK
Watson beats humans at Jeopardy!
AlphaGo beats humans at Go
Many AI assistants on phones and in homes
Many commercial products in enterprises

RISE OF DEEP LEARNING (DL)
Recently hugely successful
For many: AI = DL

DL SOLVES PERCEPTION PROBLEM

"Perception is the organization, identification, and interpretation of sensory information in order to represent and understand the presented information, or the environment."

DL maps raw data (pixels, text characters) into:
  known labels (classification)
  desirable numbers (regression)
  fixed length vectors (embedding)

PERCEPTION FEELS LIKE INTELLIGENCE

Reporter: "How many moves do you see ahead while playing chess?"
Capablanca: "Only one, but it's always the right one."

PERCEPTION IS NOT YET INTELLIGENCE

INTELLIGENCE CANNOT BE SOLVED WITH DATA ALONE
Bottom-up
  data driven
  sub-symbolic

Top-down
  goal/hypothesis driven
  symbolic (human-readable)

TIME TO BRING BACK SYMBOLIC AI
(Semi-)solving perception lays the foundation for symbolic AI
The same forces leading to the rise of DL apply to symbolic AI
  More powerful hardware helps graph search
  More abundant realistic data helps knowledge base construction
  Better software tools and practices

TRADE-OFFS
Data driven
  Easy to defeat/abuse by adversaries (e.g. Tay)
  Hard to debug and bend to the creator's will
  By design, unlikely to be fixable

Symbolic
  Easy to build rigid/brittle systems
  Hard to develop; hard for humans to think like machines
  In principle fixable; in practice, not so easy

DL IS FOOLED

GIBBON

TWO ROADS TO INTEGRATION
Implement symbolic phenomena with a sub-symbolic system
  Mimic the brain
  Not yet practical
Symbolic + sub-symbolic
  Engineer's method
  Practical today

SYMBOLIC + DATA DRIVEN
Symbolic system as the bones
  for its potential for growth and adaptability, despite the rigidity
DL/ML components as the flesh
  for their flexibility and ease of development, despite the obscurity

JUJI ARCHITECTURE

EDN DATA ALL THE WAY

1. User selects a chat template
2. User configures the chat in the GUI
3. A script is generated from the GUI
4. Chat: the script compiles and runs
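To make the "data all the way" idea concrete, here is a minimal sketch in plain Clojure, not Juji's actual pipeline. It assumes a GUI configuration stored as an EDN string and turns it into a script form as ordinary data. The config keys `:topic-name` and `:greeting` and the function `config->script` are hypothetical names chosen for illustration.

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical GUI configuration, serialized as EDN text.
(def gui-config-edn
  "{:topic-name hello-world :greeting \"Hello world!\"}")

(defn config->script
  "Builds a deftopic form as plain data from a parsed configuration map.
  The form is only constructed here, never evaluated."
  [{:keys [topic-name greeting]}]
  (list 'deftopic topic-name [] [] [greeting]))

(def generated-script
  (config->script (edn/read-string gui-config-edn)))
```

Because both the configuration and the generated script are ordinary Clojure data, each stage can be inspected, stored, or transformed with the usual sequence and map functions.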

DEFTOPIC: THE BUILDING BLOCK
Topic: a set of rules

    (deftopic hello-world   ; topic name
      []                    ; parameters

      []                    ; trigger
      ["Hello world!"])     ; action

PRODUCTION RULE
Rule: trigger (if), action (then) and associated followup topics
Followup topics are primed when a rule fires

    [:1 hello hi hey howdy]   ; alternative pattern trigger
    ["Nice to meet you!"]     ; action: a string output
    (talk-about-wheather)     ; followup topic invocation
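The trigger/action/followup structure can be mimicked with plain Clojure data. This is a minimal sketch, not REP itself: the map keys `:trigger`, `:action` and `:followups`, the state keys `:output` and `:primed`, and the functions `tokenize` and `fire-rule` are all hypothetical, and the trigger is simplified to a set of alternative tokens.

```clojure
(require '[clojure.string :as str])

;; Hypothetical plain-data rule; REP's internal representation is not shown in the talk.
(def greeting-rule
  {:trigger   #{"hello" "hi" "hey" "howdy"}   ; any one of these tokens fires the rule
   :action    "Nice to meet you!"             ; string output
   :followups [:talk-about-weather]})         ; topics primed after firing

(defn tokenize
  "Lower-cases the input and splits it into word tokens."
  [s]
  (re-seq #"[a-z']+" (str/lower-case s)))

(defn fire-rule
  "Returns the updated state if the rule's trigger matches the input, else nil.
  Firing appends the action to :output and the followups to :primed."
  [rule state input]
  (when (some (:trigger rule) (tokenize input))
    (-> state
        (update :output conj (:action rule))
        (update :primed into (:followups rule)))))
```

Calling `(fire-rule greeting-rule {:output [] :primed []} "Hey there")` yields a state whose `:primed` vector now holds the followup topic, which is the sense in which a fired rule "primes" what comes next.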

TOPIC COMPOSITIONS
A topic may include rules of other topics

    (deftopic greetings
      []
      {:include-before [(morning-greetings)
                        (evening-greetings)]}

      [:1 hello hi hey howdy]
      ["Hello"])
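One way to picture topic composition is as concatenation of rule lists. The sketch below is plain Clojure under two assumptions that the slide does not confirm: that a topic can be modelled as a map with `:rules` and `:include-before`, and that included rules are tried before the topic's own rules. The functions `effective-rules` and `respond` are hypothetical.

```clojure
;; Hypothetical representation: a topic is a map with its own :rules and an
;; optional :include-before list of other topics.
(def morning-greetings
  {:rules [{:trigger #{"morning"} :action "Good morning!"}]})

(def evening-greetings
  {:rules [{:trigger #{"evening"} :action "Good evening!"}]})

(def greetings
  {:include-before [morning-greetings evening-greetings]
   :rules          [{:trigger #{"hello" "hi" "hey" "howdy"} :action "Hello"}]})

(defn effective-rules
  "Rules of the included topics (recursively), followed by the topic's own rules."
  [topic]
  (concat (mapcat effective-rules (:include-before topic))
          (:rules topic)))

(defn respond
  "Action of the first rule whose trigger contains one of the tokens, or nil."
  [topic tokens]
  (some (fn [{:keys [trigger action]}]
          (when (some trigger tokens) action))
        (effective-rules topic)))
```

With this ordering, an input containing both "hello" and "morning" is answered by the more specific included rule rather than by the generic greeting.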

PATTERNS
Token based regular expressions

    [I love :1-. pizza]          ; sequence with a wildcard
    [love ? [:1- pizza bacon]]   ; nested, inner alternative
    [:0. "I love pizza"]         ; start and string pattern
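The idea of matching on tokens rather than characters can be sketched in a few lines of plain Clojure. This is a simplified illustration with its own notation, not an implementation of REP's operators (`:1-.`, `?`, `:0.`), whose exact semantics the slides do not spell out. Here a pattern element is assumed to be a literal token, a set of alternative tokens, or `:any` for exactly one arbitrary token.

```clojure
(defn element-matches?
  "Does one pattern element accept one token?"
  [element token]
  (cond
    (= element :any) true                       ; wildcard: any single token
    (set? element)   (contains? element token)  ; alternatives
    :else            (= element token)))        ; literal token

(defn matches-at?
  "Does the pattern match the tokens starting at the first token?"
  [pattern tokens]
  (and (<= (count pattern) (count tokens))
       (every? true? (map element-matches? pattern tokens))))

(defn match?
  "Does the pattern match a contiguous run of tokens anywhere in the input?"
  [pattern tokens]
  (boolean (some #(matches-at? pattern %)
                 (take-while seq (iterate rest tokens)))))
```

For example, `(match? ["love" #{"pizza" "bacon"}] ["we" "love" "bacon" "too"])` succeeds because the two-element pattern lines up with the second and third tokens.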

ML BASED TAG AND CLASS PATTERNS
Tags for annotating text
Keywords as placeholders of content classes

    [he #pos/verb dog tree]     ; parts of speech tag
    [she love :phrase/NP]       ; noun phrase class
    [I work at :entity/org]     ; organization entity class
    [did in :entity/duration]   ; duration entity class

DL/ML FOR CLASSIFICATION FUNCTIONS
Neural networks are universal function approximators and should be used as such

    [programming
     (input-in-this-category? "self-intro-relevance" 0.7)]
    ["You must be a smart person"]

Patterns are ANDed together in a rule
Rules are ORed together, so a topic matches a DNF (disjunctive normal form)
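The AND-within-a-rule, OR-across-rules structure maps directly onto `every?` and `some`. The following plain-Clojure sketch assumes each pattern is a predicate over a preprocessed input map. The helper names `mentions` and `category-score-above`, and the input keys `:tokens` and `:score`, are hypothetical stand-ins for a keyword pattern and a classifier call.

```clojure
(defn rule-matches?
  "A rule matches when ALL of its pattern predicates accept the input (AND)."
  [patterns input]
  (every? #(% input) patterns))

(defn topic-matches?
  "A topic matches when ANY of its rules matches (OR), giving a DNF overall."
  [rules input]
  (boolean (some #(rule-matches? % input) rules)))

;; Hypothetical stand-ins: a keyword test and a precomputed classifier score.
(defn mentions [word]
  (fn [input] (contains? (:tokens input) word)))

(defn category-score-above [threshold]
  (fn [input] (> (:score input) threshold)))

(def self-intro-rules
  [[(mentions "programming") (category-score-above 0.7)]
   [(mentions "art")         (category-score-above 0.7)]])
```

Treating the classifier as just another predicate is what lets a learned model and a hand-written keyword pattern sit side by side in the same rule.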

DL FOR SIMILARITY BASED MATCHES
Calculate similarity using TensorFlow sentence embedding

    [(> (max-similarity-score
          ["What does your product cost?"
           "How much does your product cost?"
           "What's the price of your product?"
           "How expensive is your product?"])
        0.9)]
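A similarity-based trigger of this kind can be sketched in plain Clojure as "maximum similarity against a set of example embeddings, compared to a threshold". The sketch assumes cosine similarity, which the slide does not specify, and uses toy three-dimensional vectors in place of real sentence embeddings. The names `cosine-similarity`, `max-similarity`, `pricing-examples` and `pricing-question?` are hypothetical.

```clojure
(defn dot
  "Dot product of two numeric vectors of equal length."
  [a b]
  (reduce + (map * a b)))

(defn cosine-similarity
  "Cosine of the angle between two non-zero vectors of equal length."
  [a b]
  (/ (dot a b)
     (* (Math/sqrt (dot a a)) (Math/sqrt (dot b b)))))

(defn max-similarity
  "Highest cosine similarity between the input vector and any example vector."
  [input-vec example-vecs]
  (apply max (map #(cosine-similarity input-vec %) example-vecs)))

;; Toy 3-dimensional vectors standing in for real sentence embeddings.
(def pricing-examples [[1.0 0.0 0.0] [0.8 0.6 0.0]])

(defn pricing-question?
  "True when the input is closer than 0.9 to at least one example."
  [input-vec]
  (> (max-similarity input-vec pricing-examples) 0.9))
```

Listing several paraphrases as examples widens the region of embedding space that fires the rule, while the threshold controls how far an input may drift from them.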

ROLES
ML/DL models cover broad cases
Symbolic covers specific cases
  misses by DL/ML
  detailed refinement

    [(input-in-this-category? "self-intro-relevance"
                              0.7)]
    ([programming]
     "You must be a smart person."

     [art]
     "I enjoy art too."

     "Thank you for the introduction.")
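The division of labor on this slide reads as a broad ML gate followed by specific symbolic refinements and a generic fallback. A minimal plain-Clojure sketch of that reading follows. The function `respond-to-intro` is hypothetical, and `score` stands in for a classifier's confidence that the input is a self-introduction.

```clojure
(defn respond-to-intro
  "Broad ML gate first, then specific symbolic rules, then a generic fallback.
  Returns nil when the classifier score does not clear the 0.7 threshold."
  [score tokens]
  (when (> score 0.7)
    (cond
      (contains? tokens "programming") "You must be a smart person."
      (contains? tokens "art")         "I enjoy art too."
      :else                            "Thank you for the introduction.")))
```

The model decides whether the input belongs to the category at all, and the keyword branches only choose among responses once that decision has been made.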

META-CIRCULARITY
Turn a topic into a function, then use the function in another topic

    [(create-topic-func
       custom/why-u-here :extract-why-u-here)
     "I see, you are here to "
     (exec-topic-func :extract-why-u-here)]

AUTOMATIC DIALOG MANAGEMENT
REP is a declarative language
System pushes topics around
  Agenda queue
  Ad-lib queue
  Exception queue
  Main stack

CONCLUSION
Symbolic + data driven = practical AI today
Clojure is a great choice for doing so
  Lisp was and still is the language of symbolic AI
  Data orientation of Clojure makes it easy to integrate both

Huahai Yang
Juji, Inc.
https://juji.io