Classification of Chatbot Inputs

Andreas Stöckl, 2017-07-24


Contents

1 Introduction
  1.1 A first small Example
  1.2 A larger testset of chat-inputs
2 Pre Processing text
  2.1 Transforming text
  2.2 Constructing vectors
  2.3 Calculating distances
  2.4 Modelling with bigrams
3 Visualisation
  3.1 The small example
  3.2 Visualisation of the large dataset
4 Supervised Learning
  4.1 k nearest neighbors
  4.2 Naive Bayes
  4.3 Support vector machines
  4.4 Neural nets
5 Machine Learning APIs
  5.1 API.AI


Chapter 1

Introduction

Chat bots are pieces of software that enable a user to communicate with a software system in natural language. This communication can take place by typing text on a keyboard or through a speech-recognition system. The user writes short sentences, and the system has to understand the intent of the user's command in order to give the right answer. Machine learning methods can be used to train the chat-bot to understand the inputs. For example, suppose we want to construct a bot service for Facebook Messenger that provides information from various services, such as the weather or the arrival of the next bus at our station. The user may write “Where is the next bus for me?” or “How is the weather tomorrow?”. The system has to detect the intent of the user and then call the right web service for an answer.
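The two-step flow described here (detect the intent, then call the matching service) can be sketched in a few lines of R. This is only an illustration: `detect_intent`, `handlers` and `answer` are hypothetical names, the keyword rule is a stand-in for the trained classifiers discussed in later chapters, and the handlers return a string instead of calling a real web service.

```r
# Hypothetical keyword rule standing in for a trained classifier.
detect_intent <- function(text) {
  text <- tolower(text)
  if (grepl("bus", text, fixed = TRUE)) {
    "bus"
  } else if (grepl("weather", text, fixed = TRUE)) {
    "weather"
  } else {
    "unknown"
  }
}

# Hypothetical handlers; a real bot would call a web service here.
handlers <- list(
  bus     = function() "calling the bus timetable service",
  weather = function() "calling the weather forecast service",
  unknown = function() "asking the user to rephrase"
)

# Detect the intent first, then dispatch to the matching handler.
answer <- function(text) {
  intent <- detect_intent(text)
  handlers[[intent]]()
}

answer("Where is the next bus for me?")
answer("How is the weather tomorrow?")
```

A keyword rule like this fails on inputs such as “When can I go home?”, which contain none of the keywords; this is the kind of case where a learned classifier is needed.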

Figure 1.1: Facebook Messenger

In the next chapters we present simple examples that help us to understand machine learning in the context of chat-bots. We will use various algorithms to show how a chat-bot determines whether the user wants information about the next bus, the weather, or something else. In the second chapter we show how the texts have to be processed for the machine learning methods. In the third chapter we visualize the data to get an idea of how the methods can work. Chapter 4 experiments with different methods of supervised learning for our test cases: “Nearest Neighbors” classification, probabilistic learning, neural networks and support vector machines. For the code examples we use R, a free software environment for statistical computing and graphics (http://www.r-project.org). A good introduction to machine learning can be found in (Andreas C. Müller, Sarah Guido, 2017), and implementations of the methods in R can be found in (Brett Lantz, 2013). Text mining in R is covered in (Ashish Kumar, Avinash Paul, 2016).

1.1 A first small Example

We use the following text data as a first example in this article. 10 phrases where the intent is to get information about the bus:

• When is our bus coming?
• When does the bus come to station 1?
• Is there still a city-bus today?
• When does a public transport go?
• I want a bus
• How many minutes does a bus arrive?
• When can I go home?
• How do I get home quickly?
• When is the scheduled bus in the evening?
• Will there be a bus in this hour?

And 10 requests for weather information:

• How will the weather be tomorrow?
• Do I need an umbrella tomorrow?
• Will it be warm tomorrow?
• What are the daily high temperatures?
• Do I need an ice scraper?
• Will the weather be nice tomorrow?
• How cold will it be tomorrow in the evening?
• Is this week a rainy day?
• Is it sunny tomorrow?
• What is the rain probability on Wednesday?

We use these sentences to show how the text has to be processed and, in combination with different machine learning approaches, to train a chat-bot.
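For experimenting along with the text, the 20 sentences can be entered in R as follows. This is a minimal sketch: the vector name `input` follows the name used in chapter 2, while `intent` and `small_example` are names assumed here for illustration.

```r
# The 20 example phrases: first 10 bus requests, then 10 weather requests.
input <- c(
  "When is our bus coming?",
  "When does the bus come to station 1?",
  "Is there still a city-bus today?",
  "When does a public transport go?",
  "I want a bus",
  "How many minutes does a bus arrive?",
  "When can I go home?",
  "How do I get home quickly?",
  "When is the scheduled bus in the evening?",
  "Will there be a bus in this hour?",
  "How will the weather be tomorrow?",
  "Do I need an umbrella tomorrow?",
  "Will it be warm tomorrow?",
  "What are the daily high temperatures?",
  "Do I need an ice scraper?",
  "Will the weather be nice tomorrow?",
  "How cold will it be tomorrow in the evening?",
  "Is this week a rainy day?",
  "Is it sunny tomorrow?",
  "What is the rain probability on Wednesday?"
)

# Class labels in the same order as the phrases (assumed name).
intent <- factor(rep(c("bus", "weather"), each = 10))

# Keep texts and labels together in one data frame (assumed name).
small_example <- data.frame(text = input, intent = intent,
                            stringsAsFactors = FALSE)

# Number of phrases per intent.
table(small_example$intent)
```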

1.2 A larger testset of chat-inputs

After looking at the different methods with the small example of the last section, we will test the algorithms with a more challenging test set. Here we take 111 text inputs with 7 different user intents. In the examples the users request:

• bus information (bus - 13 cases)
• weather forecast (weather - 17 cases)
• hotel booking (hotel - 15 cases)
• food delivery (food - 10 cases)
• smart home commands (home - 23 cases)
• TV program (TV - 11 cases)
• manage emails and phone calls (contacts - 22 cases)

As a first step, a chat-bot has to identify the intent of the user. We will use this data to train and test various machine learning methods for recognizing the user's intent. The chat-bot uses such methods and then processes the given data, for example by calling an appropriate web service to give an answer. The next three tables list the input texts and the intent of the user. In chapter 5 we use the platform API.AI for the classification of the inputs. It is a service for natural language processing owned by Google.


Table 1.1: Dataset of chat-inputs Part1

|    | Input.Text | Intent...Category |
|----|------------|-------------------|
| 1  | When is our bus coming | bus |
| 2  | When does the bus come to station 1? | bus |
| 3  | Is there still a city-bus today? | bus |
| 4  | When does a public transport go? | bus |
| 5  | I want a bus | bus |
| 6  | How many minutes does a bus arrive? | bus |
| 7  | When can I go home? | bus |
| 8  | How do I get home quickly? | bus |
| 9  | When is the scheduled bus in the evening? | bus |
| 10 | Will there be a bus in this hour? | bus |
| 11 | Can i go home in 30 minutes? | bus |
| 12 | Is there a bus at my station in 20 minutes? | bus |
| 13 | I am done with my work in 1 hour! | bus |
| 14 | How will the weather be tomorrow? | weather |
| 15 | Do I need an umbrella tomorrow? | weather |
| 16 | Will it be warm tomorrow? | weather |
| 17 | What are the daily high temperatures? | weather |
| 18 | Do I need an ice scraper? | weather |
| 19 | Will the weather be nice tomorrow? | weather |
| 20 | How cold will it be tomorrow in the evening? | weather |
| 21 | Is this week a rainy day? | weather |
| 22 | Is it sunny tomorrow? | weather |
| 23 | What is the rain probability on Wednesday? | weather |
| 24 | How was the temperature yesterday? | weather |
| 25 | Will i get wet in the evening? | weather |
| 26 | How hot will it be tomorrow? | weather |
| 27 | Will there be snow tomorrow? | weather |
| 28 | Start the weather app! | weather |
| 29 | Give me the weather of today! | weather |
| 30 | How is the temperature in paris? | weather |
| 31 | Remove the junkmails | contacts |
| 32 | Call the last number i called | contacts |
| 33 | Call my wife! | contacts |
| 34 | Send an e mail to the office. | contacts |
| 35 | Open my inbox | contacts |
| 36 | Go to the junk mail folder | contacts |
| 37 | Answer the last e-mail | contacts |
| 38 | Write a new mail | contacts |
| 39 | Create a new contact | contacts |
| 40 | Reply to johns email | contacts |


Table 1.2: Dataset of chat-inputs Part2

|    | Input.Text | Intent...Category |
|----|------------|-------------------|
| 41 | I want so see my inbox | contacts |
| 42 | Open the sent mails folder | contacts |
| 43 | Call my phone mailbox | contacts |
| 44 | Forward the last mail to mike | contacts |
| 45 | Check for new mails | contacts |
| 46 | Can i see my mail inbox? | contacts |
| 47 | Who sent the last email? | contacts |
| 48 | Delete the newest email | contacts |
| 49 | Answer the latest phone call | contacts |
| 50 | Give me the numer of unread mails! | contacts |
| 51 | Delete all messages from last week | contacts |
| 52 | Search for messages from horst | contacts |
| 53 | I need a hotel in vienna | hotel |
| 54 | Start the booking app | hotel |
| 55 | I want to book a hotel in july | hotel |
| 56 | I want to book for my next holyday | hotel |
| 57 | Search for a hotel in Rome for July 25 | hotel |
| 58 | Book hotel in Paris from 20 June to 2 July | hotel |
| 59 | Need a hotel nearby the trainstation | hotel |
| 60 | I like to book 5 star hotel in austria | hotel |
| 61 | I need a hotel for 3 nights from Sunday | hotel |
| 62 | I seach a hotel with free wi-fi | hotel |
| 63 | I am searching for a place to sleep in london. | hotel |
| 64 | I want a place for a family holiday | hotel |
| 65 | Where can i go with my family in summer? | hotel |
| 66 | I like all inclusive ressorts in greece | hotel |
| 67 | Are there some family hotels in korsika? | hotel |
| 68 | I want to order a pizza | food |
| 69 | Where can i get hamburgers at night? | food |
| 70 | Where can i order a pizza online? | food |
| 71 | Where is a chinese restaurant with delivery? | food |
| 72 | Order a salami pizza for me! | food |
| 73 | Is there a food delivery in this city? | food |
| 74 | Find an online pizza service for me! | food |
| 75 | Is there a fastfood delivery service? | food |
| 76 | I whould like eat fast food at home | food |
| 77 | I want to eat pizza at home! | food |
| 78 | Turn on the TV! | home |
| 79 | Switch on the Television! | home |
| 80 | I want to hear radio. | home |


Table 1.3: Dataset of chat-inputs Part3

|     | Input.Text | Intent...Category |
|-----|------------|-------------------|
| 81  | Turn the lights in the bathroom on | home |
| 82  | Turn off the lights in the kitchen | home |
| 83  | Put up the lights at 8 pm | home |
| 84  | Turn off the heating when I go out | home |
| 85  | Turn on the heating at 7 am on weekends | home |
| 86  | Put down the heating when I leave theb bathroom | home |
| 87  | Switch down heating for 20 degrees | home |
| 88  | Lock the front door! | home |
| 89  | Make sure the door is locked at 6 pm | home |
| 90  | Lock the back door from 11 pm till 7 am | home |
| 91  | Unlock the window when I leave the bathroom | home |
| 92  | Close the door when I go out | home |
| 93  | Switch off TV when I leave | home |
| 94  | Turn the computer off at 7 | home |
| 95  | Switch the light on in conference room | home |
| 96  | Start the video player in bedroom | home |
| 97  | Stop the air conditioner | home |
| 98  | Put the coffee machine in the lobby on | home |
| 99  | Start the washing machine in one hour | home |
| 100 | Stop the TV in 10 minutes | home |
| 101 | Show me the TV programm of today | TV |
| 102 | Are there some documentations at todays TV-Programm? | TV |
| 103 | What’s on channel 5? | TV |
| 104 | What’s on tv today? | TV |
| 105 | When does Big Bang Theory start? | TV |
| 106 | When does the movie Manhattan come back on TV? | TV |
| 107 | Is Silvester Stallone on tv today? | TV |
| 108 | Which films are on screen this evening? | TV |
| 109 | Are some sports on television tomorrow? | TV |
| 110 | List the shows from channel 4 from tomorrow! | TV |
| 111 | Show me todays films from all channels | TV |
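The label column of the three tables can be rebuilt in R from the class sizes given in section 1.2, which is a quick way to check that the counts add up to 111 rows. This is a sketch only; the names `classes`, `sizes` and `intent_large` are assumptions made for this example, and the class order follows the row order of the tables.

```r
# Intent classes in the order in which they appear in tables 1.1 to 1.3.
classes <- c("bus", "weather", "contacts", "hotel", "food", "home", "TV")

# Number of inputs per class as listed in section 1.2.
sizes <- c(13, 17, 22, 15, 10, 23, 11)

# One label per table row (assumed name).
intent_large <- factor(rep(classes, times = sizes), levels = classes)

# Total number of inputs and inputs per intent.
length(intent_large)
table(intent_large)

# Spot checks against the tables: rows 53, 78 and 101
# start the hotel, home and TV blocks.
as.character(intent_large[c(53, 78, 101)])
```

The classes are not balanced: the largest one (home, 23 cases) is more than twice the size of the smallest (food, 10 cases), which is worth keeping in mind when comparing classifiers on this data.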

Chapter 2

Pre Processing text

2.1 Transforming text

Before we can use and test the different machine learning methods, we have to apply some processing steps to the text inputs. Let us have a look at the first five texts in our list from the example of chapter 1. See table 2.1 for the results. The first row contains the original five texts. The first step in preparing the texts is to split them into words, convert them to lowercase and remove punctuation, numbers and white space. The second row of the table shows the result of these steps. The number of the station and the question marks are removed, and city-bus is treated as one word, “citybus”. Some words do not contribute much information to the whole text. These words are called stopwords. You find examples of such words in table 2.2. In our preprocessing these words are removed from the texts, as you can see in row 3 of the table. In the last step the words are transformed by stemming. Stemming is the process of reducing inflected (or sometimes derived) words to their word stem, base or root form. In our example “coming” is reduced to “come”.

Implementation in R: We use the package “tm”. The input texts are stored in a vector “input”.

library(tm)
input_corpus