The Effects of Task Contexts on Decision-Making between People and Computers

Ya'akov (Kobi) Gal with Avi Pfeffer, Barbara Grosz, Stuart Shieber, Alex Allain
Division of Engineering and Applied Sciences, Harvard University

Motivation

Computer agents and people are making decisions together in task settings (e.g., eBay proxies, military exercises, scheduling). Task settings include
• a relationship between goals, tasks, and resources
• multiple decision-makers and dependency relationships (e.g., eBay proxies)

How do task settings affect the performance of computer agents that interact with people?
Colored Trails [Grosz and Kraus '04]

A formalism for investigating decision-making in task settings. CT is a 2-player computer game:
• includes a board of colored squares, one of which is the "goal" square
• players are allocated colored chips; they surrender chips of the appropriate color to move around the board
• the proposer player can suggest an exchange to the responder player, who can accept or reject it
• score depends on individual performance
Why Colored Trails?

Provides an analogy for task settings in the real world:
• squares represent tasks; chips represent resources; getting to the goal equals task completion
• compact representation of a large strategy set
Abstracts away game complexity, not decision-making complexity.
Flexible (e.g., dependency relationships, incomplete information, cooperative vs. competitive scenarios).
Enables both people and computers to play (through a GUI and an API).

Task Abstraction Continuum
[Figure: a continuum of task abstraction, ranging from fully specified domains (e.g., RoboCup, diplomacy), through Colored Trails, to payoff tables and decision trees (e.g., the prisoner's dilemma):

        C      D
  C   (2,2)  (0,3)
  D   (3,0)  (1,1)  ]
The Approach

Compare human behavior and a computational model of human behavior in task vs. non-task settings.
Criteria of evaluation are defined in terms of social factors (e.g., helpfulness, competitiveness).

Experimental Set-up

Task context: used CT. Table context: used a payoff matrix. The same decision-making situation was used in both the task and table contexts.
Subject Manipulation

Subjects were divided into two rooms and played a series of 30 games that varied in
• dependency relationships
• board/chip layout
The task-setting analogy defined the score. No mention was made of tasks in the table condition.

[Figure: experimental set-up. Proposers (P) and responders (R) at terminals in two rooms are paired into games (r1,1 ... r2,2) through a central server.]

Analyzing Behavior
Effect of Context on Proposer Behavior

Characteristics of an offer:
• no-negotiation alternative
• proposed outcome
• helpful vs. selfish
• competitive vs. social

Offer benefit to   Table   Task
Responder            36     48
Proposer             98     82

Task proposers are
• less selfish
• more helpful
• less competitive
than table proposers.
Effect of Context on Performance

Performance   Table   Task
Responder       41     56
Proposer        86     80

No significant difference in proposer performance between conditions.
The responder is significantly more successful in the task condition.
Overall, combined performance was higher in the task condition.
Number of NE offers: Task 13 (15%), Table 57 (59%)
To compare the extent to which the exchanges made by proposers in the two types of contexts differed from the NE exchange, we plotted the average benefit offered by NE exchanges and by proposed exchanges for both task and table conditions, as shown in Figure 4.
Comparison with Game-Theoretic Offers

Nash equilibrium for CT: propose the offer that is best for the proposer given that it is beneficial for the responder.
15 Nash equilibrium offers in the task context vs. 57 in the table context.
No significant difference between table proposals and the Nash equilibrium.

[Fig. 4: Benefit from proposed exchanges vs. NE exchanges, as (proposer benefit, responder benefit): Task exchange (82.3, 47.6), Table exchange (98, 36), NE exchange (100.24, 37.21).]
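A minimal sketch of how such a Nash equilibrium offer could be selected, assuming each candidate exchange is summarized as a (proposer benefit, responder benefit) pair; the function name and the example values are illustrative, not part of the CT API:

```python
# Sketch: the proposer picks the exchange that maximizes its own benefit,
# subject to the responder being no worse off than the no-negotiation
# alternative (the "beneficial for responder" constraint above).

def ne_offer(candidates, nn_responder):
    """candidates: list of (proposer_benefit, responder_benefit) tuples."""
    acceptable = [c for c in candidates if c[1] >= nn_responder]
    if not acceptable:
        return None  # no exchange benefits the responder; keep the status quo
    return max(acceptable, key=lambda c: c[0])

offers = [(120, 20), (98, 36), (82, 48), (60, 70)]
print(ne_offer(offers, nn_responder=30))  # -> (98, 36)
```

The offer (120, 20) is excluded because it leaves the responder below the no-negotiation score of 30; among the rest, (98, 36) maximizes the proposer's benefit.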
Formalize a social utility function that depends on social preferences, such as individual benefit, social welfare, and competitiveness. Build a model of human play that incorporates players' social utility.

Social Utility for Responder
The difference between the average benefit to responders from the NE offer and the average proposed exchange was close to zero in the table condition, and large and positive in the task condition (t-test, p < 0.05). Similarly, the difference between the benefit to proposers from the NE offer and the average proposed exchange was close to zero in the table condition, and large and negative in the task condition.
Features

Given a game and proposal c_j, the features x_j are the responder's social preferences:
• Selfishness:      x_{j,1} = PO_R^j - NN_R
• Social welfare:   x_{j,2} = (PO_R^j - NN_R) + (PO_P^j - NN_P)
• Inequality:       x_{j,3} = PO_R^j - PO_P^j
• Competitiveness:  x_{j,4} = (PO_R^j - NN_R) - (PO_P^j - NN_P)
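These features can be computed directly from the proposed-outcome (PO) and no-negotiation (NN) scores of the responder and proposer; a minimal sketch, with all scores assumed to be plain numbers and the example values illustrative:

```python
def social_features(po_r, po_p, nn_r, nn_p):
    """Responder's social preference features for a proposal.
    po_* = proposed-outcome score, nn_* = no-negotiation score,
    for the responder (r) and proposer (p)."""
    selfishness     = po_r - nn_r
    social_welfare  = (po_r - nn_r) + (po_p - nn_p)
    inequality      = po_r - po_p
    competitiveness = (po_r - nn_r) - (po_p - nn_p)
    return selfishness, social_welfare, inequality, competitiveness

print(social_features(po_r=48, po_p=82, nn_r=30, nn_p=50))  # -> (18, 50, -34, -14)
```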
The Effect of Context on Predictive Models

[Chart: learned weights for selfishness, social welfare, and competitiveness, by condition.]
Modeling the Responder

• Given exchange x, the social utility for the Deliberator, u(x), is a weighted sum of its social preferences.
• Probability of acceptance of exchange x:
  P(accept | x) = 1 / (1 + e^{-u(x)})
• The utility also measures the degree to which a decision is preferred.

PO = proposed outcome; NN = no-negotiation alternative
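A minimal sketch of this acceptance model, assuming the features and weights are passed as plain sequences; the numeric values below are illustrative, not learned weights from the study:

```python
import math

def acceptance_probability(features, weights):
    """P(accept | x) = 1 / (1 + exp(-u(x))), where u(x) is a weighted
    sum of the social preference features."""
    u = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-u))

# Illustrative feature vector and weights: u = 0.5 + 0.5 - 0.1 = 0.9.
p = acceptance_probability(features=[1.0, 2.0, -1.0], weights=[0.5, 0.25, 0.1])
print(p > 0.5)  # -> True: positive utility means acceptance is more likely
```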
Learned weights:

Condition   Selfishness   Social Welfare   Competitiveness
Task            5.20            3.2              0.40
Table           8.20            1.3              8
Defining Social Preferences in CT

Scoring rule:
• 100-point bonus for getting to the goal
• 10-point bonus for each chip left at the end of the game
• 15-point penalty for each square in the shortest path from the end position to the goal
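The scoring rule above can be sketched directly; the function signature is illustrative, not part of the CT API:

```python
def ct_score(reached_goal, chips_left, squares_from_goal):
    """CT scoring rule: 100-point bonus for reaching the goal,
    10 points per chip left at the end of the game, and a 15-point
    penalty per square in the shortest path from the end position
    to the goal."""
    score = 100 if reached_goal else 0
    score += 10 * chips_left
    score -= 15 * squares_from_goal
    return score

print(ct_score(reached_goal=True, chips_left=3, squares_from_goal=0))   # -> 130
print(ct_score(reached_goal=False, chips_left=5, squares_from_goal=2))  # -> 20
```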
Used data consisting of people's play in both the task and table conditions. Used machine learning to estimate the weights of social preferences for each type.

As shown in the table, both task proposers and table proposers are selfish, in that they place high weight on their own benefit. However, table proposers assign a higher weight to their own benefit than do task proposers, suggesting that they are more selfish than task proposers. Task proposers also assign a higher weight to helpfulness and a significantly lower weight to competitiveness than table proposers. These values agree with the trends reported in the Results and Analysis section.

We evaluated both models on test sets comprised of held-out data from the task and table conditions. We report the average negative log-likelihood for all models in the following table, computed using ten-fold cross-validation. A lower value means that the test set was given a higher likelihood by the model.

Capturing Diversity of Play

Training and testing on task contexts provided a better fit for the data.
Implications and Future Work

Agent designers need to consider the contexts in which decisions are made. We need to evaluate the computational model by using it to play with people in situations such as
• one-shot interaction
• repeated play
Training / Testing Condition   Average Negative Log-Likelihood
Task / Task                     0.144
Table / Task                    1.2
Table / Table                   0.220
Task / Table                    1.2
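The reported metric can be sketched as follows, assuming binary accept/reject labels and predicted acceptance probabilities (all values here are illustrative, not the study's data):

```python
import math

def avg_neg_log_likelihood(probs, labels):
    """Average negative log-likelihood of binary accept (1) / reject (0)
    labels under predicted acceptance probabilities; lower = better fit."""
    nll = 0.0
    for p, y in zip(probs, labels):
        nll -= math.log(p if y == 1 else 1.0 - p)
    return nll / len(labels)

# Three held-out decisions: two accepts predicted confidently, one reject.
print(round(avg_neg_log_likelihood([0.9, 0.8, 0.3], [1, 1, 0]), 3))  # -> 0.228
```

In practice the per-fold values from ten-fold cross-validation would be averaged to produce a single number per row of the table.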
As shown by the table, the model trained and tested on the task condition was able to fit the data better than the model trained and tested in the table condition, implying that computer agents participating in mixed human-computer task settings must model human performance in a way that reflects the context under which the decisions were made. In addition, the model trained in the task condition outperformed the model trained in a table context when both models were evaluated in task contexts. (And conversely for the model trained in the table condition.) The extent to which both models performed when evaluated in the context they were not trained on was similar across conditions. These results clearly imply that the context in which decisions are made affects the performance of computer models that learn to interact with people.
Colored Trails

• Defines a family of games used in research and teaching of decision-making in groups comprising people, computers, and mixed networks.
• Available for use (GNU public license): http://www.eecs.harvard.edu/ai/ct3

5 Related Work

A series of studies spawned by the seminal work of Tversky and Kahneman shows that the way decisions, outcomes, and choices are described to people affects their behavior, and that these different "framings" fundamentally affect people's perceptions and conceptualizations. For example, people's decision-making is sensitive to the presentation of outcomes as losses or wins and to the presence of alternative choices. In addition, decisions are influenced by the labeling of interactions with terms...