Agents Participating in Internet Auctions

Junling Hu, Daniel Reeves, and Hock-Shan Wong
Artificial Intelligence Laboratory, University of Michigan
Ann Arbor, MI 48109-2110, USA
{junling, dreeves, [email protected]}
http://ai.eecs.umich.edu/people/{junling,dreeves,hswong}

March 13, 1999

Abstract

We have designed agents to participate in one of the internet auctions, the Michigan AuctionBot. The agents communicate with the AuctionBot through its API. For the inference mechanism underlying the agent, we use regression methods for online derivation of functional relations between other agents' actions and their internal states. The functional form of a regression model is based on an agent's assumptions about other agents' underlying behaviors. We implemented four types of agents, distinguished by their different assumptions about other agents.

keywords: Web agents, internet auctions, online learning

1 Introduction

Intelligent agents for electronic commerce are a popular research topic. We have seen shopping agents that collect price information for users [3], and information filtering agents that collect interesting publications [1]. We are interested in designing agents for market environments where multiple buyers and sellers interact with each other. Such interactions can repeat over time, leading to dynamic changes in the system.

The interesting research issue for us is how an agent takes advantage of the information available and achieves maximal profit in market transactions. This usually comes down to how an agent uses past observations to make predictions about other agents. By forming correct predictions of others, an agent can choose its best response policy and extract maximal gain in the trade. The process of forming predictions involves learning about other agents. The online environment imposes special difficulties on learning because the environment consists of other agents who are similarly adaptive. This naturally leads to the notion of modeling other agents and their beliefs.

In our previous research, we studied the general issue of learning in dynamic multiagent systems [8, 9], and how agents learn recursive models of other agents in a simulated auction market [10]. In this paper, we take the learning agents to an actual internet auction, the Michigan AuctionBot. By implementing the agents on the internet, we address the practical issues of agent communication and agent interface. This also allows us to test the performance of the agents in a real environment.

In this paper we show how an agent can communicate with the AuctionBot by adopting the existing API (Application Programming Interface) infrastructure. For the inference mechanism underlying the agent, we use regression methods for online derivation of functional relations between other agents' actions and their internal states.
The functional form of a regression model is based on an agent's assumptions about other agents' underlying behaviors. We implemented four types of agents, distinguished by their different assumptions about other agents. We address the design issues in this paper; our next step is to test agent performance, and results will be available in the revised version.


2 The auction environment

2.1 Michigan AuctionBot

The Michigan AuctionBot [17, 16] is a configurable auction server. It allows human agents to create auctions and submit bids via web forms, and software agents to perform the same operations via TCP/IP. This auction server has been operational since September 1996. Currently, the Michigan AuctionBot supports many auction types, including English auctions, Dutch auctions, and Vickrey auctions. These different auctions are distinguished by the way bidders submit bids and by how the allocations and prices are determined [11]. As far as we know, the Michigan AuctionBot is the only auction site that provides an API to enable software agents to talk directly to the server.

Figure 1: The AuctionBot architecture (the client's agent connects through the API over a TCP/IP socket to the server's scheduler, auctioneer, and database)


The AuctionBot API [12] is a client/server communication protocol that is straightforward to implement for client developers in any language on any platform. The AuctionBot API functions reside on a server. Interfaces to the functions are well-defined messages, encoded as strings, that are sent to the server through a socket and invoke the API functions that run on the server. The server functions return string-based messages through the socket to the API client, informing it of the results of the request. By using the features of the API, we can easily link our agents to the auction server.
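Since the API is just string messages exchanged over a TCP socket, a client wrapper can be short. The sketch below is our own illustration: the tab-separated message layout, the function name, and the server address in the usage comment are hypothetical, not the actual AuctionBot wire format (which is documented in [12]).

```python
import socket

def encode_request(func: str, *args: str) -> bytes:
    """Encode one API call as a tab-separated, newline-terminated string.
    (The separator and field layout here are illustrative only, not the
    actual AuctionBot wire format.)"""
    return ("\t".join([func, *args]) + "\n").encode()

def call_api(host: str, port: int, func: str, *args: str) -> str:
    """Open a TCP connection, send one request string, return the reply line."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(encode_request(func, *args))
        reply = sock.makefile("r").readline()  # server answers with one string message
    return reply.strip()

# Usage (hypothetical host and message):
#   call_api("auction.example.edu", 7001, "submitBid", "auction42", "buy", "10.5")
```

Because requests and replies are plain strings, the same wrapper works unchanged from any language with socket support, which is the portability the API was designed for.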

2.2 Double auctions

We are interested in one type of auction available on AuctionBot: the double auction. In a double auction, both buyers and sellers submit bids. A single agent may even submit both, offering to buy or sell depending on the price. Double auctions [6] come in different forms. Most of them have the following features: (1) there are many buyers and many sellers; (2) one unit of a good is traded each time; (3) bids are observable to all agents once they are submitted; (4) each agent's preferences are unknown to other agents.

Based on the timing of the bidding protocol, double auctions can be classified as synchronous (or synchronized) double auctions and asynchronous double auctions. In a synchronized double auction, all agents submit their bids in lockstep. Bids are "batched" during the trading period, and then cleared at the end of the period. This type of auction can be seen in a clearing house. Asynchronous double auctions are also called continuous double auctions: agents offer to buy or sell and accept other agents' offers at any moment. Continuous double auctions have been widely used in stock exchanges [5] and internet auctions. In this paper, we are interested in the synchronized double auction.

The auction process can be described as follows. At the start of the auction, each agent is endowed with designated amounts of m goods, each, except the last, associated with an auction. Each time period constitutes a bidding round for one auction, rotated in turn. Agents bid in an auction by posting buy and sell prices for one unit of the good j. All prices are expressed in units of good m. After all agents submit their bids, the auction matches the highest buyer to the lowest seller if the buy price is greater than the sell price. The trading price for the match is k·P_buy + (1 − k)·P_sell, where P_buy is the buyer's offer (price), P_sell is the seller's ask (price), and k is a constant in the range [0, 1].
The auction then matches the second highest buyer with the second lowest seller, and so on, until the highest remaining buyer is offering a lower price than the lowest remaining seller. At this point, the market proceeds in turn to the next auction. In the new round, agents post their buy and sell prices for the next good, and they are matched according to the prices they post. The market continues until no agents can be matched in any auction. Figure 2 shows the matching process in one period of an auction.

Figure 2: Bidding and trading in the synchronized double auction

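The clearing rule just described can be sketched directly. This is a minimal reimplementation from the text (function and variable names are ours), with k the constant from the trading-price formula:

```python
def clear_auction(buy_bids, sell_bids, k=0.5):
    """Match highest buyers to lowest sellers while buy price >= sell price.

    buy_bids, sell_bids: lists of (agent_id, price) pairs.
    Returns a list of (buyer, seller, trading_price) matches, where the
    trading price is k * P_buy + (1 - k) * P_sell.
    """
    buyers = sorted(buy_bids, key=lambda b: b[1], reverse=True)   # highest first
    sellers = sorted(sell_bids, key=lambda s: s[1])               # lowest first
    matches = []
    for (buyer, p_buy), (seller, p_sell) in zip(buyers, sellers):
        if p_buy < p_sell:
            # highest remaining buyer is below lowest remaining seller: stop
            break
        matches.append((buyer, seller, k * p_buy + (1 - k) * p_sell))
    return matches
```

Sorting both sides and walking down the lists in lockstep reproduces the "second highest buyer with second lowest seller" rotation until no profitable pair remains.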

3 Agent design

3.1 Previous work

The book edited by Friedman and Rust [6] collects several studies of double auctions, including both simulations and game-theoretic analyses. Game-theoretic studies [14, 5] of double auctions generally adopt the framework of static (one-shot) games with incomplete information, for which the equilibrium solution is Bayesian Nash equilibrium. Since double auctions are essentially dynamic games where agent interaction takes more than one round, the static game framework fails to address the basic dynamics of the system. Other theoretical studies [4] try to explain the experimental data generated from human subjects. They assume that each buyer or seller has a reservation price and has a way to re-calculate its reservation price after trading. While the study of human behavior is interesting, we are more interested in designing artificial agents who can bid as intelligently as possible to get maximum payoffs from the double auctions.

Gode and Sunder [7] designed zero-intelligence agents who submit random bids under the constraint that their utilities never decrease. Such agents can be viewed as one type of 0-level agent as defined in this paper. To improve upon zero-intelligence agents, Cliff [2] designed zero-intelligence-plus agents who submit bids within the utility-increasing range, but choose bids so that their utilities will increase by a certain proportion. The proportion is adjusted over time, and the adjusting process can be seen as an online learning process. The learning depends on several parameters, such as the learning rate. Cliff implemented a genetic algorithm (GA) to let the agent learn these parameters. The GA training requires the agent to know the final convergence price of the whole auction; it is not clear how such training can be applied to online settings.

Other types of intelligent agents have also been designed for double auctions. In the Santa Fe Tournament [13], 30 different intelligent programs competed in a synchronized double auction, and a simple non-adaptive agent won. That simple strategy is as follows: wait in the background and let the others do the negotiation, but when bid and ask get sufficiently close, jump in and steal the deal. This exploitation of the auction's structure cannot arise in our study, because the auctions we are interested in have no negotiation phase.

3.2 Modeling the auction

A double auction can be modeled as a stochastic game [9], where the state is a vector consisting of all agents' individual endowments, and the action is the vector of bids submitted by all agents. At time t, agent i's local state s^i_t is described by its endowment vector, e^i_t = (e^i_1(t), ..., e^i_m(t)). Agent i's action at time t, a^i_t = (P^{i,buy}_t, P^{i,sell}_t), is the buying and selling price it offers for one unit of the good in the current auction.

The reward for agent i at time t is given by

r^i_t = U^i_{t+1} − U^i_t = U(e^i(t+1)) − U(e^i(t)),

where the state update equations are

e^i_g(t+1) = e^i_g(t) + z_t,
e^i_m(t+1) = e^i_m(t) − P_t z_t,
e^i_l(t+1) = e^i_l(t) for l ≠ g and l ≠ m.

In these state update equations, P_t is the trading price, and z_t ∈ {−1, 0, 1} is the quantity the agent trades at time t. When z_t = 0, the agent does not trade, and r^i_t = 0. We assume that each agent i has a CES (Constant Elasticity of Substitution) utility function,

U(x) = ( Σ_{g=1}^m α_g x_g^ρ )^{1/ρ},   (1)

where x = (x_1, ..., x_m) is a vector of goods, the α_g are preference weights, and ρ is the substitution parameter. We choose the CES functional form for its convenience and generality: it includes quadratic, logarithmic, linear, and many other forms as special cases.

In constructing agent strategies, we dictate that agents always choose actions leading to nonnegative payoffs. We can characterize this in terms of the agents' reservation prices [15]. The reservation price is defined as the maximum (minimum) price an agent is willing to pay (accept) for the good it wants to buy (sell). We define agent i's buying and selling reservation prices, P_b and P_s, as the prices at which its utility stays constant when selling or buying one unit of the good:

U(e_g − 1, e_m + P_s, e_{−g,−m}) = U(e_g, e_m, e_{−g,−m}),   (2)
U(e_g + 1, e_m − P_b, e_{−g,−m}) = U(e_g, e_m, e_{−g,−m}).   (3)
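To make equations (1)–(3) concrete, here is a sketch that evaluates the CES utility and solves for the buying reservation price P_b by bisection (the numerical method is our choice; the paper does not prescribe one):

```python
def ces_utility(x, alpha, rho):
    """CES utility, equation (1): U(x) = (sum_g alpha_g * x_g**rho) ** (1/rho)."""
    return sum(a * v**rho for a, v in zip(alpha, x)) ** (1 / rho)

def buy_reservation_price(e, alpha, rho, g, m=-1, tol=1e-9):
    """Largest price P_b for one unit of good g (paid in numeraire good m)
    that keeps utility unchanged, per equation (3). Found by bisection,
    assuming utility after the trade is decreasing in the price paid."""
    u0 = ces_utility(e, alpha, rho)

    def u_after(p):
        x = list(e)
        x[g] += 1       # gain one unit of good g
        x[m] -= p       # pay p units of good m
        return ces_utility(x, alpha, rho)

    lo, hi = 0.0, e[m]  # cannot pay more numeraire than the agent holds
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if u_after(mid) >= u0:
            lo = mid    # still at least as well off: could afford to pay more
        else:
            hi = mid
    return lo
```

In the linear special case (ρ = 1), equation (3) can be checked by hand: with α = (2, 1) and endowment (1, 10), utility is 2·1 + 10 = 12, and paying P_b for one more unit of good 0 gives 2·2 + (10 − P_b) = 12 at P_b = 2.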

3.3 Four types of agents

The agents we implemented follow our previous work [10]. We designed four types of agents.

Our first type of agent is a 0-level non-estimating agent, who does not attempt to model other agents. Such an agent is called a competitive agent. The competitive agent always bids its reservation prices P_b and P_s.

The other three types of agents are learning agents who choose their bidding prices (P_b, P_s) based on their predictions of other agents. Although they differ in how they estimate other agents' actions, they are identical in acting according to their best response to this estimate.

A 0-level learning agent does not model the underlying policy functions of other agents. It models the actions of other agents by looking at the history of those actions, and uses a time-series technique to predict the actions in the next time period. For any other agent j, the 0-level agent predicts P^j_t by

P̂^j_t = P^j_{t−1} + (P^j_{t−1} − P^j_{t−2}).   (4)

A 1-level agent tries to estimate a fixed policy function for each other agent. It assumes that other agents are 0-level competitive agents, who choose actions based on their individual optimization problems. That means other agents' actions are functions only of their own states. Assuming the policy functions are linear, the estimate of P^j_t is

P̂^j_t = α + β e^j_t,

where e^j_t is a vector of dimension m (the number of different goods the agent holds) and P̂^j_t = (P̂^{j,buy}_t, P̂^{j,sell}_t) is a 2-dimensional vector.

A 2-level agent, like a 1-level agent, models the policy functions of other agents, but a 2-level agent assumes others are 1-level learning agents. We adopt a simplified 2-level model: P^j_t = f(e^j_t, e^{−j}_t). We found that a linear regression method does not work well here. One reason is the correlation between the independent variables e^j_t and e^{−j}_t. In addition, the high dimensionality of the input data requires a large amount of data to obtain unbiased estimates. Since we assume agents keep only a small sample of history data (a fixed window length), we adopt a nonparametric regression method: the K-nearest-neighbor method. For the current joint state (e^1_t, ..., e^m_t), take its K nearest neighbors {(e^1_{t_1}, ..., e^m_{t_1}), ..., (e^1_{t_K}, ..., e^m_{t_K})} as the inputs and the corresponding actions of agent j, {P^j_{t_1}, ..., P^j_{t_K}}, as the outputs. The estimate of P^j_t is

P̂^j_t = Σ_{l=1}^K W_l P^j_{t_l},

where W_l is the weight of data point l, defined as W_l = (1/d_l) / Σ_{l'=1}^K (1/d_{l'}), and d_l is the Euclidean distance between the data point and the query point:

d_l = sqrt( (e^1_{t_l} − e^1_t)^2 + ... + (e^m_{t_l} − e^m_t)^2 ).
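The inverse-distance-weighted K-nearest-neighbor estimate above can be sketched as follows (names are ours; the history stands for the fixed-window sample of joint states and agent j's observed prices):

```python
import math

def knn_predict(history, query, k=3):
    """Predict agent j's next price from the K nearest past joint states.

    history: list of (state_vector, price) pairs observed so far.
    query: the current joint state vector.
    Weights W_l are inverse Euclidean distances 1/d_l, normalized to sum to 1.
    """
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

    nearest = sorted(history, key=lambda h: dist(h[0], query))[:k]
    inv = [1 / max(dist(s, query), 1e-12) for s, _ in nearest]  # guard zero distance
    total = sum(inv)
    return sum((w / total) * p for w, (_, p) in zip(inv, nearest))
```

Because the weights favor the closest past states, the estimate degrades gracefully with the small fixed-window sample that motivated the nonparametric choice.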

3.4 Best-response bidding

Let agent i be the learning agent, with reservation prices P^i_b and P^i_s. Let {P̂^1_b, ..., P̂^n_b} and {P̂^1_s, ..., P̂^n_s} be agent i's predicted buying and selling prices of the other agents. Suppose agent i can be matched as a buyer according to its reservation price, and is matched to seller j. By the matching criterion, we must have P^i_b ≥ P̂^j_s. From the discussion above, we know that agent i wants to bid a buy price lower than its own reservation buy price, and indeed should bid as low as possible in order to maximize its payoff. However, if it bids lower than P̂^j_s, it will lose the chance to trade with agent j. The best-response buy bid is therefore the lowest bid that still matches seller j, namely P̂^j_s.
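Under this reasoning, a buyer's best response is to bid just at the predicted lowest ask, never above its own reservation price. A minimal sketch, assuming a small increment eps to guarantee the match (our assumption; the paper's exact tie-breaking rule is not recoverable from this excerpt):

```python
def best_buy_bid(reservation_buy, predicted_sells, eps=0.01):
    """Bid just above the predicted lowest ask (the eps margin is our
    assumption), clipped so the bid never exceeds the buyer's reservation
    price. Returns None when no profitable match is predicted."""
    target = min(predicted_sells) + eps
    if target > reservation_buy:
        return None  # every predicted ask is above what the buyer can pay
    return target
```

The same logic applies symmetrically to the sell side: ask just below the predicted highest bid, never below the selling reservation price P^i_s.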