Multi-agent learning in auctions: The design and analysis of markets and traders

Michael Kaisers

Sponsored by an NWO TopTalent 2008 grant.

Summary

This project investigates how machine learning techniques can help to analyze and improve markets and trading strategies within the domain of auctions. An empirical approach allows new insights into auctions, which grow in importance in resource allocation problems and e-commerce.

Introduction

Auctions are widely deployed in real markets. The New York Stock Exchange (NYSE) for example reported a trading volume of 11,060 billion USD in 2000 [NYSE00]. In these dimensions improvements of even small magnitudes will have a significant positive impact on the economy. A slight increase in efficiency of, say 0.1 % for example, leads to an additional social welfare of 11 billion USD, and that in one of the stock markets alone.

The enormous growth of the internet also brings e-commerce and online auctions into the scope. Consumer-to-consumer markets like e-Bay are commonly known but business-to-business online markets established as well. For example General Electric Corporation purchased over USD 6 billion worth of goods and services via online auctions already in 2000 [GE00]. As a result of today's wide spread of online auctions, understanding the design of market rules gains popular interest.

Auctions are known to be particularly successful in commodity markets where recent insights from theoretical analysis triggered the design of new auction mechanisms (e.g. for California's deregulated electricity market [Cramton01] or for government licenses for the telecommunications spectrum [Klemperer02, McMillan94]). However, they are increasingly considered efficient means to solve resource allocation problems in general [Cagliano95, Wellman01]. For example, auctions have been transferred to efficiently control internet traffic routing with considerable success [Roughgarden03].

Research in the domain of auctions is of interest to traders as well as entities that run markets, may it be businesses, governments or individuals. While traders commonly aim to increase their profit by advanced trading strategies, auctioneers seek mechanisms that enforce desired economic properties of the market. These properties may include high transaction volumes to make the market attractive or maximizing the sum of profits made. This research project will use techniques from machine learning to empirically evaluate and devise new trading strategies and market rules.

The computational platform of online auctions also allows using automated traders in real markets which implies that the models for human traders may become real traders themselves. In this context, the question if obtained knowledge actually transfers to human trading resolves and insights have a direct impact of applied business technology.

Sensitivity analysis has shown that even small alterations of market rules or trading strategies have considerable effects on the intricate interactions of markets and traders. E.g. adapting a trading strategy such that it achieves a relatively small increase of profit may make it dominant over its alternatives [Phelps06]. The increasing use of auctions and the significance of small improvements combined with promising approaches to obtain new knowledge make auctions worth studying.

Research questions

The research questions are three-fold. The first two represent open questions from the trader and market view of auctions while the last question is concerned with the transfer of the obtained knowledge.

How can machine learning techniques be incorporated into trading strategies to increase the profit of traders?
How can machine learning techniques be used to effectively design and evaluate rules for markets?
How can approaches to the previous two problems be generalized and to which class of problems can solutions be applied?

Method

My scientific background is artificial intelligence hence machine learning techniques will be applied to gain new insights into the domain of auctions but also to study the learning behavior in multi-agent environments itself. This section describes how learning algorithms can be applied in simulation in order to address the research questions and how their success can be measured by evaluation of the learned strategies.

The most common model for auctions is the private value model that assumes a private value for the good to each trader. Their performance can be measured by the profit they make by trading that good, which equals the difference of their private value to the transaction price and is positive for sellers that sell above their private value and buyers that buy below their private value.

Parameterized trading strategies allow searching the parameter space using machine learning algorithms to find those parameters that maximize the profit . For example the Gjerstad Dickhaut (GD) trading strategy bases its offer on a sliding window of the last n offers; the optimal window length n can be determined by machine learning techniques which can analyze a large space of possible trading techniques and compare, evaluate, or invent alternatives. However, learning in multi-agent environments is significantly more complex than single-agent learning as the dynamics to learn change by the learning process of other agents. Therefore, adapting algorithms for the multi-agent case will be considered, as this can lead to significant performance increases [Kaisers07].

Trading strategies as well as markets can be considered a significant contribution if they outperform the existing alternatives. Trading strategies can be benchmarked by comparison and in competition to Gjerstad Dickhaut (GD), Roth-Erev (RE) and Zero Intelligence Plus (ZIP), which are well established automated trading strategies.

The success of devised markets can be measured by comparison to well established markets like the continuous double auction (CDA) or clearing house (CH) that are models of the New York Stock Exchange as it is today (CDA) and as it was before the 1860s (CH). The performance of markets is characterized by market efficiency, market power of the participating traders and the coefficient of market price convergence alpha. Efficiency is the fraction of traders’ profit that is actually made over the maximal profit possible in that market. Market power determines the increase in the amount of profit that buyers and sellers get because the market is not trading at an equilibrium price and alpha indicates how close the market trades to the theoretical equilibrium price in general.

The rules of markets can be parameterized where parameters define how much the participants are charged for different actions and which other rules are enforced (e.g. do submitted bids have to exceed previous ones). Learning algorithms can search the space of possible market rules by simulating auctions with automated traders, e.g. using the ones mentioned above (ZIP, GD, and RE). This methodology has been successfully applied to the validation of the optimal k-pricing rule. This market rule determines to set a transaction price pt for a matched bid and ask to a weighted average with parameter k: pt = k*pbid + (1-k)*pask. The hypothesis that k=0.5, so the mid-point between bid and ask, delivers the most efficient results for the k-pricing rule has been confirmed [Phelps03].

Evolutionary Game Theory (EGT) will be used to gain insight into the learning process for reinforcement learning and helps tuning the parameters of the automated search [Cramton00, Bowling02]. Furthermore, other machine learning algorithms will be explored by reformulating the model, e.g. for an event based simulation, such that advanced search techniques from other areas such as Monte Carlo Tree Search (MTCS) can be applied. Learning algorithms that are found suitable for the search will be analyzed for performance and the most successful ones will be linked back to the properties of the multi-agent environments they can cope with, aiming to establish a framework of reference which properties of multi-agent systems support or inhibit the success of learning algorithms.

Innovation

The success of trading strategies is highly dependent on the market, but even more on the competitors in that market. Sensitivity analysis has shown that a slight increase of profit leads to significantly superior trading strategies [Phelps06]. In addition, a budged balanced and strategy proof auction for rational agents cannot be efficient [Myerson83] which makes many markets sacrifice some strategy proofness for higher efficiency. This decrease in efficiency implies available extra profit and a strategy that can acquire slightly more of that additional profit than its competitors is superior and likely to exist. Due to the dependence on the competition, this research aims to rather develop a methodology to learn such a strategy than a specific strategy.

Recent approaches in automated mechanism design focus on single-dimensional optimization [Phelps03] while the problem at hand is inherently multi-dimensional. Therefore, the extension to the multi-dimensional case is the natural next step and will be explored by this research. Furthermore, current research considers adaptive strategies like RE or GD but assumes its parameters, e.g., the sliding window length for GD, are fixed. It is plausible to question that assumption and investigate situations where these parameters may adapt as well.

The performance of markets and traders is very situation specific, e.g. the best market rules are dependent on the traders that operate in that market. For that reason finding better markets and trading strategies will be an ongoing process. The last research question also acknowledges the importance that transfer of knowledge yields among different fields of research in today’s international research community. Evolutionary game theory has demonstrated the advances that are possible by linking the right techniques.

Relevance for science, technology or society

The rise of e-commerce has established the online auction as a platform that is easily accessible for automated traders. Their superior performance has been shown in a comparison of automated traders to human traders and suggests that future auctions will be increasingly populated by automated traders [Kephart02]. The tested trading strategies can already outperform non-expert humans, but this project aims to increase the level of automated trading strategies, potentially above the one of human experts.

As outlined in the introduction, understanding auctions affects individuals, businesses and even whole economies. Self interested entities that participate in a market are interested in a trading strategy that maximizes their profit while market-based businesses want to offer a competitive market that attracts traders and possibly creates revenues. Trading strategies and market rules will evolve in a co-evolutionary fashion, which means that markets adapt to their participants and competing markets, while trading strategies continuously adapt to a best response to the new market conditions and opposing trading strategies. This research will increase the speed of adaption by extending the set of means to devise and evaluate new market rules and trading strategies. If the obtained insights can be transferred to stock exchange markets and trigger a reduction of price shocks, increased market efficiency or similar they may have considerable effects on today’s society.

Literature references

Bowling02: M. Bowling and M. Veloso. Scalable learning in stochastic games. In Piotr Gmytrasiewicz and Simon Parsons, editors, Proceedings of the Workshop on Game Theoretic and Decision Theoretic Agents, pages 11–18, 2002.
Cagliano95: R. A. Cagliano, M. D. Fraser, and Mark E. Shaefer. Auction allocation of computing resources. Communications of the ACM, 38(6):88–102, 1995.
Cramton00: P. Cramton and J. A. Schwarz. Collusive bidding: Lessons from the FCC Spectrum Auctions. Journal of Regulatory Economics, 17:229–252, 2000.
Cramton01: P. Cramton, A. E. Khan, R. H. Porter, and R. D. Tabors. Pricing in the California power exchange electricity market: Should California switch from uniform pricing to pay-as-bid pricing? Technical report, California Power Exchange, California, USA, 2001.
GE00: GE. Letter to Share Owners. Annual report, General Electric Corporation, USA, 2000.
Kaisers07: H. J. van den Herik, D. Hennes, M. Kaisers, K. Tuyls, and K. Verbeeck. Multi-agent learning dynamics: A survey. In M. Klusch, K. V. Hindriks, M. P. Papazoglou, and L. Sterling, editors, CIA, volume 4676 of Lecture Notes in Computer Science, pages 36-56. Springer, 2007.
Kephart02: J. O. Kephart. Software agents and the route to the information economy. Proceedings of the National Academy of Sciences, 99:7207–7213, May 2002.
Klemperer02: P. Klemperer. How (not) to run auctions: the European 3G telecom auctions. European Economic Review, 46(4–5):829–845, 2002.
McMillan94: J. McMillan. Selling spectrum rights. Journal of Economic Perspectives, 8:191–199, 1994.
Myerson83: R. Myerson, M. Satterwaite, Efficient mechanism for bilateral trading. Journal of Economic Theory, 28:265-281, 1983.
NYSE00: New York Stock Exchange (2002), Stock Market Activity report available at: http://www.nyse.com/pdfs/02_STOCKMARKETACTIVITY.pdf
Phelps03: S. Phelps, S. Parsons, E. Sklar, and P. McBurney. Using genetic programming to optimise pricing rules for a double auction market, 2003.
Phelps06: S. Phelps, M. Marcinkiewicz, and S. Parsons. A novel method for automatic strategy acquisition in n-player non-zero-sum games. In AAMAS '06: Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems, pages 705-712, New York, NY, USA, 2006. ACM.
Roughgarden03: Tim Roughgarden. The price of anarchy is independent of the network topology. Journal of Computer and System Sciences, 67(2):341-364, 2003.
Wellman01: M. P. Wellman, W.E. Walsh, P.R. Wurman, and J.K. MacKie-Mason. Auction protocols for decentralized scheduling. Games and Economic Behaviour, 35:271–303, 2001.