Katia Sycara
The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213
Voice: 412-268-8815 Fax: (412) 268-5569 katia@cs.cmu.edu
February 1997
Project Summary
This research aims to develop a domain-independent computational model of negotiation capable of addressing several complex issues, such as multi-issue negotiation and decision making under incomplete information. Our research is based on a sequential decision making view of negotiation, which provides a natural representation of the multi-stage nature of negotiation. Issues such as the learning involved in updating beliefs about a partially known world will be addressed. This research is intended to contribute to the development of computational models of negotiation that can answer fundamental research questions on the nature of negotiation and can also serve as a basis for designing automated negotiation systems. We are also interested in research questions in the emerging field of multi-agent learning. For example, what are the benefits of learning to the learner and to the multi-agent society? How can belief updating in a partially known world be conducted efficiently?
We are extending the sequential decision making negotiation model to explicitly model the strategic aspects of negotiation. The resulting formalism can be made computationally tractable by applying dynamic programming solution strategies. Under this extended model, many issues that game-theoretic models do not currently address, such as asymmetric information among agents, the dynamic process of negotiation, and a changing environment, can be analyzed or explored experimentally. In addition, computationally efficient multi-agent learning algorithms will be developed, and the impact of introducing learning into the model will be explored. To evaluate the research, we are developing a multi-agent simulation testbed and plan to use it to conduct empirical studies that answer significant theoretical questions, such as the effectiveness of different negotiation strategies and learning algorithms. Real-world domains of theoretical and practical significance, such as supply contracting, will be used to provide realistic problem scenarios.
We have conducted simulations in a simple bargaining setting to observe the interactions between agents that learn and agents that use fixed strategies. In our initial experiments, the bargaining process (proposals and counter-proposals) is symmetrical for the buyer and the supplier. A non-learning agent makes decisions based solely on his own reservation price. A learning agent uses a fundamentally different negotiation strategy: he makes decisions based on both his own and his opponent's reservation price. Note that reservation prices are private information, and an agent cannot know the exact value of his opponent's reservation price, even after an agreement has been reached. However, each learning agent can hold an a priori estimate of his opponent's reservation price and update that estimate during the negotiation process using a Bayesian belief updating mechanism. In our implementation, an agent represents his subjective beliefs about his opponent's reservation price using a piecewise probability distribution function.
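The belief update described above can be illustrated with a minimal sketch. It is not the implementation used in the experiments: the belief is simplified to a probability over a few candidate reservation prices, and the function `offer_likelihood`, its `spread` parameter, and all numeric values are assumptions introduced here only to make the example runnable.

```python
def offer_likelihood(offer, reservation, spread=60.0):
    """Hypothetical likelihood weight of seeing `offer` from a supplier
    whose reservation price is `reservation` (assumed model, not from the
    source): offers below the reservation price are impossible, and offers
    far above it are considered less likely."""
    if offer < reservation:
        return 0.0
    return max(0.0, 1.0 - (offer - reservation) / spread)


def bayes_update(prior, offer):
    """Apply Bayes' rule: posterior(r) is proportional to prior(r) * P(offer | r)."""
    unnormalized = {r: p * offer_likelihood(offer, r) for r, p in prior.items()}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("offer is impossible under every hypothesis")
    return {r: w / total for r, w in unnormalized.items()}


# Buyer's prior belief over the supplier's reservation price (made-up values).
prior = {100.0: 1 / 3, 120.0: 1 / 3, 140.0: 1 / 3}

# After observing a supplier offer of 150, belief shifts toward higher
# reservation prices, because a small markup is assumed to be more likely.
posterior = bayes_update(prior, offer=150.0)
for reservation in sorted(posterior):
    print(reservation, round(posterior[reservation], 3))
```

Repeating the update after each proposal gives the learning agent a progressively sharper estimate of the opponent's reservation price, although the exact value remains private.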
We ran experiments under several conditions: learning agents vs. non-learning agents, and learning agents vs. learning agents. Results from non-learning vs. non-learning agents were used as the baseline for comparison. We measured the quality of a particular bargaining process using a normalized joint utility fashioned after the Nash solution. It can easily be shown that the joint utility reaches its maximum of 0.25 when the agreed price is the arithmetic average of the buyer's reservation price and the supplier's reservation price. The cost of a bargaining process is measured by the number of proposals exchanged before an agreement is reached. In the following table we report the average performance of all three configurations.
=============================================================
Configuration      | Joint Utility | # of Proposals Exchanged
-------------------------------------------------------------
both learn         | 0.22          | 24
neither learn      | 0.18          | 34
only buyer learns  | 0.15          | 28
-------------------------------------------------------------
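The joint-utility measure can be made concrete with a short sketch. The exact formula is not given in this summary; the form below, (P - RP_s)(RP_b - P) / (RP_b - RP_s)^2, is an assumption chosen because it is a normalized Nash-style product that peaks at 0.25 when the price P is the midpoint of the two reservation prices, as stated above. The function and parameter names are hypothetical.

```python
def joint_utility(price, buyer_reservation, supplier_reservation):
    """Normalized product of the two agents' gains (assumed form).

    The supplier's gain is price - supplier_reservation, the buyer's gain
    is buyer_reservation - price, and the product is divided by the squared
    size of the zone of agreement so the result lies in [0, 0.25].
    """
    zone = buyer_reservation - supplier_reservation
    if zone <= 0:
        raise ValueError("no zone of agreement")
    if not supplier_reservation <= price <= buyer_reservation:
        raise ValueError("price lies outside the zone of agreement")
    return (price - supplier_reservation) * (buyer_reservation - price) / zone ** 2


# Made-up reservation prices: the supplier accepts 100 or more,
# the buyer pays 200 or less.
print(joint_utility(150.0, 200.0, 100.0))  # midpoint price gives the maximum
print(joint_utility(130.0, 200.0, 100.0))  # a price favoring the buyer scores lower
```

Under this form, an agreement at either reservation price yields a joint utility of zero, since one agent gains nothing.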
Our observations regarding these experimental results are as follows.
We examined the data for ``neither learn'' and ``both learn'' in more detail by dividing the experiment instances into categories according to the size of the zone of agreement. We then calculated the differences between the corresponding joint utilities of ``neither learn'' and ``both learn'' and plotted the percentage improvement in joint utility against the size of the zone of agreement. We observed what appears to be a positive correlation between these two variables. An intuitive explanation is that the greater the room for flexibility in reaching agreement (that is, the larger the zone of agreement), the better the learning agents can seize the opportunity.
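The grouping step in this analysis can be sketched as follows. This is an illustration only: the instance values are placeholders and not experimental data, the bin width is arbitrary, and "percentage improvement" is assumed to mean the relative gain of ``both learn'' over the ``neither learn'' baseline within each bin.

```python
from collections import defaultdict


def improvement_by_zone(instances, bin_width):
    """Group instances by zone-of-agreement size and return, per bin, the
    percentage improvement of mean joint utility (both learn vs. neither).

    `instances` holds (zone_size, utility_neither, utility_both) tuples.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for zone_size, u_neither, u_both in instances:
        bin_start = int(zone_size // bin_width) * bin_width
        entry = sums[bin_start]
        entry[0] += u_neither
        entry[1] += u_both
        entry[2] += 1

    result = {}
    for bin_start in sorted(sums):
        total_neither, total_both, count = sums[bin_start]
        mean_neither = total_neither / count
        mean_both = total_both / count
        if mean_neither == 0.0:
            raise ValueError("baseline utility is zero; percentage undefined")
        result[bin_start] = 100.0 * (mean_both - mean_neither) / mean_neither
    return result


# Placeholder instances (NOT experimental data), chosen only to exercise the code.
instances = [(10, 0.20, 0.21), (15, 0.18, 0.19), (40, 0.16, 0.20), (45, 0.20, 0.24)]
result = improvement_by_zone(instances, bin_width=25)
for bin_start, pct in result.items():
    print(f"zone size {bin_start}-{bin_start + 25}: {pct:.1f}% improvement")
```

Plotting the per-bin percentages against the bin positions gives the kind of chart described above.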