Yousef RAZEGHI, Ozan YAVUZ, Reyhan AYDOĞAN

Deep reinforcement learning for acceptance strategy in bilateral negotiations

This paper introduces a reinforcement learning-based acceptance strategy for automated bilateral negotiation, in which agents bargain over multiple issues in a variety of negotiation scenarios. The automated negotiation literature offers several acceptance strategies built on predefined rules. These rules mostly rely on heuristics that take time and/or utility into account. An acceptance strategy based solely on the negotiation deadline might perform well in some negotiation settings but fail in others. Instead of following predefined acceptance rules, the strategy presented here learns whether to accept the opponent's offer or make a counteroffer from the reinforcement signals it receives after performing an action. Experiments show that the performance of the proposed approach improves over time.
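The contrast between a predefined acceptance rule and a learned one can be illustrated with a minimal sketch. It is not the method of the paper: the paper uses deep reinforcement learning, whereas the sketch below substitutes tabular Q-learning to stay short. The state (discretized offer utility and normalized time), the reward (utility of an accepted offer, zero otherwise), the linearly conceding opponent, and all names (`rule_based_accept`, `QAcceptance`, `run_episode`) are assumptions made for illustration only.

```python
import random

ACCEPT, COUNTER = 0, 1


def rule_based_accept(offer_utility, time, threshold=0.8, deadline_fraction=0.95):
    """Predefined rule (illustrative): accept a good offer, or any offer near the deadline."""
    return offer_utility >= threshold or time >= deadline_fraction


class QAcceptance:
    """Tabular Q-learning over two actions: accept or make a counteroffer (assumed setup)."""

    def __init__(self, bins=10, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
        self.bins, self.alpha, self.gamma, self.epsilon = bins, alpha, gamma, epsilon
        self.q = {}  # maps (state, action) to an estimated value
        self.rng = random.Random(seed)

    def state(self, offer_utility, time):
        # Discretize utility and time, both assumed to lie in [0, 1].
        u = min(int(offer_utility * self.bins), self.bins - 1)
        t = min(int(time * self.bins), self.bins - 1)
        return (u, t)

    def act(self, state, explore=True):
        # Epsilon-greedy choice; ties are broken in favor of accepting.
        if explore and self.rng.random() < self.epsilon:
            return self.rng.choice((ACCEPT, COUNTER))
        q_accept = self.q.get((state, ACCEPT), 0.0)
        q_counter = self.q.get((state, COUNTER), 0.0)
        return ACCEPT if q_accept >= q_counter else COUNTER

    def update(self, state, action, reward, next_state, done):
        # Standard one-step Q-learning update.
        best_next = 0.0 if done else max(
            self.q.get((next_state, a), 0.0) for a in (ACCEPT, COUNTER)
        )
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (
            reward + self.gamma * best_next - old
        )


def run_episode(agent, rounds=10, explore=True):
    """One toy session against a hypothetical opponent that concedes linearly."""
    for r in range(rounds):
        time = r / rounds
        offer = 0.2 + 0.6 * time  # utility of the opponent's offer to the agent
        s = agent.state(offer, time)
        a = agent.act(s, explore)
        if a == ACCEPT:
            agent.update(s, a, offer, s, True)  # reward: utility of the agreement
            return offer
        next_time = (r + 1) / rounds
        next_s = agent.state(0.2 + 0.6 * next_time, next_time)
        # Countering yields no immediate reward; the last round ends without a deal.
        agent.update(s, a, 0.0, next_s, r == rounds - 1)
    return 0.0


if __name__ == "__main__":
    agent = QAcceptance()
    for _ in range(2000):
        run_episode(agent)
    print("utility of a greedy episode after training:", run_episode(agent, explore=False))
```

In this toy setting the rule-based function applies the same threshold in every scenario, while the learned policy adjusts its accept/counter choice per state from the rewards it observes, which is the idea the abstract describes.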
