Actor-critic reinforcement learning for bidding in bilateral negotiation

Furkan Arslan; Reyhan Aydoğan

doi:10.55730/1300-0632.3899

Furkan Arslan, Reyhan Aydoğan^*

^*Corresponding author for this work

Interactive Intelligence

Research output: Contribution to journal › Article › Scientific › peer-review

1 Citation (Scopus)

25 Downloads (Pure)

Abstract

Designing an effective and intelligent bidding strategy is one of the most compelling research challenges in automated negotiation, where software agents negotiate with each other to reach a mutual agreement when there is a conflict of interest. Instead of designing a hand-crafted decision-making module, this work proposes a novel bidding strategy adopting an actor-critic reinforcement learning approach, which learns what to offer in a bilateral negotiation. An entropy reinforcement learning framework called Soft Actor-Critic (SAC) is applied to the bidding problem, and a self-play approach is employed to train the model. Our model learns to produce the target utility of the next offer based on previous offer exchanges and the remaining time. Furthermore, an imitation learning approach called behavior cloning is adopted to speed up the learning process. Also, a novel reward function is introduced that takes into account not only the agent’s own utility but also the opponent’s utility at the end of the negotiation. The developed agent is evaluated empirically: a large number of negotiation sessions are run against a variety of opponents in different domains varying in size and opposition. The agent’s performance is compared with that of its opponents and with that of baseline agents negotiating with the same opponents. The empirical results show that our agent successfully negotiates against challenging opponents in different negotiation scenarios without requiring any prior information about the opponent or domain. Furthermore, it achieves better results than the baseline agents in terms of the utility received at the end of successful negotiations.
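
The abstract does not give the form of the reward function, so the following is only an illustrative sketch of a terminal reward that considers both the agent's own utility and the opponent's utility. The function name `terminal_reward`, the `opponent_weight` parameter, the weighted-sum form, and the zero reward on failure are all assumptions made for illustration, not the formula from the paper.

```python
def terminal_reward(
    own_utility: float,
    opponent_utility: float,
    agreement_reached: bool,
    opponent_weight: float = 0.5,
) -> float:
    """Hypothetical end-of-negotiation reward (NOT the paper's formula).

    Both utilities are assumed to be normalized to [0, 1].
    """
    for name, value in (
        ("own_utility", own_utility),
        ("opponent_utility", opponent_utility),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")

    # Assumption: a session that ends without agreement earns no reward.
    if not agreement_reached:
        return 0.0

    # Assumption: a weighted sum, so the agent is rewarded mainly for its
    # own outcome and partly for leaving the opponent a good outcome.
    return own_utility + opponent_weight * opponent_utility


if __name__ == "__main__":
    print(terminal_reward(0.8, 0.6, agreement_reached=True))
    print(terminal_reward(0.8, 0.6, agreement_reached=False))
```

In this sketch, setting `opponent_weight` to zero reduces the reward to the agent's own utility alone, which makes the effect of the opponent term easy to isolate.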

Original language	English
Pages (from-to)	1695-1714
Number of pages	20
Journal	Turkish Journal of Electrical Engineering and Computer Sciences
Volume	30
Issue number	5
DOIs	https://doi.org/10.55730/1300-0632.3899
Publication status	Published - 2022

Keywords

automated bilateral negotiation
bidding strategy
Deep reinforcement learning
entropy reinforcement learning
imitation learning
multi-agent systems

Access to Document

10.55730/1300-0632.3899

Actor-critic reinforcement learning for bidding in bilateral nego

Final published version, 887 KB

Licence: CC BY

Cite this

@article{bee7c91f916a4756b9621fd995709341,

title = "Actor-critic reinforcement learning for bidding in bilateral negotiation",

abstract = "Designing an effective and intelligent bidding strategy is one of the most compelling research challenges in automated negotiation, where software agents negotiate with each other to reach a mutual agreement when there is a conflict of interest. Instead of designing a hand-crafted decision-making module, this work proposes a novel bidding strategy adopting an actor-critic reinforcement learning approach, which learns what to offer in a bilateral negotiation. An entropy reinforcement learning framework called Soft Actor-Critic (SAC) is applied to the bidding problem, and a self-play approach is employed to train the model. Our model learns to produce the target utility of the next offer based on previous offer exchanges and the remaining time. Furthermore, an imitation learning approach called behavior cloning is adopted to speed up the learning process. Also, a novel reward function is introduced that takes into account not only the agent{\textquoteright}s own utility but also the opponent{\textquoteright}s utility at the end of the negotiation. The developed agent is evaluated empirically: a large number of negotiation sessions are run against a variety of opponents in different domains varying in size and opposition. The agent{\textquoteright}s performance is compared with that of its opponents and with that of baseline agents negotiating with the same opponents. The empirical results show that our agent successfully negotiates against challenging opponents in different negotiation scenarios without requiring any prior information about the opponent or domain. Furthermore, it achieves better results than the baseline agents in terms of the utility received at the end of successful negotiations.",

keywords = "automated bilateral negotiation, bidding strategy, Deep reinforcement learning, entropy reinforcement learning, imitation learning, multi-agent systems",

author = "Furkan Arslan and Reyhan Aydoğan",

year = "2022",

doi = "10.55730/1300-0632.3899",

language = "English",

volume = "30",

pages = "1695--1714",

journal = "Turkish Journal of Electrical Engineering and Computer Sciences",

issn = "1300-0632",

publisher = "Turkiye Klinikleri",

number = "5",

}

TY - JOUR

T1 - Actor-critic reinforcement learning for bidding in bilateral negotiation

AU - Arslan, Furkan

AU - Aydoğan, Reyhan

PY - 2022

Y1 - 2022

AB - Designing an effective and intelligent bidding strategy is one of the most compelling research challenges in automated negotiation, where software agents negotiate with each other to reach a mutual agreement when there is a conflict of interest. Instead of designing a hand-crafted decision-making module, this work proposes a novel bidding strategy adopting an actor-critic reinforcement learning approach, which learns what to offer in a bilateral negotiation. An entropy reinforcement learning framework called Soft Actor-Critic (SAC) is applied to the bidding problem, and a self-play approach is employed to train the model. Our model learns to produce the target utility of the next offer based on previous offer exchanges and the remaining time. Furthermore, an imitation learning approach called behavior cloning is adopted to speed up the learning process. Also, a novel reward function is introduced that takes into account not only the agent’s own utility but also the opponent’s utility at the end of the negotiation. The developed agent is evaluated empirically: a large number of negotiation sessions are run against a variety of opponents in different domains varying in size and opposition. The agent’s performance is compared with that of its opponents and with that of baseline agents negotiating with the same opponents. The empirical results show that our agent successfully negotiates against challenging opponents in different negotiation scenarios without requiring any prior information about the opponent or domain. Furthermore, it achieves better results than the baseline agents in terms of the utility received at the end of successful negotiations.

KW - automated bilateral negotiation

KW - bidding strategy

KW - Deep reinforcement learning

KW - entropy reinforcement learning

KW - imitation learning

KW - multi-agent systems

UR - http://www.scopus.com/inward/record.url?scp=85139306597&partnerID=8YFLogxK

U2 - 10.55730/1300-0632.3899

DO - 10.55730/1300-0632.3899

M3 - Article

AN - SCOPUS:85139306597

SN - 1300-0632

VL - 30

SP - 1695

EP - 1714

JO - Turkish Journal of Electrical Engineering and Computer Sciences

JF - Turkish Journal of Electrical Engineering and Computer Sciences

IS - 5

ER -
