Alternating Recurrent Dialog Model with Large-scale Pre-trained Language Models

Qingyang Wu, Yichi Zhang, Yu Li, Zhou Yu

Dialogue and Interactive Systems | Long Paper

Zoom-5A: Apr 22 (12:00-13:00 UTC)
Gather-3B: Apr 23 (13:00-15:00 UTC)


Abstract: Existing dialog system models require extensive human annotations and are difficult to generalize to different tasks. The recent success of large pre-trained language models such as BERT and GPT-2 (Devlin et al., 2019; Radford et al., 2019) has suggested the effectiveness of incorporating language priors in downstream NLP tasks. However, how much pre-trained language models can help dialog response generation is still under exploration. In this paper, we propose a simple, general, and effective framework: Alternating Recurrent Dialog Model (ARDM). ARDM models each speaker separately and takes advantage of large pre-trained language models. It requires no supervision from human annotations such as belief states or dialog acts to achieve effective conversations. ARDM outperforms or is on par with state-of-the-art methods on two popular task-oriented dialog datasets: CamRest676 and MultiWOZ. Moreover, we can generalize ARDM to more challenging, non-collaborative tasks such as persuasion. In persuasion tasks, ARDM is capable of generating human-like responses to persuade people to donate to a charity.
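The abstract describes ARDM as keeping a separate pre-trained language model for each speaker and alternating between them over the course of a dialog. The sketch below is a minimal illustration of that idea only, assuming the HuggingFace `transformers` library and off-the-shelf GPT-2 weights; it is not the authors' released implementation and omits ARDM's fine-tuning and recurrent reuse of hidden states.

```python
# Minimal illustrative sketch (not the authors' code): one pre-trained language
# model per speaker role, alternating over a shared dialog history.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
# One decoder per speaker (user and system). ARDM fine-tunes speaker-specific
# models; here we load plain pre-trained GPT-2 weights for illustration only.
speaker_models = {
    "user": GPT2LMHeadModel.from_pretrained("gpt2"),
    "system": GPT2LMHeadModel.from_pretrained("gpt2"),
}

def next_turn(history: str, speaker: str, max_new_tokens: int = 40) -> str:
    """Generate the next utterance for `speaker`, conditioned on the history."""
    model = speaker_models[speaker]
    prompt = history + f"\n{speaker.capitalize()}: "
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    output_ids = model.generate(
        input_ids,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output_ids[0, input_ids.shape[1]:], skip_special_tokens=True)

# Alternate between the two speaker models, appending each turn to the history.
history = "User: I'd like to book a cheap restaurant in the centre."
for speaker in ["system", "user", "system"]:
    utterance = next_turn(history, speaker)
    history += f"\n{speaker.capitalize()}: {utterance}"
print(history)
```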


Similar Papers

The Gutenberg Dialogue Dataset
Richard Csaky, Gábor Recski
Few-shot learning through contextual data augmentation
Farid Arthaud, Rachel Bawden, Alexandra Birch
Zero-shot Generalization in Dialog State Tracking through Generative Question Answering
Shuyang Li, Jin Cao, Mukund Sridhar, Henghui Zhu, Shang-Wen Li, Wael Hamza, Julian McAuley