Keep Learning: Self-supervised Meta-learning for Learning from Inference

Akhil Kedia, Sai Chetan Chinthakindi

Machine Learning for NLP (Long Paper)

Zoom-6B: Apr 23 (07:00-08:00 UTC)
Gather-3D: Apr 23 (13:00-15:00 UTC)


Abstract: A common approach in many machine learning algorithms involves self-supervised learning on large unlabeled data before fine-tuning on downstream tasks to further improve performance. A new approach for language modelling, called dynamic evaluation, further fine-tunes a trained model during inference using the trivially-present ground-truth labels, giving a large improvement in performance. However, this approach does not easily extend to classification tasks, where ground-truth labels are absent during inference. We propose to solve this issue by utilizing self-training: we back-propagate the loss from the model's own class-balanced predictions (pseudo-labels), adapting the Reptile algorithm from meta-learning and combining it with an inductive bias towards the pre-trained weights to improve generalization. Our method improves the performance of standard backbones such as BERT, Electra, and ResNet-50 on a wide variety of tasks, including question answering on SQuAD and NewsQA, the SuperGLUE benchmark, conversation response selection on the Ubuntu Dialogue Corpus v2.0, and image classification on MNIST and ImageNet, without any changes to the underlying models. Our proposed method outperforms previous approaches, enables self-supervised fine-tuning during inference so that any classifier model can better adapt to its target domain, can be easily applied to any model, and is also effective in online and transfer-learning settings.
NOTE: The video may display the authors in a random order. The correct author list is at the top of this page.
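
The abstract above describes the procedure only at a high level. Below is a minimal, illustrative PyTorch sketch (not the authors' code) of inference-time self-training on pseudo-labels with a Reptile-style interpolation back towards the pre-trained weights; the model, data loader, confidence threshold, and hyperparameters are assumptions, and the class-balancing of pseudo-labels described in the abstract is omitted for brevity.

    import copy
    import torch
    import torch.nn.functional as F

    def predict_with_self_training(model, unlabeled_loader, lr=1e-5, pull=0.1, threshold=0.9):
        """Hypothetical inference loop: fine-tune on the model's own pseudo-labels
        while pulling the weights back towards the pre-trained checkpoint."""
        pretrained = copy.deepcopy(model.state_dict())   # anchor for the inductive bias
        optimizer = torch.optim.SGD(model.parameters(), lr=lr)
        predictions = []

        for inputs in unlabeled_loader:                   # no ground-truth labels at inference
            model.eval()
            with torch.no_grad():
                probs = F.softmax(model(inputs), dim=-1)
            confidence, pseudo_labels = probs.max(dim=-1)
            predictions.append(pseudo_labels)

            keep = confidence > threshold                 # simple confidence filter (assumption)
            if keep.any():
                model.train()
                loss = F.cross_entropy(model(inputs)[keep], pseudo_labels[keep])
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()

                # Reptile-style interpolation between the updated and pre-trained
                # weights, standing in for the paper's inductive bias towards
                # the pre-trained weights.
                with torch.no_grad():
                    for name, param in model.named_parameters():
                        param.lerp_(pretrained[name], pull)

        return torch.cat(predictions)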


Similar Papers

DOCENT: Learning Self-Supervised Entity Representations from Large Document Collections
Yury Zemlyanskiy, Sudeep Gandhe, Ruining He, Bhargav Kanagal, Anirudh Ravula, Juraj Gottweis, Fei Sha, Ilya Eckstein
Meta-Learning for Effective Multi-task and Multilingual Modelling
Ishan Tarunesh, Sushil Khyalia, Vishwajeet Kumar, Ganesh Ramakrishnan, Preethi Jyothi