Multilingual and cross-lingual document classification: A meta-learning approach

Niels van der Heijden, Helen Yannakoudakis, Pushkar Mishra, Ekaterina Shutova

Multilinguality Long paper Paper

Gather-3E: Apr 23, Gather-3E: Apr 23 (13:00-15:00 UTC) [Join Gather Meeting]

You can open the pre-recorded video in separate windows.

Abstract: The great majority of languages in the world are considered under-resourced for successful application of deep learning methods. In this work, we propose a meta-learning approach to document classification in low-resource languages and demonstrate its effectiveness in two different settings: few-shot, cross-lingual adaptation to previously unseen languages; and multilingual joint-training when limited target-language data is available during trai-ing. We conduct a systematic comparison of several meta-learning methods, investigate multiple settings in terms of data availability, and show that meta-learning thrives in settings with a heterogeneous task distribution. We propose a simple, yet effective adjustment to existing meta-learning methods which allows for better and more stable learning, and set a new state-of-the-art on a number of languages while performing on-par on others, using only a small amount of labeled data.
NOTE: Video may display a random order of authors. Correct author list is at the top of this page.

Connected Papers in EACL2021

Similar Papers

Cross-lingual Contextualized Topic Models with Zero-shot Learning
Federico Bianchi, Silvia Terragni, Dirk Hovy, Debora Nozza, Elisabetta Fersini,
First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT
Benjamin Muller, Yanai Elazar, Benoît Sagot, Djamé Seddah,
Beyond the English Web: Zero-Shot Cross-Lingual and Lightweight Monolingual Classification of Registers
Liina Repo, Valtteri Skantsi, Samuel Rönnqvist, Saara Hellström, Miika Oinonen, Anna Salmela, Douglas Biber, Jesse Egbert, Sampo Pyysalo, Veronika Laippala,
MTOP: A Comprehensive Multilingual Task-Oriented Semantic Parsing Benchmark
Haoran Li, Abhinav Arora, Shuohui Chen, Anchit Gupta, Sonal Gupta, Yashar Mehdad,