Are Neural Networks Extracting Linguistic Properties or Memorizing Training Data? An Observation with a Multilingual Probe for Predicting Tense

Bingzhi Li, Guillaume Wisniewski

Interpretability and Analysis of Models for NLP (Short paper)

Gather-2C: Apr 22 (13:00-15:00 UTC)

Abstract: We evaluate the ability of BERT embeddings to represent tense information, taking French and Chinese as a case study. In French, tense information is expressed through verb morphology and can be captured by simple surface cues. In contrast, tense interpretation in Chinese is driven by abstract lexical, syntactic, and even pragmatic information. We show that while French tense can easily be predicted from sentence representations, results drop sharply for Chinese, suggesting that BERT is more likely memorizing shallow patterns from the training data than uncovering abstract linguistic properties.
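
Below is a minimal sketch of the kind of probing experiment the abstract describes: a linear classifier trained on frozen BERT sentence embeddings to predict tense. The model name, mean-pooling strategy, and toy French examples are illustrative assumptions, not the authors' exact setup.

```python
# Probing sketch: can a linear classifier recover tense from frozen BERT
# sentence embeddings? Model, pooling, and data below are assumptions.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

MODEL_NAME = "bert-base-multilingual-cased"  # assumed multilingual BERT
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

def embed(sentences):
    """Mean-pool the last hidden layer into one vector per sentence."""
    with torch.no_grad():
        batch = tokenizer(sentences, padding=True, truncation=True,
                          return_tensors="pt")
        hidden = model(**batch).last_hidden_state       # (B, T, H)
        mask = batch["attention_mask"].unsqueeze(-1)    # (B, T, 1)
        summed = (hidden * mask).sum(dim=1)
        return (summed / mask.sum(dim=1)).numpy()       # (B, H)

# Toy French examples with tense labels (hypothetical data; the paper
# works with annotated French and Chinese corpora).
sentences = ["Il mange une pomme.", "Il mangeait une pomme.",
             "Elle lit un livre.", "Elle lisait un livre."]
labels = ["present", "past", "present", "past"]

X = embed(sentences)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.5, stratify=labels)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probe accuracy:", probe.score(X_test, y_test))
```

In this setup, the interesting quantity is the gap between the probe's accuracy on French, where tense is marked morphologically, and on Chinese, where no such surface cue exists.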

Similar Papers

On the evolution of syntactic information encoded by BERT's contextualized representations
Laura Pérez-Mayos, Roberto Carlini, Miguel Ballesteros, Leo Wanner
Searching for Search Errors in Neural Morphological Inflection
Martina Forster, Clara Meister, Ryan Cotterell
Attention Can Reflect Syntactic Structure (If You Let It)
Vinit Ravishankar, Artur Kulmizev, Mostafa Abdou, Anders Søgaard, Joakim Nivre