An Expert Annotated Dataset for the Detection of Online Misogyny

Ella Guest, Bertie Vidgen, Alexandros Mittos, Nishanth Sastry, Gareth Tyson, Helen Margetts

Computational Social Science and Social Media, Long paper

Gather-2B: Apr 22 (13:00-15:00 UTC)

Abstract: Online misogyny is a pernicious social problem that risks making online platforms toxic and unwelcoming to women. We present a new hierarchical taxonomy for online misogyny, as well as an expert-labelled dataset to enable automatic classification of misogynistic content. The dataset consists of 6567 labels for Reddit posts and comments. As previous research has found that untrained crowdsourced annotators struggle to identify misogyny, we hired and trained annotators and provided them with robust annotation guidelines. We report baseline classification performance on the binary classification task, achieving an accuracy of 0.93 and an F1 of 0.43. The codebook and datasets are made freely available for future researchers.
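
The gap between the reported accuracy (0.93) and F1 (0.43) may look surprising at first. One common way such a gap arises is when the positive class is rare, so that correctly rejecting the many negative items dominates accuracy while F1 depends only on the positive class. The abstract does not state the class distribution, so the sketch below uses purely hypothetical confusion-matrix counts (not the paper's results) to show that the two figures can coexist. The function name `binary_metrics` and all counts are assumptions made for illustration.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, NOT taken from the paper: 76 positive items out of 1000.
metrics = binary_metrics(tp=26, fp=20, fn=50, tn=904)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
# accuracy rounds to 0.93 and f1 rounds to 0.43 for these counts
```

With these illustrative counts, the 904 true negatives account for most of the accuracy, while recall on the positive class is only about one third, which pulls F1 down.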

Connected Papers in EACL2021

Similar Papers

Exploiting Emojis for Abusive Language Detection
Michael Wiegand, Josef Ruppenhofer
Civil Rephrases Of Toxic Texts With Self-Supervised Transformers
Léo Laugier, John Pavlopoulos, Jeffrey Sorensen, Lucas Dixon