
A Tale of Pronouns: Interpretability Informs Gender Bias Mitigation for Fairer Instruction-Tuned Machine Translation


User: What models did the authors test?

AI: The authors tested Flan-T5 and mT0, each in several sizes: Flan-T5-Small, Flan-T5-Base, Flan-T5-Large, Flan-T5-XL, Flan-T5-XXL, mT0-Small, mT0-Base, mT0-Large, mT0-XL, and mT0-XXL.
User: How did they improve fairness and mitigate biases in translations?

AI: They used interpretability methods to uncover biases in gender-inflected translations, finding that the models often defaulted to male-inflected translations and overlooked the pronoun indicating the gender of the target occupation. Based on this finding, they proposed a bias mitigation solution based on few-shot learning: they selected examples with low pronoun importance scores, paired them with correct translations, and supplied them to the models as in-context exemplars. This approach led to significantly fairer translations.
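To make the selection step concrete, here is a minimal, hypothetical sketch of how exemplars with low pronoun importance could be picked. The data structure, attribution scores, and function names are illustrative assumptions, not the authors' code or data.

```python
# Illustrative sketch only: rank candidate sentences by how little importance
# the model assigns to the gender-marking pronoun, then keep the lowest-ranked
# ones, paired with correct reference translations, as few-shot exemplars.
# The attribution scores are placeholders, not values from the paper.
from dataclasses import dataclass, field

@dataclass
class Candidate:
    source: str         # English source sentence
    reference: str      # correct, gender-faithful translation
    pronoun: str        # gender-marking pronoun in the source
    token_scores: dict = field(default_factory=dict)  # token -> attribution score

def pronoun_importance(c: Candidate) -> float:
    """Share of total attribution mass assigned to the pronoun token."""
    total = sum(abs(v) for v in c.token_scores.values()) or 1.0
    return abs(c.token_scores.get(c.pronoun, 0.0)) / total

def select_exemplars(candidates: list, k: int = 4) -> list:
    """Pick the k candidates whose pronoun the model relies on the least."""
    return sorted(candidates, key=pronoun_importance)[:k]
```

The intuition is that sentences whose pronoun the model largely ignores are exactly the hard cases worth demonstrating, correctly translated, in the prompt.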
User: Tell me one example of a biased translation and the unbiased alternative fixed by their method.

AI: One example of a biased translation: when the sentence "The mechanic gave the clerk a present because she won the lottery" is translated into Spanish by Flan-T5, the output is "El mecánico le dio un regalo al empleado porque ganó la lotería," with the male-inflected "el mecánico." The unbiased alternative obtained with their method is "La mecánica le dio un regalo al empleado porque ganó la lotería." The fix comes from few-shot learning: interpretability analysis is used to select in-context examples whose pronoun the model would typically overlook, which results in fairer translations.
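As a rough illustration of what such an in-context prompt could look like, the snippet below assembles a few-shot translation prompt for the reported sentence. The exemplar pair and the prompt template are invented placeholders, not the exact exemplars or format used by the authors.

```python
# Hypothetical prompt assembly: prepend a correctly translated exemplar
# (a made-up placeholder) to the sentence we actually want translated.
exemplars = [
    ("The engineer fixed the car because she had the right tools.",
     "La ingeniera arregló el coche porque tenía las herramientas adecuadas."),
]
query = "The mechanic gave the clerk a present because she won the lottery."

shots = "\n".join(f"English: {en}\nSpanish: {es}" for en, es in exemplars)
prompt = f"{shots}\nEnglish: {query}\nSpanish:"
print(prompt)
```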
User: Can you give me one more example reported by the authors?

AI: One more example reported by the authors is the case study conducted in a rural community in India to examine the impact of a health intervention program.

Authors

Giuseppe Attanasio, Flor Miriam Plaza-del-Arco, Debora Nozza, Anne Lauscher

Abstract

Recent instruction fine-tuned models can solve multiple NLP tasks when prompted to do so, with machine translation (MT) being a prominent use case. However, current research often focuses on standard performance benchmarks, leaving compelling fairness and ethical considerations behind. In MT, this might lead to misgendered translations, resulting, among other harms, in the perpetuation of stereotypes and prejudices. In this work, we address this gap by investigating whether and to what extent such models exhibit gender bias in machine translation and how we can mitigate it. Concretely, we compute established gender bias metrics on the WinoMT corpus from English to German and Spanish. We discover that IFT models default to male-inflected translations, even disregarding female occupational stereotypes. Next, using interpretability methods, we unveil that models systematically overlook the pronoun indicating the gender of a target occupation in misgendered translations. Finally, based on this finding, we propose an easy-to-implement and effective bias mitigation solution based on few-shot learning that leads to significantly fairer translations.
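For readers unfamiliar with WinoMT-style evaluation, the sketch below illustrates, in simplified form, the kind of gender bias scoring the abstract refers to: gender accuracy plus an accuracy gap between pro- and anti-stereotypical examples. The data structure and the gap definition are illustrative assumptions; WinoMT's actual metrics and morphological alignment procedure are more involved.

```python
# Simplified illustration of WinoMT-style gender bias scoring; not the exact
# metrics or alignment procedure used by WinoMT or by this paper.
from dataclasses import dataclass

@dataclass
class Example:
    gold_gender: str        # gender of the target entity ("male" / "female")
    predicted_gender: str   # gender inflection detected in the translation
    stereotypical: bool     # True if gold gender matches the occupational stereotype

def accuracy(examples) -> float:
    """Fraction of translations with the correct gender inflection."""
    return sum(e.gold_gender == e.predicted_gender for e in examples) / len(examples)

def stereotype_gap(examples) -> float:
    """Accuracy on pro-stereotypical minus accuracy on anti-stereotypical examples."""
    pro = [e for e in examples if e.stereotypical]
    anti = [e for e in examples if not e.stereotypical]
    return accuracy(pro) - accuracy(anti)
```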
