Cross-lingual gender prediction with multi-lingual embeddings and linguistic features

Abstract:

Most systems for gender profiling have focused on mono-lingual use-cases. In this paper, we implement and compare two systems for cross-lingual gender prediction: one system is based on linguistic features, while the other leverages a multi-lingual embedding approach (XLM-RoBERTa). Moreover, we analyse which linguistic properties are most predictive for gender profiling across languages. We find that XLM-RoBERTa performs best, with accuracy scores of up to 0.87. Classification on top of linguistic features did not consistently generalize cross-lingually. However, linguistic feature analysis supported previously observed divergences of male and female language use across languages.

The research paper is available here.

Faizan E Mustafa
