Turkish is a gender-neutral language, so in the translation from English to Turkish, the gendered pronouns were lost. Translating back to English required the AI-based translation algorithm to infer a gender from the type of role being described. The data used to teach the AI carries strong gender stereotypes. The AI therefore decided it must be HE who is a president, a leader or driving a truck, and SHE who is cooking or working as a nurse.
The question is, where is this gender-biased data coming from?
AI algorithms are software written to learn and create associations from large amounts of existing data. An algorithm’s decisions are only as good as the data it is trained on. If an algorithm is given more text or pictures depicting women doing chores at home and men working outside the home, the statistical probability of these associations increases, and the algorithm learns to use them in its future decision-making.
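The way skewed data turns into skewed decisions can be illustrated with a minimal sketch. It is not how any real translation system works: it simply counts how often each pronoun appears alongside each word and then picks the pronoun seen most often. The corpus is invented for illustration, and the function names `cooccurrence_counts` and `guess_pronoun` are hypothetical.

```python
from collections import Counter

# Toy corpus, invented for illustration only (not real training data).
corpus = [
    "she is cooking dinner",
    "she is cooking lunch",
    "she is a nurse",
    "he is a president",
    "he is driving a truck",
    "he is a leader",
]

PRONOUNS = ("he", "she")


def cooccurrence_counts(sentences, pronouns=PRONOUNS):
    """Count how often each non-pronoun word shares a sentence with each pronoun."""
    counts = {p: Counter() for p in pronouns}
    for sentence in sentences:
        words = sentence.split()
        for p in pronouns:
            if p in words:
                counts[p].update(w for w in words if w not in pronouns)
    return counts


def guess_pronoun(word, counts):
    """Pick the pronoun that co-occurred most often with `word`.

    Ties and unseen words fall back to the first pronoun listed,
    which is itself an arbitrary (and biased) default.
    """
    return max(counts, key=lambda p: counts[p][word])


counts = cooccurrence_counts(corpus)

for role in ("cooking", "nurse", "president", "truck"):
    print(role, "->", guess_pronoun(role, counts))
```

Because the toy corpus only ever pairs "cooking" and "nurse" with "she", and "president" and "truck" with "he", the guesses reproduce exactly that split. Nothing in the code mentions gender roles; the stereotype comes entirely from the data.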
The field of computing has always been male-dominated, and the learning data sets for AI are mostly created by male computer scientists who lack a feminist point of view. The data sets have been shown to contain both conscious and unconscious gender biases, and to be biased in how they are collected – namely, by acquiring what is cheap and easily available.
Social media provides a convenient and cheap source of data for training AI, especially in social science research. Twitter and Facebook have been used extensively to monitor the social trends and sentiments of people around the globe. Twitter has around 330 million monthly active users, who send 500 million tweets per day. The accessibility and volume of that data have led to a great deal of research using tweets as training data sets for AI algorithms to learn social trends.
The repercussions of gender-biased AI
Online hate speech has escalated the war on gender equality in cyberspace. The hateful content posted on social media against female public figures is far worse than that targeted at their male counterparts. This content provides a pattern of gender-based hate and stereotypes for AI to learn from.
We carried out research that tracked tweets associated with the names of five male and five female leaders on 27 January 2018. We found that, overall, more negative sentiments were expressed on that day against the female leaders than against the male leaders. Male Twitter users, on average, also tweeted about the leaders more than female users did.