In recent years, AI has become one of the most powerful tools in data analysis. It is being used in more and more areas of life, thanks to its ability to analyse massive amounts of data and recognise patterns that would be impossible for humans to notice. However, it is becoming clear that machine learning was created to be – and in most areas still is – an engineer’s tool.
To quickly explain how machine learning works, let us use the popular example of image recognition: suppose we have a dataset of millions of pictures of dogs and cats. From this point, we can proceed with one of two methods: supervised or unsupervised learning.
When using supervised learning, we provide the machine with labels for the data, telling it which pictures contain a dog and which contain a cat. Given this information, the machine starts to analyse the images: what colours they contain, what shapes they form, what the background looks like, and so on. After recognising some patterns, it develops a model that can be applied to an image that does not yet have a label, that is, an image where the machine does not know in advance which animal it shows. Once trained, the model can analyse every new picture that comes up, freeing people from this easy but endless task. While it can only give a guess, based on the probability that the picture contains a dog given the parameters of the model, it is far more efficient and useful than a person going through every picture on the internet.
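To make this concrete, here is a minimal sketch of supervised learning in Python using the widely available scikit-learn library. The “pictures” are just random feature vectors standing in for pixel data, and the hidden labelling rule is an illustrative assumption; nothing here comes from a real image or hiring dataset.

```python
# A toy supervised-learning sketch: labelled examples in, a predictive model out.
# The "pictures" here are random feature vectors standing in for pixel data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# 1,000 fake "pictures", each reduced to 64 numeric features,
# with a label: 0 = dog, 1 = cat.
X = rng.normal(size=(1000, 64))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # a hidden rule the model must learn

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Training: the model looks for patterns that separate the two labels.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Prediction on unlabelled pictures: the model can only give a probability-based guess.
print("accuracy on unseen pictures:", model.score(X_test, y_test))
print("P(cat) for the first unseen picture:", model.predict_proba(X_test[:1])[0, 1])
```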
If we use unsupervised learning, the machine looks at a dataset without labels and tries to find recurring themes; then, based on those, it groups the data. In recruitment this is rarely used: the models are usually based on which candidates have been hired in the past, which provides the labels “hired” and “not hired”.
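For contrast, a minimal unsupervised sketch in the same spirit: no labels are given, and a clustering algorithm (k-means, again via scikit-learn) simply groups similar records together. The two hidden groups in the toy data are an assumption made purely for illustration.

```python
# A toy unsupervised-learning sketch: no labels, the algorithm just groups similar rows.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Unlabelled records: two hidden groups the algorithm does not know about in advance.
group_a = rng.normal(loc=0.0, size=(500, 10))
group_b = rng.normal(loc=3.0, size=(500, 10))
X = np.vstack([group_a, group_b])

# k-means looks for recurring structure and assigns each record to one of two clusters.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

print("records assigned to each cluster:", np.bincount(clusters))
```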
One famous experiment involved computer scientists training a machine learning model to differentiate huskies from wolves. In the dataset, most wolves appeared in a snowy environment, leading the machine to “think” that if there was snow in the picture – in practice, a large number of white pixels – then it showed a wolf. As a result, it categorised most huskies photographed in snow as wolves.
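This kind of shortcut can be reproduced on synthetic data. In the sketch below, each “picture” is summarised by two made-up features, fur darkness and the fraction of white (snow) pixels; because snow happens to correlate with the “wolf” label in the training set, the model learns the shortcut and misjudges a husky photographed in snow. The features and numbers are invented for illustration and are not from the original study.

```python
# A synthetic demonstration of the husky/wolf shortcut: the background (snow)
# correlates with the label during training, so the model latches onto it.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_images(n, wolf_ratio_in_snow):
    """Fake 'images' summarised by two features: fur darkness and snow fraction."""
    labels = rng.integers(0, 2, size=n)            # 0 = husky, 1 = wolf
    fur = rng.normal(loc=labels * 0.3, scale=1.0)  # only weakly informative
    snow = np.where(
        labels == 1,
        rng.random(n) < wolf_ratio_in_snow,        # wolves: mostly photographed in snow
        rng.random(n) < (1 - wolf_ratio_in_snow),  # huskies: mostly not
    ).astype(float)
    return np.column_stack([fur, snow]), labels

# Training data with the accidental snow/label correlation.
X_train, y_train = make_images(2000, wolf_ratio_in_snow=0.95)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# A husky photographed in snow: ambiguous fur, but lots of white pixels.
husky_in_snow = np.array([[0.0, 1.0]])
print("P(wolf) for a husky in snow:", model.predict_proba(husky_in_snow)[0, 1])
# The probability comes out high because the model learned "snow => wolf".
```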
This shows how fallible AI can be even on a dataset where all the information is provided: a picture of a dog contains all the information about its being a dog. The same holds for physics and engineering problems, and for chess, where machine learning was first properly implemented. The problem is that we are now using it in a social context, where we do not and, truly, cannot have all the information. Our model is only as good as our data, and when it comes to humans the data itself is ambiguous and, most of the time, biased.
A famous example of this is Amazon’s AI hiring tool for resume screening. It became clear very quickly that the machine was biased against women, especially for software development and other technical positions at the company. This happened because most people already working in those positions were men; the only thing the machine did was recognise the pattern in previous hiring decisions, which had favoured men for these kinds of jobs. The word “women’s” came to be penalised by the algorithm, and entries such as a women’s college or a women’s football team became red flags that got those resumes immediately excluded.
Other stages of the hiring process are starting to be automated as well, such as interviews, which can rely on speech recognition AI. Here, if the training dataset is not diverse enough, discrimination can be further perpetuated. Studies have found that some dialects are underrepresented in training datasets, putting their speakers at a disadvantage. Siri finding a Scottish accent hard to comprehend is a mild case compared to hiring tools finding Black candidates less intelligible and thus less likely to be hired.
Racial and gender bias appears in the recruiting process as well. Sites like Facebook or Glassdoor can serve targeted job advertisements based on their users’ profiles, allowing companies to reach an audience of their choice that matches their requirements. The sites may have already categorised women as less interested in tech jobs, since on those sites it is mostly men who hold such occupations, or Black and immigrant people as riskier to hire, since non-white names are more likely to appear in headlines about danger. This may be the hardest bias to correct, since a below-average number of applications from minority groups can be attributed to many things, not only to those groups never seeing the job opportunity.
AI, if implemented correctly, can make the hiring process more objective, and it is undeniable how much time-consuming work it can take on, allowing humans to deal with the more complicated, creative and ambiguous parts of the task. However, even a good algorithm will produce a biased model if the data itself is biased. This is why we need more economists and social scientists working in these fields: it is easy to overlook such problems if one concentrates only on the algorithm.
Sources:
R. Allan (2020, Feb 12). A Gentle Introduction to Machine Learning Concepts. Medium.
N. Parikh (2021, Oct 14). Understanding Bias In AI-Enabled Hiring. Forbes.
S. K. White (2021, Sep 17). AI in hiring might do more harm than good. CIO.
J. Dastin (2018, Oct 11). Amazon scraps secret AI recruiting tool that showed bias against women. Reuters.
J. Link (2020, Jul 26). Why Racial Bias Still Haunts Speech-Recognition AI. Built In. https://builtin.com/artificial-intelligence/racial-bias-speech-recognition-systems