A fascinating video posted to Reddit, entitled “What Midjourney thinks professors look like, based on their department,” gives revealing insight into the inherent biases of AI image generators.
The 40-second video cycles through single images, each generated from the prompt “A profile photo of a [insert department] professor.”
It sparked an interesting discussion beneath the post, with one Redditor saying, “This is actually pretty accurate from my experience,” while another retorted, “Old and overwhelming male and white. I don’t find this the case in most modern academic institutions.”
Biases in AI Image Generators
AI image generators can have biases in their models because they learn from the data they are trained on, which often contains biases present in the real world. These biases can manifest in different ways depending on the specific model and data used for training.
For example, if an AI image generator is trained on a dataset of images that disproportionately features certain groups of people, such as lighter-skinned individuals, then the generated images may also exhibit this bias by producing fewer or less accurate images of people with darker skin tones. Similarly, if the training data contains stereotypes or other biases, the AI image generator may learn to reproduce those biases in its generated images.
Furthermore, even if the training data itself is unbiased, the model may still learn biases based on the way the data is labeled or annotated. For instance, if the dataset labels certain objects or people in a way that reinforces stereotypes or assumptions, the AI image generator may learn to perpetuate those biases in its output.
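One practical way to catch the kind of skew described above is simply to measure how labels are distributed in a training set before the model ever sees it. The sketch below is a minimal, hypothetical audit: the dataset and its labels are invented for illustration, and a real pipeline would read annotations from actual data files.

```python
from collections import Counter

def audit_label_balance(labels):
    """Return each label's share of the dataset, to surface imbalance."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}

# Hypothetical annotations for a small image dataset,
# skewed 90/10 the way the article describes
labels = ["light-skinned"] * 90 + ["dark-skinned"] * 10
shares = audit_label_balance(labels)
print(shares)  # {'light-skinned': 0.9, 'dark-skinned': 0.1}
```

A 90/10 split like this one is exactly the kind of imbalance that would lead a generator to produce fewer, and often less accurate, images of the underrepresented group.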
A team of researchers investigating AI biases found that image generators asked to synthesize a photo of a CEO overwhelmingly produced images of white men.
“As machine learning-enabled text-to-image systems are becoming increasingly prevalent and seeing growing adoption as commercial services, characterizing the social biases they exhibit is a necessary first step to lowering their risk of discriminatory outcomes,” the researchers write.
Researchers and developers will need to carefully curate their training data and use techniques such as data augmentation, fairness constraints, and adversarial training to help ensure that the resulting models are as free from bias as possible.