
TLDR
AI bias occurs when an AI system produces systematically unfair or skewed outputs due to biased training data, flawed design choices, or the unequal representation of groups in the data it learned from.
AI systems learn from data. If that data reflects historical inequalities, cultural assumptions, or gaps in representation, the AI will learn and reproduce those patterns. This is the core mechanism of AI bias.
There are several types of bias. Historical bias occurs when training data reflects past discrimination (e.g., hiring data skewed toward men). Representation bias occurs when certain groups are underrepresented in the training set. Measurement bias occurs when the features used to train the model are poor proxies for what we actually care about.
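Representation bias in particular can be checked with a simple count before any training happens. The sketch below is a minimal illustration, assuming each training record carries a hypothetical `group` label. The toy records, the field name, and the 10% cutoff are all made up for the example and are not a standard.

```python
from collections import Counter

def group_shares(records, key="group"):
    """Return each group's share of the records as a fraction of the total."""
    counts = Counter(r[key] for r in records)
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()}

def underrepresented(shares, min_share=0.10):
    """List groups whose share falls below min_share (an illustrative cutoff)."""
    return sorted(g for g, s in shares.items() if s < min_share)

# Made-up toy data: 16 records from group A, 3 from group B, 1 from group C.
records = [{"group": "A"}] * 16 + [{"group": "B"}] * 3 + [{"group": "C"}] * 1

shares = group_shares(records)      # A: 0.80, B: 0.15, C: 0.05
flagged = underrepresented(shares)  # only group C is below the 10% cutoff
print(shares, flagged)
```

A count like this only shows who is missing from the data. It says nothing about historical or measurement bias, which need separate checks.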
AI bias is not always obvious. A loan approval model might never see race as an input yet systematically approve fewer applications from certain zip codes that correlate with race. A facial recognition system might have near-perfect accuracy overall but perform poorly on darker-skinned faces.
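The gap between overall accuracy and per-group accuracy is easy to see with a small calculation. The following is a rough sketch on made-up numbers, with hypothetical group names `group_a` and `group_b`. It is not drawn from any real system.

```python
from collections import defaultdict

def overall_accuracy(examples):
    """examples: list of (group, true_label, predicted_label) tuples."""
    return sum(truth == pred for _, truth, pred in examples) / len(examples)

def accuracy_by_group(examples):
    """Compute accuracy separately for each group label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, pred in examples:
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}

# Made-up toy results: group_a dominates the test set, group_b is small.
examples = (
    [("group_a", 1, 1)] * 94 + [("group_a", 1, 0)] * 1    # 94 of 95 correct
    + [("group_b", 1, 1)] * 3 + [("group_b", 1, 0)] * 2   # 3 of 5 correct
)

overall = overall_accuracy(examples)     # 97 of 100 correct
per_group = accuracy_by_group(examples)  # group_b is far below the overall figure
print(overall, per_group)
```

Because the smaller group contributes so few examples, its errors barely move the headline number, which is why a single overall score can hide the problem.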
Addressing AI bias requires diverse training data, careful evaluation across different demographic groups, human oversight, and ongoing monitoring after deployment. It is an active area of research and regulation.
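One common way to evaluate across demographic groups is to compare how often each group receives a favorable decision. The sketch below computes per-group approval rates and the ratio of the lowest rate to the highest, using invented loan decisions and hypothetical group names. The 0.8 threshold mirrors the "four-fifths" rule of thumb from US employment guidelines and is used here only as an illustrative flag, not as a legal or complete fairness test.

```python
def selection_rates(decisions):
    """decisions: iterable of (group, approved) pairs; returns approval rate per group."""
    totals, approved = {}, {}
    for group, ok in decisions:
        totals[group] = totals.get(group, 0) + 1
        approved[group] = approved.get(group, 0) + int(ok)
    return {g: approved[g] / totals[g] for g in totals}

def disparate_impact_ratio(rates):
    """Lowest group approval rate divided by the highest."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was approved, so there is no gap between groups
    return min(rates.values()) / highest

# Made-up toy decisions: 30 of 50 approved in group_a, 15 of 50 in group_b.
decisions = (
    [("group_a", True)] * 30 + [("group_a", False)] * 20
    + [("group_b", True)] * 15 + [("group_b", False)] * 35
)

rates = selection_rates(decisions)
ratio = disparate_impact_ratio(rates)
needs_review = ratio < 0.8  # illustrative threshold, see the note above
print(rates, ratio, needs_review)
```

Running the same check on live decisions at regular intervals is one simple form of the ongoing monitoring mentioned above.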
Hiring algorithms
In 2018, Reuters reported that Amazon had scrapped an experimental AI hiring tool after finding that it penalized resumes that included the word "women's". The tool had been trained on a decade of resumes that came mostly from men.
Facial recognition
Multiple studies found that major facial recognition systems had error rates 10 to 30 times higher for darker-skinned women than for lighter-skinned men. Misidentifications by such systems have been linked to wrongful arrests.
Language model bias
Early language models often associated certain professions with specific genders. For example, they rated the continuation "The doctor said she" as statistically less likely than "The doctor said he".
AI bias cannot be eliminated entirely, because all training data reflects some perspective or historical context. The realistic goal is to identify, measure, and reduce bias to acceptable levels, and to be transparent about remaining limitations.
All large language models have biases from their training data. OpenAI and other developers work to reduce harmful biases, but models still exhibit tendencies related to gender, culture, and political perspective.
As a user, be skeptical of AI outputs about people or groups. Test the tool with diverse inputs. Do not use AI as the sole decision-maker in high-stakes situations. Look for documentation about how the model was evaluated for fairness.
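Testing a tool with diverse inputs can be as simple as sending it the same text with one term swapped and comparing the results. The sketch below assumes the tool can be wrapped in a function that returns a numeric score. `toy_score` is a deliberately biased stand-in written for this example, not a real product or API, and the template text is invented.

```python
def counterfactual_gap(score, template, variants):
    """Score the template once per variant; return the scores and the max-min gap."""
    scores = {v: score(template.format(term=v)) for v in variants}
    return scores, max(scores.values()) - min(scores.values())

def toy_score(text):
    """Hypothetical stand-in for a real tool, deliberately biased for the demo."""
    score = 0.5
    if "women's" in text:
        score -= 0.2  # the kind of penalty the test is meant to expose
    return score

template = "Captain of the {term} chess club"
scores, gap = counterfactual_gap(toy_score, template, ["women's", "men's"])
print(scores, gap)
```

A gap near zero does not prove a tool is fair, but a large gap on inputs that differ only in a group-related term is a clear signal to investigate further.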