Thomas Z. Ramsøy

AI Models Share Human Cognitive Biases

How much do you trust artificial intelligence (AI) to make decisions for you? Would you believe that AI can be smarter than humans, or even surpass human intelligence? Are AI models free from cognitive biases?

If you answered yes to these questions, you might be surprised to learn that AI is not as objective and rational as you think. In fact, AI may inherit, and even exacerbate, the many biases that affect human thinking and behavior.

I have previously shown how AI models can reproduce horrific examples of human behavior. This time, however, I am more concerned.

In this blog post, I will share with you an experiment I conducted with a large language model (LLM), which revealed how it was influenced by the anchoring effect, a well-known cognitive bias. I will also discuss the broader implications of this finding, and how it challenges the emerging narrative of LLMs being ‘smarter’ than humans. I will then examine some of the limitations and drawbacks of LLMs, and why they need to be calibrated with data about human behavior and established truths. Finally, I will provide some suggestions and recommendations for future research and practice on LLMs and human biases.

AI Recap: LLMs and Generative AI

As a recap, LLMs are a type of AI that can generate natural language texts based on a given input, such as a word, a phrase, a sentence, or a paragraph. They are trained on vast amounts of human-generated texts, such as books, articles, blogs, tweets, etc., and learn to mimic the style, tone, and content of these texts. Some of the applications and benefits of LLMs include natural language understanding, generation, and translation, text summarization, question answering, and more.
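To make this concrete, here is a minimal sketch of how an LLM continues a prompt, using the open-source Hugging Face transformers library. The model and prompt are illustrative placeholders, not the setup used in the experiment described later in this post.

```python
# A minimal sketch of text generation with an open LLM, using the
# Hugging Face `transformers` library (model choice is illustrative).
from transformers import pipeline

# Load a small, freely available model; larger LLMs work the same way.
generator = pipeline("text-generation", model="gpt2")

prompt = "Cognitive biases are"
result = generator(prompt, max_new_tokens=40, do_sample=True)

# The model continues the prompt based on patterns learned from its training text.
print(result[0]["generated_text"])
```

The key point is that everything the model produces is an extrapolation of the human-written text it was trained on, which is exactly why human biases can leak into its outputs.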

Generative AI is a broader term that refers to AI that can create new content or data, such as images, music, videos, etc., based on a given input or a latent space. Generative AI uses various techniques, such as generative adversarial networks (GANs), variational autoencoders (VAEs), or transformers, to produce realistic and diverse outputs.
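The "latent space" idea can be illustrated with a toy sketch: a generator network maps random latent vectors to synthetic outputs. The architecture and sizes below are illustrative and untrained; they only show the shape of the mechanism, not a working image generator.

```python
# A toy sketch of the "latent space" idea behind generative models:
# a generator network maps random latent vectors to synthetic outputs.
# The layer sizes are illustrative, and the network is untrained.
import torch
import torch.nn as nn

latent_dim = 64
generator = nn.Sequential(
    nn.Linear(latent_dim, 128),
    nn.ReLU(),
    nn.Linear(128, 28 * 28),  # e.g. a flattened 28x28 grayscale image
    nn.Tanh(),
)

z = torch.randn(16, latent_dim)   # sample 16 points from the latent space
fake_images = generator(z)        # each point decodes to a synthetic output
print(fake_images.shape)          # torch.Size([16, 784])
```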

Cognitive Biases in AI: A Case Study

The anchoring effect is a cognitive bias where initial information, such as a number, a word, or an image, heavily influences subsequent judgments and decisions. For example, if you are asked to estimate the value of a vase, and you are given a high or a low anchor, such as €1000 or €100, you are likely to adjust your estimate towards the anchor, rather than away from it.

I recently conducted an experiment with GPT-4, a state-of-the-art LLM that can generate texts based on a given input. I presented it with an image of a vase and inquired about its value.

The catch? I gave the AI one of two different anchor values: €100 or €1000.

The results were telling: the AI estimated the vase’s worth around the suggested figures (€150 and €1500, respectively). This stark difference highlights the anchoring effect, and how it affects the AI’s judgment.

In other words, GPT-4 acts in much the same way as humans do. You could even say that the anchoring effect is exacerbated in GPT-4, as it would be surprising to see human vase valuations differ by a factor of ten…

In both runs, GPT-4 demonstrated a human-like bias: its valuations were anchored to the prior price information it was given. Please note that getting GPT-4 to commit to a price valuation requires some prompting and role-playing, but this was kept identical across both prompts.
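For readers who want to probe this themselves, below is a rough sketch of how such an anchoring test could be set up with the OpenAI Python client. It is a simplified, text-only variant (the original experiment used an image of the vase), and the model name, prompt wording, and role-play framing are illustrative assumptions rather than the exact prompts used in the experiment.

```python
# A rough sketch of an anchoring probe against an LLM, using the OpenAI
# Python client. Text-only and simplified; prompt wording, role-play
# framing, and model name are illustrative, not the original setup.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def estimate_value(anchor_eur: int) -> str:
    prompt = (
        "You are an experienced antiques appraiser. "
        f"A client mentions they saw a similar vase priced at €{anchor_eur}. "
        "Give your best single estimate, in euros, of what this vase is worth."
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Identical prompt; only the anchor differs. Compare the two estimates.
print(estimate_value(100))
print(estimate_value(1000))
```

If the two estimates track the anchors rather than converging on a single value, the model is showing the same anchoring behavior described above.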

Broader Implications: AI Reflecting Human Biases

This isn’t an isolated incident. It was recently reported (PDF) that GPT-4’s responses varied in length and content depending on what part of the year it believed it was in: for example, it tended to generate shorter and more negative texts in December than in other months.

At this point, we need to know which cognitive biases AI models are inheriting from the human-generated data they are trained on. We should also know whether these biases are smaller or larger than those observed in humans.

These instances suggest that LLMs, trained on vast human-generated datasets, are prone to mirroring our cognitive biases. This realization challenges the emerging narrative of LLMs being ‘smarter’ than humans, or of them surpassing human intelligence or performance. This is not to say that AI models cannot develop super-human intelligence over time. Rather, it shows that what we are currently operating with falls dramatically short of such hopes and promises…

The Limitations of LLMs: Not Smarter Than Humans (Yet)

As LLMs learn from human data, they are susceptible to absorbing not only our knowledge but also our prejudices and biases. These models are not arbiters of truth but rather reflectors of our collective discourse. They lack the ability to discern factual accuracy or moral correctness autonomously. For example, LLMs may generate texts that are sexist, racist, homophobic, or otherwise offensive or harmful, depending on the data they are trained on or the input they are given. They may also generate texts that are false, misleading, or contradictory, without any regard for the consequences or the context.

LLMs are not truly intelligent or superior to humans, despite their impressive capabilities. They suffer from many limitations and drawbacks, such as the lack of common sense, world knowledge, causal reasoning, explainability, and accountability. Instead of deeper understanding and logic, they often rely on statistical patterns and superficial associations. Tasks that require creativity, originality, or novelty, such as generating poems, stories, jokes, or songs, are also challenging for them. Moreover, they may struggle to adapt to new or changing situations, or to learn from their own mistakes or feedback.

The Calibration of Generative Models: Towards More Reliable and Robust LLMs

Given this context, it’s crucial that we calibrate generative models with data about human behavior and established truths. We cannot allow LLMs to become post-truth social constructionists, for whom truth does not exist and is always socially negotiated. We need to ensure that LLMs generate texts that are accurate, reliable, consistent, and ethical, and that they align with our values and expectations.

In short, AI model calibration is the process of adjusting or tuning a model’s parameters or outputs to match a desired target or criterion. For generative models, calibration is important for in-context learning and adaptation, where the model learns from its own generated texts and adapts to the input or the task at hand. Calibration can also improve the quality and diversity of the generated texts, and reduce the uncertainty and variability of the model.
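As a concrete illustration of what "adjusting a model’s outputs to match a criterion" can mean in practice, here is a minimal sketch of temperature scaling, a common confidence-calibration technique for classifiers. It is one narrow instance of calibration, offered only to make the idea tangible; the data below are random placeholders, and this is not the specific method discussed above.

```python
# A minimal sketch of temperature scaling: a single scalar T is fitted on
# held-out data so that the model's confidence scores better match observed
# accuracy. Illustrative only; data and parameters are placeholders.
import torch
import torch.nn as nn

def fit_temperature(val_logits: torch.Tensor, val_labels: torch.Tensor) -> float:
    """Fit a temperature T > 0 that minimizes cross-entropy on validation data."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log(T) so T stays positive
    optimizer = torch.optim.LBFGS([log_t], lr=0.1, max_iter=50)
    loss_fn = nn.CrossEntropyLoss()

    def closure():
        optimizer.zero_grad()
        loss = loss_fn(val_logits / log_t.exp(), val_labels)
        loss.backward()
        return loss

    optimizer.step(closure)
    return log_t.exp().item()

# Example with random stand-in data (real use would take a model's validation logits).
logits = torch.randn(100, 5)
labels = torch.randint(0, 5, (100,))
print(fit_temperature(logits, labels))
```

Calibrating a generative model against human behavior and established facts is a much harder and broader problem than this, but the underlying logic is the same: adjust the model’s outputs until they line up with a trusted reference.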

There are various methods and techniques for calibrating generative models, such as generative calibration, uncertainty estimation, data augmentation, adversarial training, etc. However, there are also challenges and open problems for calibrating generative models, such as data scarcity, data diversity, data reliability, etc. For example, how can we obtain enough and diverse data to calibrate a model for a specific domain or task? How can we ensure that the data is reliable and trustworthy, and not biased or manipulated? How can we measure and evaluate the calibration performance of a model, and compare it with other models or human standards?

Conclusion

The anchoring effect is one of the many cognitive biases that shape our perception and judgment of the world. In this post, I have shown you how GPT-4, one of the most advanced large language models (LLMs) to date, is not immune to this bias. By manipulating the first piece of information that GPT-4 received on a topic, I was able to influence its subsequent responses. This finding challenges the assumption that LLMs are superior to humans in terms of intelligence and rationality.

But the anchoring effect is not the only bias that LLMs may exhibit. Recent studies have revealed that LLMs can also learn and amplify harmful social biases, such as racism, sexism, and political polarization. These biases can have serious consequences for users and society, especially as LLMs are increasingly integrated into systems that affect our lives, such as education, health, and entertainment.

Human bias in AI is a serious issue that requires our vigilance and criticism. To address this issue, we should first understand the sources and mechanisms of bias in LLMs, and how they affect their performance and behavior. Then, we should calibrate LLMs with data that is representative, diverse, and fair, and that aligns with our ethical and moral values. Moreover, we should evaluate LLMs not only on their accuracy and fluency, but also on their fairness and accountability. Finally, we should design LLMs that are transparent, explainable, and controllable, and that can correct their own mistakes and learn from feedback.

Only by addressing the human bias in AI can we ensure that LLMs are truly beneficial and trustworthy for us and our society. Only by doing so can we unleash the full potential and power of LLMs, and use them for good and not evil.