When it comes to estimating how good we are at something, research consistently shows that we tend to rate ourselves as slightly better than average. This tendency is stronger in people who score low on cognitive tests. It's known as the Dunning-Kruger Effect (DKE): the worse people are at something, the more they tend to overestimate their abilities, and the "smarter" they are, the less they realise their true abilities.
However, a study led by Aalto University reveals that when it comes to AI, specifically Large Language Models (LLMs), the DKE doesn't hold: researchers found that all users show a significant inability to assess their performance accurately when using ChatGPT. In fact, across the board, people overestimated their performance. On top of this, the researchers identified a reversal of the Dunning-Kruger Effect: an identifiable tendency for those users who considered themselves more AI literate to assume their abilities were greater than they really were.
"We found that when it comes to AI, the DKE vanishes. In fact, what's really surprising is that higher AI literacy brings more overconfidence," says Professor Robin Welsch. "We would expect people who are AI literate to not only be a bit better at interacting with AI systems, but also at judging their performance with those systems, but this was not the case."
The finding adds to a rapidly growing volume of research indicating that blindly trusting AI output comes with risks like "dumbing down" people's ability to source reliable information and even workforce de-skilling. While people did perform better when using ChatGPT, it's concerning that they all overestimated that performance.
"AI literacy is truly important nowadays, and therefore this is a very striking effect. AI literacy might be very technical, and it's not really helping people actually interact fruitfully with AI systems", says Welsch.
"Current AI tools are not enough. They are not fostering metacognition [awareness of one's own thought processes] and we are not learning about our mistakes," adds doctoral researcher Daniela da Silva Fernandes. "We need to create platforms that encourage our reflection process."
The article was published on October 27th in the journal Computers in Human Behavior.

The researchers designed two experiments in which some 500 participants completed logical reasoning tasks from the US's famous Law School Admission Test (LSAT). Half of the group used AI and half didn't. After each task, subjects were asked to estimate how well they had performed, and if they did that accurately, they were promised extra compensation.
"These tasks take a lot of cognitive effort. Now that people use AI daily, it's typical that you would give something like this to AI to solve, because it's so challenging", Welsch says.
The data revealed that most users rarely prompted ChatGPT more than once per question. Often, they simply copied the question, put it in the AI system, and were happy with the AI's solution without checking or second-guessing.
"We looked at whether they truly reflected with the AI system and found that people just thought the AI would solve things for them. Usually there was just one single interaction to get the results, which means that users blindly trusted the system. It's what we call cognitive offloading, when all the processing is done by AI", Welsch explains.
This shallow level of engagement may have limited the cues needed to calibrate confidence and allow for accurate self-monitoring. Therefore, it's plausible that encouraging or experimentally requiring multiple prompts could provide better feedback loops, enhancing users' metacognition, he says.
So what's the practical solution for everyday AI users?
"AI could ask the users if they can explain their reasoning further. This would force the user to engage more with AI, to face their illusion of knowledge, and to promote critical thinking," Fernandes says.
IMAGE CREDIT: ThisIsEngineering




