A study led by Adam Rodman, MD, MPH, Director of AI Programs at Beth Israel Deaconess Medical Center (BIDMC), reveals that, rather than helping to reduce racial and ethnic biases, AI-driven chatbots may instead perpetuate and exacerbate disparities in medicine. The study appeared in JAMA Network Open.   

It's well-documented that physicians undertreat Black patients' pain compared to that of white patients. This disparity, seen across healthcare settings and types of pain, is often attributed to the underassessment of Black patients' pain. It is one example of a problem that artificial intelligence (AI) was initially seen as a promising way to solve, with the hope that data-driven algorithms could offer objective assessments, free from the prejudices and misconceptions that influence human judgment.

"These models are very good at reflecting human biases, and not just racial biases, which is problematic if you're going to use them to make any sort of medical decision," Rodman said. "If the system is biased the same way humans are, it's going to serve to magnify our biases or make humans more confident in their biases. It's just going to get the human to double down on what they're doing."

Large language models (LLMs), often deployed as chatbots, have been increasingly integrated into the clinic. Google's Gemini Pro and OpenAI's GPT-4 can assist in clinical decision-making by processing vast amounts of data scraped from existing sources, offering diagnostic suggestions, and even assessing patient symptoms. Yet, as this new research shows, when LLMs scour reams of human knowledge, the human biases baked into the source material come right along with them.

To investigate this issue, Rodman and his colleague, lead author Brototo Deb, MD, MIDS, of Georgetown University–MedStar Washington Hospital Center and the University of California, Berkeley, designed a study replicating a 2016 experiment that examined racial biases among medical trainees. In the original study, 222 medical students and residents were presented with two medical vignettes describing two individuals, one white and one Black, and asked to rate their pain levels on a 10-point scale. Additionally, participants rated their agreement with false beliefs about racial biology, such as the erroneous but widespread notion that Black people have thicker skin.

Rodman and Deb took this previous research one step further, applying an analogous experimental setup to Gemini Pro and GPT-4 to see how the LLMs would assess pain across race and ethnicity, and to gauge their endorsement of false beliefs about racial biology.


While the AI models and the human trainees assigned similar pain ratings overall, racial disparities persisted. Across the board, Black patients' pain was underassessed compared to white patients' pain, regardless of whether the rater was human or AI. Gemini Pro exhibited the highest rate of false beliefs (24 percent), followed by the human trainees (12 percent), with GPT-4 the lowest (9 percent).

With more hospitals and clinics adopting AI for clinical decision support, this research shows chatbots could perpetuate racial and ethnic biases in medicine, leading to further inequalities in healthcare. More research is needed to explore how humans will interact with AI systems, especially in clinical settings. As physicians rely more on AI for guidance, confirmation bias, the tendency to trust machine outputs only when they match one's pre-existing beliefs, could lead to even more entrenched disparities.

"I'm not worried about an LLM system making autonomous decisions; that's certainly not happening anytime soon," Rodman said. "But there's a theme we're seeing in our research that when these systems confirm the things humans already think, the humans agree with it, but when it provides a better answer than humans, something that disagrees with the human, humans have a tendency to just ignore it."

Dr. Rodman reported receiving grants from the Gordon and Betty Moore Foundation and the Macy Foundation for artificial intelligence research outside the submitted work. No other disclosures were reported.

